recognition feature sets: Topics by Science.gov

Sample records for recognition feature sets

Operator for object recognition and scene analysis by estimation of set occupancy with noisy and incomplete data sets

NASA Astrophysics Data System (ADS)

Rees, S. J.; Jones, Bryan F.

1992-11-01

Once feature extraction has occurred in a processed image, the recognition problem becomes one of defining a set of features which maps sufficiently well onto one of the defined shape/object models to permit a claimed recognition. This process is usually handled by aggregating features until a large enough weighting is obtained to claim membership, or an adequate number of located features are matched to the reference set. A requirement has existed for an operator or measure capable of a more direct assessment of membership/occupancy between feature sets, particularly where the feature sets may be defective representations. Such feature set errors may be caused by noise, by overlapping of objects, and by partial obscuration of features. These problems occur at the point of acquisition: repairing the data would then assume a priori knowledge of the solution. The technique described in this paper offers a set theoretical measure for partial occupancy defined in terms of the set of minimum additions to permit full occupancy and the set of locations of occupancy if such additions are made. As is shown, this technique permits recognition of partial feature sets with quantifiable degrees of uncertainty. A solution to the problems of obscuration and overlapping is therefore available.
Robust Point Set Matching for Partial Face Recognition.

PubMed

Weng, Renliang; Lu, Jiwen; Tan, Yap-Peng

2016-03-01

Over the past three decades, a number of face recognition methods have been proposed in computer vision, and most of them use holistic face images for person identification. In many real-world scenarios especially some unconstrained environments, human faces might be occluded by other objects, and it is difficult to obtain fully holistic face images for recognition. To address this, we propose a new partial face recognition approach to recognize persons of interest from their partial faces. Given a pair of gallery image and probe face patch, we first detect keypoints and extract their local textural features. Then, we propose a robust point set matching method to discriminatively match these two extracted local feature sets, where both the textural information and geometrical information of local features are explicitly used for matching simultaneously. Finally, the similarity of two faces is converted as the distance between these two aligned feature sets. Experimental results on four public face data sets show the effectiveness of the proposed approach.
Action recognition using mined hierarchical compound features.

PubMed

Gilbert, Andrew; Illingworth, John; Bowden, Richard

2011-05-01

The field of Action Recognition has seen a large increase in activity in recent years. Much of the progress has been through incorporating ideas from single-frame object recognition and adapting them for temporal-based action recognition. Inspired by the success of interest points in the 2D spatial domain, their 3D (space-time) counterparts typically form the basic components used to describe actions, and in action recognition the features used are often engineered to fire sparsely. This is to ensure that the problem is tractable; however, this can sacrifice recognition accuracy as it cannot be assumed that the optimum features in terms of class discrimination are obtained from this approach. In contrast, we propose to initially use an overcomplete set of simple 2D corners in both space and time. These are grouped spatially and temporally using a hierarchical process, with an increasing search area. At each stage of the hierarchy, the most distinctive and descriptive features are learned efficiently through data mining. This allows large amounts of data to be searched for frequently reoccurring patterns of features. At each level of the hierarchy, the mined compound features become more complex, discriminative, and sparse. This results in fast, accurate recognition with real-time performance on high-resolution video. As the compound features are constructed and selected based upon their ability to discriminate, their speed and accuracy increase at each level of the hierarchy. The approach is tested on four state-of-the-art data sets, the popular KTH data set to provide a comparison with other state-of-the-art approaches, the Multi-KTH data set to illustrate performance at simultaneous multiaction classification, despite no explicit localization information provided during training. Finally, the recent Hollywood and Hollywood2 data sets provide challenging complex actions taken from commercial movie sequences. For all four data sets, the proposed hierarchical approach outperforms all other methods reported thus far in the literature and can achieve real-time operation.
Modeling Spoken Word Recognition Performance by Pediatric Cochlear Implant Users using Feature Identification

PubMed Central

Frisch, Stefan A.; Pisoni, David B.

2012-01-01

Objective Computational simulations were carried out to evaluate the appropriateness of several psycholinguistic theories of spoken word recognition for children who use cochlear implants. These models also investigate the interrelations of commonly used measures of closed-set and open-set tests of speech perception. Design A software simulation of phoneme recognition performance was developed that uses feature identification scores as input. Two simulations of lexical access were developed. In one, early phoneme decisions are used in a lexical search to find the best matching candidate. In the second, phoneme decisions are made only when lexical access occurs. Simulated phoneme and word identification performance was then applied to behavioral data from the Phonetically Balanced Kindergarten test and Lexical Neighborhood Test of open-set word recognition. Simulations of performance were evaluated for children with prelingual sensorineural hearing loss who use cochlear implants with the MPEAK or SPEAK coding strategies. Results Open-set word recognition performance can be successfully predicted using feature identification scores. In addition, we observed no qualitative differences in performance between children using MPEAK and SPEAK, suggesting that both groups of children process spoken words similarly despite differences in input. Word recognition ability was best predicted in the model in which phoneme decisions were delayed until lexical access. Conclusions Closed-set feature identification and open-set word recognition focus on different, but related, levels of language processing. Additional insight for clinical intervention may be achieved by collecting both types of data. The most successful model of performance is consistent with current psycholinguistic theories of spoken word recognition. Thus it appears that the cognitive process of spoken word recognition is fundamentally the same for pediatric cochlear implant users and children and adults with normal hearing. PMID:11132784
Feature Selection for Speech Emotion Recognition in Spanish and Basque: On the Use of Machine Learning to Improve Human-Computer Interaction

PubMed Central

Arruti, Andoni; Cearreta, Idoia; Álvarez, Aitor; Lazkano, Elena; Sierra, Basilio

2014-01-01

Study of emotions in human–computer interaction is a growing research area. This paper shows an attempt to select the most significant features for emotion recognition in spoken Basque and Spanish Languages using different methods for feature selection. RekEmozio database was used as the experimental data set. Several Machine Learning paradigms were used for the emotion classification task. Experiments were executed in three phases, using different sets of features as classification variables in each phase. Moreover, feature subset selection was applied at each phase in order to seek for the most relevant feature subset. The three phases approach was selected to check the validity of the proposed approach. Achieved results show that an instance-based learning algorithm using feature subset selection techniques based on evolutionary algorithms is the best Machine Learning paradigm in automatic emotion recognition, with all different feature sets, obtaining a mean of 80,05% emotion recognition rate in Basque and a 74,82% in Spanish. In order to check the goodness of the proposed process, a greedy searching approach (FSS-Forward) has been applied and a comparison between them is provided. Based on achieved results, a set of most relevant non-speaker dependent features is proposed for both languages and new perspectives are suggested. PMID:25279686
Iris recognition based on key image feature extraction.

PubMed

Ren, X; Tian, Q; Zhang, J; Wu, S; Zeng, Y

2008-01-01

In iris recognition, feature extraction can be influenced by factors such as illumination and contrast, and thus the features extracted may be unreliable, which can cause a high rate of false results in iris pattern recognition. In order to obtain stable features, an algorithm was proposed in this paper to extract key features of a pattern from multiple images. The proposed algorithm built an iris feature template by extracting key features and performed iris identity enrolment. Simulation results showed that the selected key features have high recognition accuracy on the CASIA Iris Set, where both contrast and illumination variance exist.
Speech recognition features for EEG signal description in detection of neonatal seizures.

PubMed

Temko, A; Boylan, G; Marnane, W; Lightbody, G

2010-01-01

In this work, features which are usually employed in automatic speech recognition (ASR) are used for the detection of neonatal seizures in newborn EEG. Three conventional ASR feature sets are compared to the feature set which has been previously developed for this task. The results indicate that the thoroughly-studied spectral envelope based ASR features perform reasonably well on their own. Additionally, the SVM Recursive Feature Elimination routine is applied to all extracted features pooled together. It is shown that ASR features consistently appear among the top-rank features.
Emotional recognition from the speech signal for a virtual education agent

NASA Astrophysics Data System (ADS)

Tickle, A.; Raghu, S.; Elshaw, M.

2013-06-01

This paper explores the extraction of features from the speech wave to perform intelligent emotion recognition. A feature extract tool (openSmile) was used to obtain a baseline set of 998 acoustic features from a set of emotional speech recordings from a microphone. The initial features were reduced to the most important ones so recognition of emotions using a supervised neural network could be performed. Given that the future use of virtual education agents lies with making the agents more interactive, developing agents with the capability to recognise and adapt to the emotional state of humans is an important step.
When false recognition is out of control: the case of facial conjunctions.

PubMed

Jones, Todd C; Bartlett, James C

2009-03-01

In three experiments, a dual-process approach to face recognition memory is examined, with a specific focus on the idea that a recollection process can be used to retrieve configural information of a studied face. Subjects could avoid, with confidence, a recognition error to conjunction lure faces (each a reconfiguration of features from separate studied faces) or feature lure faces (each based on a set of old features and a set of new features) by recalling a studied configuration. In Experiment 1, study repetition (one vs. eight presentations) was manipulated, and in Experiments 2 and 3, retention interval over a short number of trials (0-20) was manipulated. Different measures converged on the conclusion that subjects were unable to use a recollection process to retrieve configural information in an effort to temper recognition errors for conjunction or feature lure faces. A single process, familiarity, appears to be the sole process underlying recognition of conjunction and feature faces, and familiarity contributes, perhaps in whole, to discrimination of old from conjunction faces.
Speech emotion recognition methods: A literature review

NASA Astrophysics Data System (ADS)

Basharirad, Babak; Moradhaseli, Mohammadreza

2017-10-01

Recently, attention of the emotional speech signals research has been boosted in human machine interfaces due to availability of high computation capability. There are many systems proposed in the literature to identify the emotional state through speech. Selection of suitable feature sets, design of a proper classifications methods and prepare an appropriate dataset are the main key issues of speech emotion recognition systems. This paper critically analyzed the current available approaches of speech emotion recognition methods based on the three evaluating parameters (feature set, classification of features, accurately usage). In addition, this paper also evaluates the performance and limitations of available methods. Furthermore, it highlights the current promising direction for improvement of speech emotion recognition systems.
Identification of Alfalfa Leaf Diseases Using Image Recognition Technology

PubMed Central

Qin, Feng; Liu, Dongxia; Sun, Bingda; Ruan, Liu; Ma, Zhanhong; Wang, Haiguang

2016-01-01

Common leaf spot (caused by Pseudopeziza medicaginis), rust (caused by Uromyces striatus), Leptosphaerulina leaf spot (caused by Leptosphaerulina briosiana) and Cercospora leaf spot (caused by Cercospora medicaginis) are the four common types of alfalfa leaf diseases. Timely and accurate diagnoses of these diseases are critical for disease management, alfalfa quality control and the healthy development of the alfalfa industry. In this study, the identification and diagnosis of the four types of alfalfa leaf diseases were investigated using pattern recognition algorithms based on image-processing technology. A sub-image with one or multiple typical lesions was obtained by artificial cutting from each acquired digital disease image. Then the sub-images were segmented using twelve lesion segmentation methods integrated with clustering algorithms (including K_means clustering, fuzzy C-means clustering and K_median clustering) and supervised classification algorithms (including logistic regression analysis, Naive Bayes algorithm, classification and regression tree, and linear discriminant analysis). After a comprehensive comparison, the segmentation method integrating the K_median clustering algorithm and linear discriminant analysis was chosen to obtain lesion images. After the lesion segmentation using this method, a total of 129 texture, color and shape features were extracted from the lesion images. Based on the features selected using three methods (ReliefF, 1R and correlation-based feature selection), disease recognition models were built using three supervised learning methods, including the random forest, support vector machine (SVM) and K-nearest neighbor methods. A comparison of the recognition results of the models was conducted. The results showed that when the ReliefF method was used for feature selection, the SVM model built with the most important 45 features (selected from a total of 129 features) was the optimal model. For this SVM model, the recognition accuracies of the training set and the testing set were 97.64% and 94.74%, respectively. Semi-supervised models for disease recognition were built based on the 45 effective features that were used for building the optimal SVM model. For the optimal semi-supervised models built with three ratios of labeled to unlabeled samples in the training set, the recognition accuracies of the training set and the testing set were both approximately 80%. The results indicated that image recognition of the four alfalfa leaf diseases can be implemented with high accuracy. This study provides a feasible solution for lesion image segmentation and image recognition of alfalfa leaf disease. PMID:27977767
Identification of Alfalfa Leaf Diseases Using Image Recognition Technology.

PubMed

Qin, Feng; Liu, Dongxia; Sun, Bingda; Ruan, Liu; Ma, Zhanhong; Wang, Haiguang

2016-01-01

Common leaf spot (caused by Pseudopeziza medicaginis), rust (caused by Uromyces striatus), Leptosphaerulina leaf spot (caused by Leptosphaerulina briosiana) and Cercospora leaf spot (caused by Cercospora medicaginis) are the four common types of alfalfa leaf diseases. Timely and accurate diagnoses of these diseases are critical for disease management, alfalfa quality control and the healthy development of the alfalfa industry. In this study, the identification and diagnosis of the four types of alfalfa leaf diseases were investigated using pattern recognition algorithms based on image-processing technology. A sub-image with one or multiple typical lesions was obtained by artificial cutting from each acquired digital disease image. Then the sub-images were segmented using twelve lesion segmentation methods integrated with clustering algorithms (including K_means clustering, fuzzy C-means clustering and K_median clustering) and supervised classification algorithms (including logistic regression analysis, Naive Bayes algorithm, classification and regression tree, and linear discriminant analysis). After a comprehensive comparison, the segmentation method integrating the K_median clustering algorithm and linear discriminant analysis was chosen to obtain lesion images. After the lesion segmentation using this method, a total of 129 texture, color and shape features were extracted from the lesion images. Based on the features selected using three methods (ReliefF, 1R and correlation-based feature selection), disease recognition models were built using three supervised learning methods, including the random forest, support vector machine (SVM) and K-nearest neighbor methods. A comparison of the recognition results of the models was conducted. The results showed that when the ReliefF method was used for feature selection, the SVM model built with the most important 45 features (selected from a total of 129 features) was the optimal model. For this SVM model, the recognition accuracies of the training set and the testing set were 97.64% and 94.74%, respectively. Semi-supervised models for disease recognition were built based on the 45 effective features that were used for building the optimal SVM model. For the optimal semi-supervised models built with three ratios of labeled to unlabeled samples in the training set, the recognition accuracies of the training set and the testing set were both approximately 80%. The results indicated that image recognition of the four alfalfa leaf diseases can be implemented with high accuracy. This study provides a feasible solution for lesion image segmentation and image recognition of alfalfa leaf disease.
Gene/protein name recognition based on support vector machine using dictionary as features.

PubMed

Mitsumori, Tomohiro; Fation, Sevrani; Murata, Masaki; Doi, Kouichi; Doi, Hirohumi

2005-01-01

Automated information extraction from biomedical literature is important because a vast amount of biomedical literature has been published. Recognition of the biomedical named entities is the first step in information extraction. We developed an automated recognition system based on the SVM algorithm and evaluated it in Task 1.A of BioCreAtIvE, a competition for automated gene/protein name recognition. In the work presented here, our recognition system uses the feature set of the word, the part-of-speech (POS), the orthography, the prefix, the suffix, and the preceding class. We call these features "internal resource features", i.e., features that can be found in the training data. Additionally, we consider the features of matching against dictionaries to be external resource features. We investigated and evaluated the effect of these features as well as the effect of tuning the parameters of the SVM algorithm. We found that the dictionary matching features contributed slightly to the improvement in the performance of the f-score. We attribute this to the possibility that the dictionary matching features might overlap with other features in the current multiple feature setting. During SVM learning, each feature alone had a marginally positive effect on system performance. This supports the fact that the SVM algorithm is robust on the high dimensionality of the feature vector space and means that feature selection is not required.
Online Feature Transformation Learning for Cross-Domain Object Category Recognition.

PubMed

Zhang, Xuesong; Zhuang, Yan; Wang, Wei; Pedrycz, Witold

2017-06-09

In this paper, we introduce a new research problem termed online feature transformation learning in the context of multiclass object category recognition. The learning of a feature transformation is viewed as learning a global similarity metric function in an online manner. We first consider the problem of online learning a feature transformation matrix expressed in the original feature space and propose an online passive aggressive feature transformation algorithm. Then these original features are mapped to kernel space and an online single kernel feature transformation (OSKFT) algorithm is developed to learn a nonlinear feature transformation. Based on the OSKFT and the existing Hedge algorithm, a novel online multiple kernel feature transformation algorithm is also proposed, which can further improve the performance of online feature transformation learning in large-scale application. The classifier is trained with k nearest neighbor algorithm together with the learned similarity metric function. Finally, we experimentally examined the effect of setting different parameter values in the proposed algorithms and evaluate the model performance on several multiclass object recognition data sets. The experimental results demonstrate the validity and good performance of our methods on cross-domain and multiclass object recognition application.
iFER: facial expression recognition using automatically selected geometric eye and eyebrow features

NASA Astrophysics Data System (ADS)

Oztel, Ismail; Yolcu, Gozde; Oz, Cemil; Kazan, Serap; Bunyak, Filiz

2018-03-01

Facial expressions have an important role in interpersonal communications and estimation of emotional states or intentions. Automatic recognition of facial expressions has led to many practical applications and became one of the important topics in computer vision. We present a facial expression recognition system that relies on geometry-based features extracted from eye and eyebrow regions of the face. The proposed system detects keypoints on frontal face images and forms a feature set using geometric relationships among groups of detected keypoints. Obtained feature set is refined and reduced using the sequential forward selection (SFS) algorithm and fed to a support vector machine classifier to recognize five facial expression classes. The proposed system, iFER (eye-eyebrow only facial expression recognition), is robust to lower face occlusions that may be caused by beards, mustaches, scarves, etc. and lower face motion during speech production. Preliminary experiments on benchmark datasets produced promising results outperforming previous facial expression recognition studies using partial face features, and comparable results to studies using whole face information, only slightly lower by ˜ 2.5 % compared to the best whole face facial recognition system while using only ˜ 1 / 3 of the facial region.
Secondary iris recognition method based on local energy-orientation feature

NASA Astrophysics Data System (ADS)

Huo, Guang; Liu, Yuanning; Zhu, Xiaodong; Dong, Hongxing

2015-01-01

This paper proposes a secondary iris recognition based on local features. The application of the energy-orientation feature (EOF) by two-dimensional Gabor filter to the extraction of the iris goes before the first recognition by the threshold of similarity, which sets the whole iris database into two categories-a correctly recognized class and a class to be recognized. Therefore, the former are accepted and the latter are transformed by histogram to achieve an energy-orientation histogram feature (EOHF), which is followed by a second recognition with the chi-square distance. The experiment has proved that the proposed method, because of its higher correct recognition rate, could be designated as the most efficient and effective among its companion studies in iris recognition algorithms.
Face recognition system using multiple face model of hybrid Fourier feature under uncontrolled illumination variation.

PubMed

Hwang, Wonjun; Wang, Haitao; Kim, Hyunwoo; Kee, Seok-Cheol; Kim, Junmo

2011-04-01

The authors present a robust face recognition system for large-scale data sets taken under uncontrolled illumination variations. The proposed face recognition system consists of a novel illumination-insensitive preprocessing method, a hybrid Fourier-based facial feature extraction, and a score fusion scheme. First, in the preprocessing stage, a face image is transformed into an illumination-insensitive image, called an "integral normalized gradient image," by normalizing and integrating the smoothed gradients of a facial image. Then, for feature extraction of complementary classifiers, multiple face models based upon hybrid Fourier features are applied. The hybrid Fourier features are extracted from different Fourier domains in different frequency bandwidths, and then each feature is individually classified by linear discriminant analysis. In addition, multiple face models are generated by plural normalized face images that have different eye distances. Finally, to combine scores from multiple complementary classifiers, a log likelihood ratio-based score fusion scheme is applied. The proposed system using the face recognition grand challenge (FRGC) experimental protocols is evaluated; FRGC is a large available data set. Experimental results on the FRGC version 2.0 data sets have shown that the proposed method shows an average of 81.49% verification rate on 2-D face images under various environmental variations such as illumination changes, expression changes, and time elapses.
Appearance-based face recognition and light-fields.

PubMed

Gross, Ralph; Matthews, Iain; Baker, Simon

2004-04-01

Arguably the most important decision to be made when developing an object recognition algorithm is selecting the scene measurements or features on which to base the algorithm. In appearance-based object recognition, the features are chosen to be the pixel intensity values in an image of the object. These pixel intensities correspond directly to the radiance of light emitted from the object along certain rays in space. The set of all such radiance values over all possible rays is known as the plenoptic function or light-field. In this paper, we develop a theory of appearance-based object recognition from light-fields. This theory leads directly to an algorithm for face recognition across pose that uses as many images of the face as are available, from one upwards. All of the pixels, whichever image they come from, are treated equally and used to estimate the (eigen) light-field of the object. The eigen light-field is then used as the set of features on which to base recognition, analogously to how the pixel intensities are used in appearance-based face and object recognition.
Weighted score-level feature fusion based on Dempster-Shafer evidence theory for action recognition

NASA Astrophysics Data System (ADS)

Zhang, Guoliang; Jia, Songmin; Li, Xiuzhi; Zhang, Xiangyin

2018-01-01

The majority of human action recognition methods use multifeature fusion strategy to improve the classification performance, where the contribution of different features for specific action has not been paid enough attention. We present an extendible and universal weighted score-level feature fusion method using the Dempster-Shafer (DS) evidence theory based on the pipeline of bag-of-visual-words. First, the partially distinctive samples in the training set are selected to construct the validation set. Then, local spatiotemporal features and pose features are extracted from these samples to obtain evidence information. The DS evidence theory and the proposed rule of survival of the fittest are employed to achieve evidence combination and calculate optimal weight vectors of every feature type belonging to each action class. Finally, the recognition results are deduced via the weighted summation strategy. The performance of the established recognition framework is evaluated on Penn Action dataset and a subset of the joint-annotated human metabolome database (sub-JHMDB). The experiment results demonstrate that the proposed feature fusion method can adequately exploit the complementarity among multiple features and improve upon most of the state-of-the-art algorithms on Penn Action and sub-JHMDB datasets.
Chinese License Plates Recognition Method Based on A Robust and Efficient Feature Extraction and BPNN Algorithm

NASA Astrophysics Data System (ADS)

Zhang, Ming; Xie, Fei; Zhao, Jing; Sun, Rui; Zhang, Lei; Zhang, Yue

2018-04-01

The prosperity of license plate recognition technology has made great contribution to the development of Intelligent Transport System (ITS). In this paper, a robust and efficient license plate recognition method is proposed which is based on a combined feature extraction model and BPNN (Back Propagation Neural Network) algorithm. Firstly, the candidate region of the license plate detection and segmentation method is developed. Secondly, a new feature extraction model is designed considering three sets of features combination. Thirdly, the license plates classification and recognition method using the combined feature model and BPNN algorithm is presented. Finally, the experimental results indicate that the license plate segmentation and recognition both can be achieved effectively by the proposed algorithm. Compared with three traditional methods, the recognition accuracy of the proposed method has increased to 95.7% and the consuming time has decreased to 51.4ms.

Modeling Geometric-Temporal Context With Directional Pyramid Co-Occurrence for Action Recognition.

PubMed

Yuan, Chunfeng; Li, Xi; Hu, Weiming; Ling, Haibin; Maybank, Stephen J

2014-02-01

In this paper, we present a new geometric-temporal representation for visual action recognition based on local spatio-temporal features. First, we propose a modified covariance descriptor under the log-Euclidean Riemannian metric to represent the spatio-temporal cuboids detected in the video sequences. Compared with previously proposed covariance descriptors, our descriptor can be measured and clustered in Euclidian space. Second, to capture the geometric-temporal contextual information, we construct a directional pyramid co-occurrence matrix (DPCM) to describe the spatio-temporal distribution of the vector-quantized local feature descriptors extracted from a video. DPCM characterizes the co-occurrence statistics of local features as well as the spatio-temporal positional relationships among the concurrent features. These statistics provide strong descriptive power for action recognition. To use DPCM for action recognition, we propose a directional pyramid co-occurrence matching kernel to measure the similarity of videos. The proposed method achieves the state-of-the-art performance and improves on the recognition performance of the bag-of-visual-words (BOVWs) models by a large margin on six public data sets. For example, on the KTH data set, it achieves 98.78% accuracy while the BOVW approach only achieves 88.06%. On both Weizmann and UCF CIL data sets, the highest possible accuracy of 100% is achieved.
Recognition of blurred images by the method of moments.

PubMed

Flusser, J; Suk, T; Saic, S

1996-01-01

The article is devoted to the feature-based recognition of blurred images acquired by a linear shift-invariant imaging system against an image database. The proposed approach consists of describing images by features that are invariant with respect to blur and recognizing images in the feature space. The PSF identification and image restoration are not required. A set of symmetric blur invariants based on image moments is introduced. A numerical experiment is presented to illustrate the utilization of the invariants for blurred image recognition. Robustness of the features is also briefly discussed.
Rotation, scale, and translation invariant pattern recognition using feature extraction

NASA Astrophysics Data System (ADS)

Prevost, Donald; Doucet, Michel; Bergeron, Alain; Veilleux, Luc; Chevrette, Paul C.; Gingras, Denis J.

1997-03-01

A rotation, scale and translation invariant pattern recognition technique is proposed.It is based on Fourier- Mellin Descriptors (FMD). Each FMD is taken as an independent feature of the object, and a set of those features forms a signature. FMDs are naturally rotation invariant. Translation invariance is achieved through pre- processing. A proper normalization of the FMDs gives the scale invariance property. This approach offers the double advantage of providing invariant signatures of the objects, and a dramatic reduction of the amount of data to process. The compressed invariant feature signature is next presented to a multi-layered perceptron neural network. This final step provides some robustness to the classification of the signatures, enabling good recognition behavior under anamorphically scaled distortion. We also present an original feature extraction technique, adapted to optical calculation of the FMDs. A prototype optical set-up was built, and experimental results are presented.
Subject-specific and pose-oriented facial features for face recognition across poses.

PubMed

Lee, Ping-Han; Hsu, Gee-Sern; Wang, Yun-Wen; Hung, Yi-Ping

2012-10-01

Most face recognition scenarios assume that frontal faces or mug shots are available for enrollment to the database, faces of other poses are collected in the probe set. Given a face from the probe set, one needs to determine whether a match in the database exists. This is under the assumption that in forensic applications, most suspects have their mug shots available in the database, and face recognition aims at recognizing the suspects when their faces of various poses are captured by a surveillance camera. This paper considers a different scenario: given a face with multiple poses available, which may or may not include a mug shot, develop a method to recognize the face with poses different from those captured. That is, given two disjoint sets of poses of a face, one for enrollment and the other for recognition, this paper reports a method best for handling such cases. The proposed method includes feature extraction and classification. For feature extraction, we first cluster the poses of each subject's face in the enrollment set into a few pose classes and then decompose the appearance of the face in each pose class using Embedded Hidden Markov Model, which allows us to define a set of subject-specific and pose-priented (SSPO) facial components for each subject. For classification, an Adaboost weighting scheme is used to fuse the component classifiers with SSPO component features. The proposed method is proven to outperform other approaches, including a component-based classifier with local facial features cropped manually, in an extensive performance evaluation study.
Facial recognition using multisensor images based on localized kernel eigen spaces.

PubMed

Gundimada, Satyanadh; Asari, Vijayan K

2009-06-01

A feature selection technique along with an information fusion procedure for improving the recognition accuracy of a visual and thermal image-based facial recognition system is presented in this paper. A novel modular kernel eigenspaces approach is developed and implemented on the phase congruency feature maps extracted from the visual and thermal images individually. Smaller sub-regions from a predefined neighborhood within the phase congruency images of the training samples are merged to obtain a large set of features. These features are then projected into higher dimensional spaces using kernel methods. The proposed localized nonlinear feature selection procedure helps to overcome the bottlenecks of illumination variations, partial occlusions, expression variations and variations due to temperature changes that affect the visual and thermal face recognition techniques. AR and Equinox databases are used for experimentation and evaluation of the proposed technique. The proposed feature selection procedure has greatly improved the recognition accuracy for both the visual and thermal images when compared to conventional techniques. Also, a decision level fusion methodology is presented which along with the feature selection procedure has outperformed various other face recognition techniques in terms of recognition accuracy.
Low-contrast underwater living fish recognition using PCANet

NASA Astrophysics Data System (ADS)

Sun, Xin; Yang, Jianping; Wang, Changgang; Dong, Junyu; Wang, Xinhua

2018-04-01

Quantitative and statistical analysis of ocean creatures is critical to ecological and environmental studies. And living fish recognition is one of the most essential requirements for fishery industry. However, light attenuation and scattering phenomenon are present in the underwater environment, which makes underwater images low-contrast and blurry. This paper tries to design a robust framework for accurate fish recognition. The framework introduces a two stage PCA Network to extract abstract features from fish images. On a real-world fish recognition dataset, we use a linear SVM classifier and set penalty coefficients to conquer data unbalanced issue. Feature visualization results show that our method can avoid the feature distortion in boundary regions of underwater image. Experiments results show that the PCA Network can extract discriminate features and achieve promising recognition accuracy. The framework improves the recognition accuracy of underwater living fishes and can be easily applied to marine fishery industry.
Invariant-feature-based adaptive automatic target recognition in obscured 3D point clouds

NASA Astrophysics Data System (ADS)

Khuon, Timothy; Kershner, Charles; Mattei, Enrico; Alverio, Arnel; Rand, Robert

2014-06-01

Target recognition and classification in a 3D point cloud is a non-trivial process due to the nature of the data collected from a sensor system. The signal can be corrupted by noise from the environment, electronic system, A/D converter, etc. Therefore, an adaptive system with a desired tolerance is required to perform classification and recognition optimally. The feature-based pattern recognition algorithm architecture as described below is particularly devised for solving a single-sensor classification non-parametrically. Feature set is extracted from an input point cloud, normalized, and classifier a neural network classifier. For instance, automatic target recognition in an urban area would require different feature sets from one in a dense foliage area. The figure above (see manuscript) illustrates the architecture of the feature based adaptive signature extraction of 3D point cloud including LIDAR, RADAR, and electro-optical data. This network takes a 3D cluster and classifies it into a specific class. The algorithm is a supervised and adaptive classifier with two modes: the training mode and the performing mode. For the training mode, a number of novel patterns are selected from actual or artificial data. A particular 3D cluster is input to the network as shown above for the decision class output. The network consists of three sequential functional modules. The first module is for feature extraction that extracts the input cluster into a set of singular value features or feature vector. Then the feature vector is input into the feature normalization module to normalize and balance it before being fed to the neural net classifier for the classification. The neural net can be trained by actual or artificial novel data until each trained output reaches the declared output within the defined tolerance. In case new novel data is added after the neural net has been learned, the training is then resumed until the neural net has incrementally learned with the new novel data. The associative memory capability of the neural net enables the incremental learning. The back propagation algorithm or support vector machine can be utilized for the classification and recognition.
Cognitive and artificial representations in handwriting recognition

NASA Astrophysics Data System (ADS)

Lenaghan, Andrew P.; Malyan, Ron

1996-03-01

Both cognitive processes and artificial recognition systems may be characterized by the forms of representation they build and manipulate. This paper looks at how handwriting is represented in current recognition systems and the psychological evidence for its representation in the cognitive processes responsible for reading. Empirical psychological work on feature extraction in early visual processing is surveyed to show that a sound psychological basis for feature extraction exists and to describe the features this approach leads to. The first stage of the development of an architecture for a handwriting recognition system which has been strongly influenced by the psychological evidence for the cognitive processes and representations used in early visual processing, is reported. This architecture builds a number of parallel low level feature maps from raw data. These feature maps are thresholded and a region labeling algorithm is used to generate sets of features. Fuzzy logic is used to quantify the uncertainty in the presence of individual features.
Segmental Rescoring in Text Recognition

DTIC Science & Technology

2014-02-04

description relates to rescoring text hypotheses in text recognition based on segmental features. Offline printed text and handwriting recognition (OHR) can... Handwriting , College Park, Md., 2006, which is incorporated by reference here. For the set of training images 202, a character modeler 208 receives
3D facial expression recognition using maximum relevance minimum redundancy geometrical features

NASA Astrophysics Data System (ADS)

Rabiu, Habibu; Saripan, M. Iqbal; Mashohor, Syamsiah; Marhaban, Mohd Hamiruce

2012-12-01

In recent years, facial expression recognition (FER) has become an attractive research area, which besides the fundamental challenges, it poses, finds application in areas, such as human-computer interaction, clinical psychology, lie detection, pain assessment, and neurology. Generally the approaches to FER consist of three main steps: face detection, feature extraction and expression recognition. The recognition accuracy of FER hinges immensely on the relevance of the selected features in representing the target expressions. In this article, we present a person and gender independent 3D facial expression recognition method, using maximum relevance minimum redundancy geometrical features. The aim is to detect a compact set of features that sufficiently represents the most discriminative features between the target classes. Multi-class one-against-one SVM classifier was employed to recognize the seven facial expressions; neutral, happy, sad, angry, fear, disgust, and surprise. The average recognition accuracy of 92.2% was recorded. Furthermore, inter database homogeneity was investigated between two independent databases the BU-3DFE and UPM-3DFE the results showed a strong homogeneity between the two databases.
Feature selection for wearable smartphone-based human activity recognition with able bodied, elderly, and stroke patients.

PubMed

Capela, Nicole A; Lemaire, Edward D; Baddour, Natalie

2015-01-01

Human activity recognition (HAR), using wearable sensors, is a growing area with the potential to provide valuable information on patient mobility to rehabilitation specialists. Smartphones with accelerometer and gyroscope sensors are a convenient, minimally invasive, and low cost approach for mobility monitoring. HAR systems typically pre-process raw signals, segment the signals, and then extract features to be used in a classifier. Feature selection is a crucial step in the process to reduce potentially large data dimensionality and provide viable parameters to enable activity classification. Most HAR systems are customized to an individual research group, including a unique data set, classes, algorithms, and signal features. These data sets are obtained predominantly from able-bodied participants. In this paper, smartphone accelerometer and gyroscope sensor data were collected from populations that can benefit from human activity recognition: able-bodied, elderly, and stroke patients. Data from a consecutive sequence of 41 mobility tasks (18 different tasks) were collected for a total of 44 participants. Seventy-six signal features were calculated and subsets of these features were selected using three filter-based, classifier-independent, feature selection methods (Relief-F, Correlation-based Feature Selection, Fast Correlation Based Filter). The feature subsets were then evaluated using three generic classifiers (Naïve Bayes, Support Vector Machine, j48 Decision Tree). Common features were identified for all three populations, although the stroke population subset had some differences from both able-bodied and elderly sets. Evaluation with the three classifiers showed that the feature subsets produced similar or better accuracies than classification with the entire feature set. Therefore, since these feature subsets are classifier-independent, they should be useful for developing and improving HAR systems across and within populations.
Feature Selection for Wearable Smartphone-Based Human Activity Recognition with Able bodied, Elderly, and Stroke Patients

PubMed Central

2015-01-01

Human activity recognition (HAR), using wearable sensors, is a growing area with the potential to provide valuable information on patient mobility to rehabilitation specialists. Smartphones with accelerometer and gyroscope sensors are a convenient, minimally invasive, and low cost approach for mobility monitoring. HAR systems typically pre-process raw signals, segment the signals, and then extract features to be used in a classifier. Feature selection is a crucial step in the process to reduce potentially large data dimensionality and provide viable parameters to enable activity classification. Most HAR systems are customized to an individual research group, including a unique data set, classes, algorithms, and signal features. These data sets are obtained predominantly from able-bodied participants. In this paper, smartphone accelerometer and gyroscope sensor data were collected from populations that can benefit from human activity recognition: able-bodied, elderly, and stroke patients. Data from a consecutive sequence of 41 mobility tasks (18 different tasks) were collected for a total of 44 participants. Seventy-six signal features were calculated and subsets of these features were selected using three filter-based, classifier-independent, feature selection methods (Relief-F, Correlation-based Feature Selection, Fast Correlation Based Filter). The feature subsets were then evaluated using three generic classifiers (Naïve Bayes, Support Vector Machine, j48 Decision Tree). Common features were identified for all three populations, although the stroke population subset had some differences from both able-bodied and elderly sets. Evaluation with the three classifiers showed that the feature subsets produced similar or better accuracies than classification with the entire feature set. Therefore, since these feature subsets are classifier-independent, they should be useful for developing and improving HAR systems across and within populations. PMID:25885272
Comparing Pattern Recognition Feature Sets for Sorting Triples in the FIRST Database

NASA Astrophysics Data System (ADS)

Proctor, D. D.

2006-07-01

Pattern recognition techniques have been used with increasing success for coping with the tremendous amounts of data being generated by automated surveys. Usually this process involves construction of training sets, the typical examples of data with known classifications. Given a feature set, along with the training set, statistical methods can be employed to generate a classifier. The classifier is then applied to process the remaining data. Feature set selection, however, is still an issue. This paper presents techniques developed for accommodating data for which a substantive portion of the training set cannot be classified unambiguously, a typical case for low-resolution data. Significance tests on the sort-ordered, sample-size-normalized vote distribution of an ensemble of decision trees is introduced as a method of evaluating relative quality of feature sets. The technique is applied to comparing feature sets for sorting a particular radio galaxy morphology, bent-doubles, from the Faint Images of the Radio Sky at Twenty Centimeters (FIRST) database. Also examined are alternative functional forms for feature sets. Associated standard deviations provide the means to evaluate the effect of the number of folds, the number of classifiers per fold, and the sample size on the resulting classifications. The technique also may be applied to situations for which, although accurate classifications are available, the feature set is clearly inadequate, but is desired nonetheless to make the best of available information.
Generalizations of the subject-independent feature set for music-induced emotion recognition.

PubMed

Lin, Yuan-Pin; Chen, Jyh-Horng; Duann, Jeng-Ren; Lin, Chin-Teng; Jung, Tzyy-Ping

2011-01-01

Electroencephalogram (EEG)-based emotion recognition has been an intensely growing field. Yet, how to achieve acceptable accuracy on a practical system with as fewer electrodes as possible is less concerned. This study evaluates a set of subject-independent features, based on differential power asymmetry of symmetric electrode pairs [1], with emphasis on its applicability to subject variability in music-induced emotion classification problem. Results of this study have evidently validated the feasibility of using subject-independent EEG features to classify four emotional states with acceptable accuracy in second-scale temporal resolution. These features could be generalized across subjects to detect emotion induced by music excerpts not limited to the music database that was used to derive the emotion-specific features.
EEG-based recognition of video-induced emotions: selecting subject-independent feature set.

PubMed

Kortelainen, Jukka; Seppänen, Tapio

2013-01-01

Emotions are fundamental for everyday life affecting our communication, learning, perception, and decision making. Including emotions into the human-computer interaction (HCI) could be seen as a significant step forward offering a great potential for developing advanced future technologies. While the electrical activity of the brain is affected by emotions, offers electroencephalogram (EEG) an interesting channel to improve the HCI. In this paper, the selection of subject-independent feature set for EEG-based emotion recognition is studied. We investigate the effect of different feature sets in classifying person's arousal and valence while watching videos with emotional content. The classification performance is optimized by applying a sequential forward floating search algorithm for feature selection. The best classification rate (65.1% for arousal and 63.0% for valence) is obtained with a feature set containing power spectral features from the frequency band of 1-32 Hz. The proposed approach substantially improves the classification rate reported in the literature. In future, further analysis of the video-induced EEG changes including the topographical differences in the spectral features is needed.
Software for Partly Automated Recognition of Targets

NASA Technical Reports Server (NTRS)

Opitz, David; Blundell, Stuart; Bain, William; Morris, Matthew; Carlson, Ian; Mangrich, Mark; Selinsky, T.

2002-01-01

The Feature Analyst is a computer program for assisted (partially automated) recognition of targets in images. This program was developed to accelerate the processing of high-resolution satellite image data for incorporation into geographic information systems (GIS). This program creates an advanced user interface that embeds proprietary machine-learning algorithms in commercial image-processing and GIS software. A human analyst provides samples of target features from multiple sets of data, then the software develops a data-fusion model that automatically extracts the remaining features from selected sets of data. The program thus leverages the natural ability of humans to recognize objects in complex scenes, without requiring the user to explain the human visual recognition process by means of lengthy software. Two major subprograms are the reactive agent and the thinking agent. The reactive agent strives to quickly learn the user's tendencies while the user is selecting targets and to increase the user's productivity by immediately suggesting the next set of pixels that the user may wish to select. The thinking agent utilizes all available resources, taking as much time as needed, to produce the most accurate autonomous feature-extraction model possible.
Biologically inspired emotion recognition from speech

NASA Astrophysics Data System (ADS)

Caponetti, Laura; Buscicchio, Cosimo Alessandro; Castellano, Giovanna

2011-12-01

Emotion recognition has become a fundamental task in human-computer interaction systems. In this article, we propose an emotion recognition approach based on biologically inspired methods. Specifically, emotion classification is performed using a long short-term memory (LSTM) recurrent neural network which is able to recognize long-range dependencies between successive temporal patterns. We propose to represent data using features derived from two different models: mel-frequency cepstral coefficients (MFCC) and the Lyon cochlear model. In the experimental phase, results obtained from the LSTM network and the two different feature sets are compared, showing that features derived from the Lyon cochlear model give better recognition results in comparison with those obtained with the traditional MFCC representation.
Palm vein recognition based on directional empirical mode decomposition

NASA Astrophysics Data System (ADS)

Lee, Jen-Chun; Chang, Chien-Ping; Chen, Wei-Kuei

2014-04-01

Directional empirical mode decomposition (DEMD) has recently been proposed to make empirical mode decomposition suitable for the processing of texture analysis. Using DEMD, samples are decomposed into a series of images, referred to as two-dimensional intrinsic mode functions (2-D IMFs), from finer to large scale. A DEMD-based 2 linear discriminant analysis (LDA) for palm vein recognition is proposed. The proposed method progresses through three steps: (i) a set of 2-D IMF features of various scale and orientation are extracted using DEMD, (ii) the 2LDA method is then applied to reduce the dimensionality of the feature space in both the row and column directions, and (iii) the nearest neighbor classifier is used for classification. We also propose two strategies for using the set of 2-D IMF features: ensemble DEMD vein representation (EDVR) and multichannel DEMD vein representation (MDVR). In experiments using palm vein databases, the proposed MDVR-based 2LDA method achieved recognition accuracy of 99.73%, thereby demonstrating its feasibility for palm vein recognition.
Cross-domain expression recognition based on sparse coding and transfer learning

NASA Astrophysics Data System (ADS)

Yang, Yong; Zhang, Weiyi; Huang, Yong

2017-05-01

Traditional facial expression recognition methods usually assume that the training set and the test set are independent and identically distributed. However, in actual expression recognition applications, the conditions of independent and identical distribution are hardly satisfied for the training set and test set because of the difference of light, shade, race and so on. In order to solve this problem and improve the performance of expression recognition in the actual applications, a novel method based on transfer learning and sparse coding is applied to facial expression recognition. First of all, a common primitive model, that is, the dictionary is learnt. Then, based on the idea of transfer learning, the learned primitive pattern is transferred to facial expression and the corresponding feature representation is obtained by sparse coding. The experimental results in CK +, JAFFE and NVIE database shows that the transfer learning based on sparse coding method can effectively improve the expression recognition rate in the cross-domain expression recognition task and is suitable for the practical facial expression recognition applications.
Combination of minimum enclosing balls classifier with SVM in coal-rock recognition.

PubMed

Song, QingJun; Jiang, HaiYan; Song, Qinghui; Zhao, XieGuang; Wu, Xiaoxuan

2017-01-01

Top-coal caving technology is a productive and efficient method in modern mechanized coal mining, the study of coal-rock recognition is key to realizing automation in comprehensive mechanized coal mining. In this paper we propose a new discriminant analysis framework for coal-rock recognition. In the framework, a data acquisition model with vibration and acoustic signals is designed and the caving dataset with 10 feature variables and three classes is got. And the perfect combination of feature variables can be automatically decided by using the multi-class F-score (MF-Score) feature selection. In terms of nonlinear mapping in real-world optimization problem, an effective minimum enclosing ball (MEB) algorithm plus Support vector machine (SVM) is proposed for rapid detection of coal-rock in the caving process. In particular, we illustrate how to construct MEB-SVM classifier in coal-rock recognition which exhibit inherently complex distribution data. The proposed method is examined on UCI data sets and the caving dataset, and compared with some new excellent SVM classifiers. We conduct experiments with accuracy and Friedman test for comparison of more classifiers over multiple on the UCI data sets. Experimental results demonstrate that the proposed algorithm has good robustness and generalization ability. The results of experiments on the caving dataset show the better performance which leads to a promising feature selection and multi-class recognition in coal-rock recognition.

Combination of minimum enclosing balls classifier with SVM in coal-rock recognition

PubMed Central

Song, QingJun; Jiang, HaiYan; Song, Qinghui; Zhao, XieGuang; Wu, Xiaoxuan

2017-01-01

Top-coal caving technology is a productive and efficient method in modern mechanized coal mining, the study of coal-rock recognition is key to realizing automation in comprehensive mechanized coal mining. In this paper we propose a new discriminant analysis framework for coal-rock recognition. In the framework, a data acquisition model with vibration and acoustic signals is designed and the caving dataset with 10 feature variables and three classes is got. And the perfect combination of feature variables can be automatically decided by using the multi-class F-score (MF-Score) feature selection. In terms of nonlinear mapping in real-world optimization problem, an effective minimum enclosing ball (MEB) algorithm plus Support vector machine (SVM) is proposed for rapid detection of coal-rock in the caving process. In particular, we illustrate how to construct MEB-SVM classifier in coal-rock recognition which exhibit inherently complex distribution data. The proposed method is examined on UCI data sets and the caving dataset, and compared with some new excellent SVM classifiers. We conduct experiments with accuracy and Friedman test for comparison of more classifiers over multiple on the UCI data sets. Experimental results demonstrate that the proposed algorithm has good robustness and generalization ability. The results of experiments on the caving dataset show the better performance which leads to a promising feature selection and multi-class recognition in coal-rock recognition. PMID:28937987
Gaussian mixture models-based ship target recognition algorithm in remote sensing infrared images

NASA Astrophysics Data System (ADS)

Yao, Shoukui; Qin, Xiaojuan

2018-02-01

Since the resolution of remote sensing infrared images is low, the features of ship targets become unstable. The issue of how to recognize ships with fuzzy features is an open problem. In this paper, we propose a novel ship target recognition algorithm based on Gaussian mixture models (GMMs). In the proposed algorithm, there are mainly two steps. At the first step, the Hu moments of these ship target images are calculated, and the GMMs are trained on the moment features of ships. At the second step, the moment feature of each ship image is assigned to the trained GMMs for recognition. Because of the scale, rotation, translation invariance property of Hu moments and the power feature-space description ability of GMMs, the GMMs-based ship target recognition algorithm can recognize ship reliably. Experimental results of a large simulating image set show that our approach is effective in distinguishing different ship types, and obtains a satisfactory ship recognition performance.
Effectiveness of feature and classifier algorithms in character recognition systems

NASA Astrophysics Data System (ADS)

Wilson, Charles L.

1993-04-01

At the first Census Optical Character Recognition Systems Conference, NIST generated accuracy data for more than character recognition systems. Most systems were tested on the recognition of isolated digits and upper and lower case alphabetic characters. The recognition experiments were performed on sample sizes of 58,000 digits, and 12,000 upper and lower case alphabetic characters. The algorithms used by the 26 conference participants included rule-based methods, image-based methods, statistical methods, and neural networks. The neural network methods included Multi-Layer Perceptron's, Learned Vector Quantitization, Neocognitrons, and cascaded neural networks. In this paper 11 different systems are compared using correlations between the answers of different systems, comparing the decrease in error rate as a function of confidence of recognition, and comparing the writer dependence of recognition. This comparison shows that methods that used different algorithms for feature extraction and recognition performed with very high levels of correlation. This is true for neural network systems, hybrid systems, and statistically based systems, and leads to the conclusion that neural networks have not yet demonstrated a clear superiority to more conventional statistical methods. Comparison of these results with the models of Vapnick (for estimation problems), MacKay (for Bayesian statistical models), Moody (for effective parameterization), and Boltzmann models (for information content) demonstrate that as the limits of training data variance are approached, all classifier systems have similar statistical properties. The limiting condition can only be approached for sufficiently rich feature sets because the accuracy limit is controlled by the available information content of the training set, which must pass through the feature extraction process prior to classification.
Subauditory Speech Recognition based on EMG/EPG Signals

NASA Technical Reports Server (NTRS)

Jorgensen, Charles; Lee, Diana Dee; Agabon, Shane; Lau, Sonie (Technical Monitor)

2003-01-01

Sub-vocal electromyogram/electro palatogram (EMG/EPG) signal classification is demonstrated as a method for silent speech recognition. Recorded electrode signals from the larynx and sublingual areas below the jaw are noise filtered and transformed into features using complex dual quad tree wavelet transforms. Feature sets for six sub-vocally pronounced words are trained using a trust region scaled conjugate gradient neural network. Real time signals for previously unseen patterns are classified into categories suitable for primitive control of graphic objects. Feature construction, recognition accuracy and an approach for extension of the technique to a variety of real world application areas are presented.
Analysis of the IJCNN 2011 UTL Challenge

DTIC Science & Technology

2012-01-13

large datasets from various application domains: handwriting recognition, image recognition, video processing, text processing, and ecology. The goal...http //clopinet.com/ul). We made available large datasets from various application domains handwriting recognition, image recognition, video...evaluation sets consist of 4096 examples each. Dataset Domain Features Sparsity Devel. Transf. AVICENNA Handwriting 120 0% 150205 50000 HARRY Video 5000 98.1
Deep Learning Methods for Underwater Target Feature Extraction and Recognition

PubMed Central

Peng, Yuan; Qiu, Mengran; Shi, Jianfei; Liu, Liangliang

2018-01-01

The classification and recognition technology of underwater acoustic signal were always an important research content in the field of underwater acoustic signal processing. Currently, wavelet transform, Hilbert-Huang transform, and Mel frequency cepstral coefficients are used as a method of underwater acoustic signal feature extraction. In this paper, a method for feature extraction and identification of underwater noise data based on CNN and ELM is proposed. An automatic feature extraction method of underwater acoustic signals is proposed using depth convolution network. An underwater target recognition classifier is based on extreme learning machine. Although convolution neural networks can execute both feature extraction and classification, their function mainly relies on a full connection layer, which is trained by gradient descent-based; the generalization ability is limited and suboptimal, so an extreme learning machine (ELM) was used in classification stage. Firstly, CNN learns deep and robust features, followed by the removing of the fully connected layers. Then ELM fed with the CNN features is used as the classifier to conduct an excellent classification. Experiments on the actual data set of civil ships obtained 93.04% recognition rate; compared to the traditional Mel frequency cepstral coefficients and Hilbert-Huang feature, recognition rate greatly improved. PMID:29780407
Learning and Recognition of Clothing Genres From Full-Body Images.

PubMed

Hidayati, Shintami C; You, Chuang-Wen; Cheng, Wen-Huang; Hua, Kai-Lung

2018-05-01

According to the theory of clothing design, the genres of clothes can be recognized based on a set of visually differentiable style elements, which exhibit salient features of visual appearance and reflect high-level fashion styles for better describing clothing genres. Instead of using less-discriminative low-level features or ambiguous keywords to identify clothing genres, we proposed a novel approach for automatically classifying clothing genres based on the visually differentiable style elements. A set of style elements, that are crucial for recognizing specific visual styles of clothing genres, were identified based on the clothing design theory. In addition, the corresponding salient visual features of each style element were identified and formulated with variables that can be computationally derived with various computer vision algorithms. To evaluate the performance of our algorithm, a dataset containing 3250 full-body shots crawled from popular online stores was built. Recognition results show that our proposed algorithms achieved promising overall precision, recall, and -score of 88.76%, 88.53%, and 88.64% for recognizing upperwear genres, and 88.21%, 88.17%, and 88.19% for recognizing lowerwear genres, respectively. The effectiveness of each style element and its visual features on recognizing clothing genres was demonstrated through a set of experiments involving different sets of style elements or features. In summary, our experimental results demonstrate the effectiveness of the proposed method in clothing genre recognition.
RecceMan: an interactive recognition assistance for image-based reconnaissance: synergistic effects of human perception and computational methods for object recognition, identification, and infrastructure analysis

NASA Astrophysics Data System (ADS)

El Bekri, Nadia; Angele, Susanne; Ruckhäberle, Martin; Peinsipp-Byma, Elisabeth; Haelke, Bruno

2015-10-01

This paper introduces an interactive recognition assistance system for imaging reconnaissance. This system supports aerial image analysts on missions during two main tasks: Object recognition and infrastructure analysis. Object recognition concentrates on the classification of one single object. Infrastructure analysis deals with the description of the components of an infrastructure and the recognition of the infrastructure type (e.g. military airfield). Based on satellite or aerial images, aerial image analysts are able to extract single object features and thereby recognize different object types. It is one of the most challenging tasks in the imaging reconnaissance. Currently, there are no high potential ATR (automatic target recognition) applications available, as consequence the human observer cannot be replaced entirely. State-of-the-art ATR applications cannot assume in equal measure human perception and interpretation. Why is this still such a critical issue? First, cluttered and noisy images make it difficult to automatically extract, classify and identify object types. Second, due to the changed warfare and the rise of asymmetric threats it is nearly impossible to create an underlying data set containing all features, objects or infrastructure types. Many other reasons like environmental parameters or aspect angles compound the application of ATR supplementary. Due to the lack of suitable ATR procedures, the human factor is still important and so far irreplaceable. In order to use the potential benefits of the human perception and computational methods in a synergistic way, both are unified in an interactive assistance system. RecceMan® (Reconnaissance Manual) offers two different modes for aerial image analysts on missions: the object recognition mode and the infrastructure analysis mode. The aim of the object recognition mode is to recognize a certain object type based on the object features that originated from the image signatures. The infrastructure analysis mode pursues the goal to analyze the function of the infrastructure. The image analyst extracts visually certain target object signatures, assigns them to corresponding object features and is finally able to recognize the object type. The system offers him the possibility to assign the image signatures to features given by sample images. The underlying data set contains a wide range of objects features and object types for different domains like ships or land vehicles. Each domain has its own feature tree developed by aerial image analyst experts. By selecting the corresponding features, the possible solution set of objects is automatically reduced and matches only the objects that contain the selected features. Moreover, we give an outlook of current research in the field of ground target analysis in which we deal with partly automated methods to extract image signatures and assign them to the corresponding features. This research includes methods for automatically determining the orientation of an object and geometric features like width and length of the object. This step enables to reduce automatically the possible object types offered to the image analyst by the interactive recognition assistance system.
Gesture recognition for smart home applications using portable radar sensors.

PubMed

Wan, Qian; Li, Yiran; Li, Changzhi; Pal, Ranadip

2014-01-01

In this article, we consider the design of a human gesture recognition system based on pattern recognition of signatures from a portable smart radar sensor. Powered by AAA batteries, the smart radar sensor operates in the 2.4 GHz industrial, scientific and medical (ISM) band. We analyzed the feature space using principle components and application-specific time and frequency domain features extracted from radar signals for two different sets of gestures. We illustrate that a nearest neighbor based classifier can achieve greater than 95% accuracy for multi class classification using 10 fold cross validation when features are extracted based on magnitude differences and Doppler shifts as compared to features extracted through orthogonal transformations. The reported results illustrate the potential of intelligent radars integrated with a pattern recognition system for high accuracy smart home and health monitoring purposes.
An audiovisual emotion recognition system

NASA Astrophysics Data System (ADS)

Han, Yi; Wang, Guoyin; Yang, Yong; He, Kun

2007-12-01

Human emotions could be expressed by many bio-symbols. Speech and facial expression are two of them. They are both regarded as emotional information which is playing an important role in human-computer interaction. Based on our previous studies on emotion recognition, an audiovisual emotion recognition system is developed and represented in this paper. The system is designed for real-time practice, and is guaranteed by some integrated modules. These modules include speech enhancement for eliminating noises, rapid face detection for locating face from background image, example based shape learning for facial feature alignment, and optical flow based tracking algorithm for facial feature tracking. It is known that irrelevant features and high dimensionality of the data can hurt the performance of classifier. Rough set-based feature selection is a good method for dimension reduction. So 13 speech features out of 37 ones and 10 facial features out of 33 ones are selected to represent emotional information, and 52 audiovisual features are selected due to the synchronization when speech and video fused together. The experiment results have demonstrated that this system performs well in real-time practice and has high recognition rate. Our results also show that the work in multimodules fused recognition will become the trend of emotion recognition in the future.
A DFT-Based Method of Feature Extraction for Palmprint Recognition

NASA Astrophysics Data System (ADS)

Choge, H. Kipsang; Karungaru, Stephen G.; Tsuge, Satoru; Fukumi, Minoru

Over the last quarter century, research in biometric systems has developed at a breathtaking pace and what started with the focus on the fingerprint has now expanded to include face, voice, iris, and behavioral characteristics such as gait. Palmprint is one of the most recent additions, and is currently the subject of great research interest due to its inherent uniqueness, stability, user-friendliness and ease of acquisition. This paper describes an effective and procedurally simple method of palmprint feature extraction specifically for palmprint recognition, although verification experiments are also conducted. This method takes advantage of the correspondences that exist between prominent palmprint features or objects in the spatial domain with those in the frequency or Fourier domain. Multi-dimensional feature vectors are formed by extracting a GA-optimized set of points from the 2-D Fourier spectrum of the palmprint images. The feature vectors are then used for palmprint recognition, before and after dimensionality reduction via the Karhunen-Loeve Transform (KLT). Experiments performed using palmprint images from the ‘PolyU Palmprint Database’ indicate that using a compact set of DFT coefficients, combined with KLT and data preprocessing, produces a recognition accuracy of more than 98% and can provide a fast and effective technique for personal identification.
Information Theory for Gabor Feature Selection for Face Recognition

NASA Astrophysics Data System (ADS)

Shen, Linlin; Bai, Li

2006-12-01

A discriminative and robust feature—kernel enhanced informative Gabor feature—is proposed in this paper for face recognition. Mutual information is applied to select a set of informative and nonredundant Gabor features, which are then further enhanced by kernel methods for recognition. Compared with one of the top performing methods in the 2004 Face Verification Competition (FVC2004), our methods demonstrate a clear advantage over existing methods in accuracy, computation efficiency, and memory cost. The proposed method has been fully tested on the FERET database using the FERET evaluation protocol. Significant improvements on three of the test data sets are observed. Compared with the classical Gabor wavelet-based approaches using a huge number of features, our method requires less than 4 milliseconds to retrieve a few hundreds of features. Due to the substantially reduced feature dimension, only 4 seconds are required to recognize 200 face images. The paper also unified different Gabor filter definitions and proposed a training sample generation algorithm to reduce the effects caused by unbalanced number of samples available in different classes.
Foreign Language Analysis and Recognition (FLARe)

DTIC Science & Technology

2016-10-08

10 7 Chinese CER ...Rates ( CERs ) were obtained with each feature set: (1) 19.2%, (2) 17.3%, and (3) 15.3%. Based on these results, a GMM-HMM speech recognition system...These systems were evaluated on the HUB4 and HKUST test partitions. Table 7 shows the CER obtained on each test set. Whereas including the HKUST data
Software for Partly Automated Recognition of Targets

NASA Technical Reports Server (NTRS)

Opitz, David; Blundell, Stuart; Bain, William; Morris, Matthew; Carlson, Ian; Mangrich, Mark

2003-01-01

The Feature Analyst is a computer program for assisted (partially automated) recognition of targets in images. This program was developed to accelerate the processing of high-resolution satellite image data for incorporation into geographic information systems (GIS). This program creates an advanced user interface that embeds proprietary machine-learning algorithms in commercial image-processing and GIS software. A human analyst provides samples of target features from multiple sets of data, then the software develops a data-fusion model that automatically extracts the remaining features from selected sets of data. The program thus leverages the natural ability of humans to recognize objects in complex scenes, without requiring the user to explain the human visual recognition process by means of lengthy software. Two major subprograms are the reactive agent and the thinking agent. The reactive agent strives to quickly learn the user s tendencies while the user is selecting targets and to increase the user s productivity by immediately suggesting the next set of pixels that the user may wish to select. The thinking agent utilizes all available resources, taking as much time as needed, to produce the most accurate autonomous feature-extraction model possible.
Facial soft biometric features for forensic face recognition.

PubMed

Tome, Pedro; Vera-Rodriguez, Ruben; Fierrez, Julian; Ortega-Garcia, Javier

2015-12-01

This paper proposes a functional feature-based approach useful for real forensic caseworks, based on the shape, orientation and size of facial traits, which can be considered as a soft biometric approach. The motivation of this work is to provide a set of facial features, which can be understood by non-experts such as judges and support the work of forensic examiners who, in practice, carry out a thorough manual comparison of face images paying special attention to the similarities and differences in shape and size of various facial traits. This new approach constitutes a tool that automatically converts a set of facial landmarks to a set of features (shape and size) corresponding to facial regions of forensic value. These features are furthermore evaluated in a population to generate statistics to support forensic examiners. The proposed features can also be used as additional information that can improve the performance of traditional face recognition systems. These features follow the forensic methodology and are obtained in a continuous and discrete manner from raw images. A statistical analysis is also carried out to study the stability, discrimination power and correlation of the proposed facial features on two realistic databases: MORPH and ATVS Forensic DB. Finally, the performance of both continuous and discrete features is analyzed using different similarity measures. Experimental results show high discrimination power and good recognition performance, especially for continuous features. A final fusion of the best systems configurations achieves rank 10 match results of 100% for ATVS database and 75% for MORPH database demonstrating the benefits of using this information in practice. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Simulation of millimeter-wave body images and its application to biometric recognition

NASA Astrophysics Data System (ADS)

Moreno-Moreno, Miriam; Fierrez, Julian; Vera-Rodriguez, Ruben; Parron, Josep

2012-06-01

One of the emerging applications of the millimeter-wave imaging technology is its use in biometric recognition. This is mainly due to some properties of the millimeter-waves such as their ability to penetrate through clothing and other occlusions, their low obtrusiveness when collecting the image and the fact that they are harmless to health. In this work we first describe the generation of a database comprising 1200 synthetic images at 94 GHz obtained from the body of 50 people. Then we extract a small set of distance-based features from each image and select the best feature subsets for person recognition using the SFFS feature selection algorithm. Finally these features are used in body geometry authentication obtaining promising results.
An Individual Finger Gesture Recognition System Based on Motion-Intent Analysis Using Mechanomyogram Signal

PubMed Central

Ding, Huijun; He, Qing; Zhou, Yongjin; Dan, Guo; Cui, Song

2017-01-01

Motion-intent-based finger gesture recognition systems are crucial for many applications such as prosthesis control, sign language recognition, wearable rehabilitation system, and human–computer interaction. In this article, a motion-intent-based finger gesture recognition system is designed to correctly identify the tapping of every finger for the first time. Two auto-event annotation algorithms are firstly applied and evaluated for detecting the finger tapping frame. Based on the truncated signals, the Wavelet packet transform (WPT) coefficients are calculated and compressed as the features, followed by a feature selection method that is able to improve the performance by optimizing the feature set. Finally, three popular classifiers including naive Bayes (NBC), K-nearest neighbor (KNN), and support vector machine (SVM) are applied and evaluated. The recognition accuracy can be achieved up to 94%. The design and the architecture of the system are presented with full system characterization results. PMID:29167655
A dynamical pattern recognition model of gamma activity in auditory cortex

PubMed Central

Zavaglia, M.; Canolty, R.T.; Schofield, T.M.; Leff, A.P.; Ursino, M.; Knight, R.T.; Penny, W.D.

2012-01-01

This paper describes a dynamical process which serves both as a model of temporal pattern recognition in the brain and as a forward model of neuroimaging data. This process is considered at two separate levels of analysis: the algorithmic and implementation levels. At an algorithmic level, recognition is based on the use of Occurrence Time features. Using a speech digit database we show that for noisy recognition environments, these features rival standard cepstral coefficient features. At an implementation level, the model is defined using a Weakly Coupled Oscillator (WCO) framework and uses a transient synchronization mechanism to signal a recognition event. In a second set of experiments, we use the strength of the synchronization event to predict the high gamma (75–150 Hz) activity produced by the brain in response to word versus non-word stimuli. Quantitative model fits allow us to make inferences about parameters governing pattern recognition dynamics in the brain. PMID:22327049
[Several mechanisms of visual gnosis disorders in local brain lesions].

PubMed

Meerson, Ia A

1981-01-01

The object of the studies were peculiarities of recognizing visual images by patients with local cerebral lesions under conditions of incomplete sets of the image features, disjunction of the latter, distortion of their spatial arrangement, and unusual spatial orientation of the image as a whole. It was found that elimination of even one essential feature sharply hampered the recognition of the image both by healthy individuals (control), and patients with extraoccipital lesions, whereas elimination of several nonessential features only slowed down the process. In distinction from this the difficulties of the recognition of incomplete images by patients with occipital lesions were directly proportional to the number of the eliminated features irrespective of the latters' significance, i.e. these patients were unable to evaluate the hierarchy of the features. The recognition process in these patients were followed the way of scanning individual features. The reaccumulation and summation. The recognition of the fragmental, spatially distorted and unusually oriented images was found to be affected selectively in patients with parietal lobe affections. The patients with occipital lesions recognized such images practically as good as the ordinary ones.
A mechatronics platform to study prosthetic hand control using EMG signals.

PubMed

Geethanjali, P

2016-09-01

In this paper, a low-cost mechatronics platform for the design and development of robotic hands as well as a surface electromyogram (EMG) pattern recognition system is proposed. This paper also explores various EMG classification techniques using a low-cost electronics system in prosthetic hand applications. The proposed platform involves the development of a four channel EMG signal acquisition system; pattern recognition of acquired EMG signals; and development of a digital controller for a robotic hand. Four-channel surface EMG signals, acquired from ten healthy subjects for six different movements of the hand, were used to analyse pattern recognition in prosthetic hand control. Various time domain features were extracted and grouped into five ensembles to compare the influence of features in feature-selective classifiers (SLR) with widely considered non-feature-selective classifiers, such as neural networks (NN), linear discriminant analysis (LDA) and support vector machines (SVM) applied with different kernels. The results divulged that the average classification accuracy of the SVM, with a linear kernel function, outperforms other classifiers with feature ensembles, Hudgin's feature set and auto regression (AR) coefficients. However, the slight improvement in classification accuracy of SVM incurs more processing time and memory space in the low-level controller. The Kruskal-Wallis (KW) test also shows that there is no significant difference in the classification performance of SLR with Hudgin's feature set to that of SVM with Hudgin's features along with AR coefficients. In addition, the KW test shows that SLR was found to be better in respect to computation time and memory space, which is vital in a low-level controller. Similar to SVM, with a linear kernel function, other non-feature selective LDA and NN classifiers also show a slight improvement in performance using twice the features but with the drawback of increased memory space requirement and time. This prototype facilitated the study of various issues of pattern recognition and identified an efficient classifier, along with a feature ensemble, in the implementation of EMG controlled prosthetic hands in a laboratory setting at low-cost. This platform may help to motivate and facilitate prosthetic hand research in developing countries.

Emotion-independent face recognition

NASA Astrophysics Data System (ADS)

De Silva, Liyanage C.; Esther, Kho G. P.

2000-12-01

Current face recognition techniques tend to work well when recognizing faces under small variations in lighting, facial expression and pose, but deteriorate under more extreme conditions. In this paper, a face recognition system to recognize faces of known individuals, despite variations in facial expression due to different emotions, is developed. The eigenface approach is used for feature extraction. Classification methods include Euclidean distance, back propagation neural network and generalized regression neural network. These methods yield 100% recognition accuracy when the training database is representative, containing one image representing the peak expression for each emotion of each person apart from the neutral expression. The feature vectors used for comparison in the Euclidean distance method and for training the neural network must be all the feature vectors of the training set. These results are obtained for a face database consisting of only four persons.
Mathematical morphology-based shape feature analysis for Chinese character recognition systems

NASA Astrophysics Data System (ADS)

Pai, Tun-Wen; Shyu, Keh-Hwa; Chen, Ling-Fan; Tai, Gwo-Chin

1995-04-01

This paper proposes an efficient technique of shape feature extraction based on the application of mathematical morphology theory. A new shape complexity index for preclassification of machine printed Chinese Character Recognition (CCR) is also proposed. For characters represented in different fonts/sizes or in a low resolution environment, a more stable local feature such as shape structure is preferred for character recognition. Morphological valley extraction filters are applied to extract the protrusive strokes from four sides of an input Chinese character. The number of extracted local strokes reflects the shape complexity of each side. These shape features of characters are encoded as corresponding shape complexity indices. Based on the shape complexity index, data base is able to be classified into 16 groups prior to recognition procedures. The performance of associating with shape feature analysis reclaims several characters from misrecognized character sets and results in an average of 3.3% improvement of recognition rate from an existing recognition system. In addition to enhance the recognition performance, the extracted stroke information can be further analyzed and classified its own stroke type. Therefore, the combination of extracted strokes from each side provides a means for data base clustering based on radical or subword components. It is one of the best solutions for recognizing high complexity characters such as Chinese characters which are divided into more than 200 different categories and consist more than 13,000 characters.
MGRA: Motion Gesture Recognition via Accelerometer.

PubMed

Hong, Feng; You, Shujuan; Wei, Meiyu; Zhang, Yongtuo; Guo, Zhongwen

2016-04-13

Accelerometers have been widely embedded in most current mobile devices, enabling easy and intuitive operations. This paper proposes a Motion Gesture Recognition system (MGRA) based on accelerometer data only, which is entirely implemented on mobile devices and can provide users with real-time interactions. A robust and unique feature set is enumerated through the time domain, the frequency domain and singular value decomposition analysis using our motion gesture set containing 11,110 traces. The best feature vector for classification is selected, taking both static and mobile scenarios into consideration. MGRA exploits support vector machine as the classifier with the best feature vector. Evaluations confirm that MGRA can accommodate a broad set of gesture variations within each class, including execution time, amplitude and non-gestural movement. Extensive evaluations confirm that MGRA achieves higher accuracy under both static and mobile scenarios and costs less computation time and energy on an LG Nexus 5 than previous methods.
Ordinal feature selection for iris and palmprint recognition.

PubMed

Sun, Zhenan; Wang, Libin; Tan, Tieniu

2014-09-01

Ordinal measures have been demonstrated as an effective feature representation model for iris and palmprint recognition. However, ordinal measures are a general concept of image analysis and numerous variants with different parameter settings, such as location, scale, orientation, and so on, can be derived to construct a huge feature space. This paper proposes a novel optimization formulation for ordinal feature selection with successful applications to both iris and palmprint recognition. The objective function of the proposed feature selection method has two parts, i.e., misclassification error of intra and interclass matching samples and weighted sparsity of ordinal feature descriptors. Therefore, the feature selection aims to achieve an accurate and sparse representation of ordinal measures. And, the optimization subjects to a number of linear inequality constraints, which require that all intra and interclass matching pairs are well separated with a large margin. Ordinal feature selection is formulated as a linear programming (LP) problem so that a solution can be efficiently obtained even on a large-scale feature pool and training database. Extensive experimental results demonstrate that the proposed LP formulation is advantageous over existing feature selection methods, such as mRMR, ReliefF, Boosting, and Lasso for biometric recognition, reporting state-of-the-art accuracy on CASIA and PolyU databases.
Getting the Gist of Events: Recognition of Two-Participant Actions from Brief Displays

PubMed Central

Hafri, Alon; Papafragou, Anna; Trueswell, John C.

2013-01-01

Unlike rapid scene and object recognition from brief displays, little is known about recognition of event categories and event roles from minimal visual information. In three experiments, we displayed naturalistic photographs of a wide range of two-participant event scenes for 37 ms and 73 ms followed by a mask, and found that event categories (the event gist, e.g., ‘kicking’, ‘pushing’, etc.) and event roles (i.e., Agent and Patient) can be recognized rapidly, even with various actor pairs and backgrounds. Norming ratings from a subsequent experiment revealed that certain physical features (e.g., outstretched extremities) that correlate with Agent-hood could have contributed to rapid role recognition. In a final experiment, using identical twin actors, we then varied these features in two sets of stimuli, in which Patients had Agent-like features or not. Subjects recognized the roles of event participants less accurately when Patients possessed Agent-like features, with this difference being eliminated with two-second durations. Thus, given minimal visual input, typical Agent-like physical features are used in role recognition but, with sufficient input from multiple fixations, people categorically determine the relationship between event participants. PMID:22984951
Behavioral model of visual perception and recognition

NASA Astrophysics Data System (ADS)

Rybak, Ilya A.; Golovan, Alexander V.; Gusakova, Valentina I.

1993-09-01

In the processes of visual perception and recognition human eyes actively select essential information by way of successive fixations at the most informative points of the image. A behavioral program defining a scanpath of the image is formed at the stage of learning (object memorizing) and consists of sequential motor actions, which are shifts of attention from one to another point of fixation, and sensory signals expected to arrive in response to each shift of attention. In the modern view of the problem, invariant object recognition is provided by the following: (1) separated processing of `what' (object features) and `where' (spatial features) information at high levels of the visual system; (2) mechanisms of visual attention using `where' information; (3) representation of `what' information in an object-based frame of reference (OFR). However, most recent models of vision based on OFR have demonstrated the ability of invariant recognition of only simple objects like letters or binary objects without background, i.e. objects to which a frame of reference is easily attached. In contrast, we use not OFR, but a feature-based frame of reference (FFR), connected with the basic feature (edge) at the fixation point. This has provided for our model, the ability for invariant representation of complex objects in gray-level images, but demands realization of behavioral aspects of vision described above. The developed model contains a neural network subsystem of low-level vision which extracts a set of primary features (edges) in each fixation, and high- level subsystem consisting of `what' (Sensory Memory) and `where' (Motor Memory) modules. The resolution of primary features extraction decreases with distances from the point of fixation. FFR provides both the invariant representation of object features in Sensor Memory and shifts of attention in Motor Memory. Object recognition consists in successive recall (from Motor Memory) and execution of shifts of attention and successive verification of the expected sets of features (stored in Sensory Memory). The model shows the ability of recognition of complex objects (such as faces) in gray-level images invariant with respect to shift, rotation, and scale.
Ensemble Methods for Classification of Physical Activities from Wrist Accelerometry.

PubMed

Chowdhury, Alok Kumar; Tjondronegoro, Dian; Chandran, Vinod; Trost, Stewart G

2017-09-01

To investigate whether the use of ensemble learning algorithms improve physical activity recognition accuracy compared to the single classifier algorithms, and to compare the classification accuracy achieved by three conventional ensemble machine learning methods (bagging, boosting, random forest) and a custom ensemble model comprising four algorithms commonly used for activity recognition (binary decision tree, k nearest neighbor, support vector machine, and neural network). The study used three independent data sets that included wrist-worn accelerometer data. For each data set, a four-step classification framework consisting of data preprocessing, feature extraction, normalization and feature selection, and classifier training and testing was implemented. For the custom ensemble, decisions from the single classifiers were aggregated using three decision fusion methods: weighted majority vote, naïve Bayes combination, and behavior knowledge space combination. Classifiers were cross-validated using leave-one subject out cross-validation and compared on the basis of average F1 scores. In all three data sets, ensemble learning methods consistently outperformed the individual classifiers. Among the conventional ensemble methods, random forest models provided consistently high activity recognition; however, the custom ensemble model using weighted majority voting demonstrated the highest classification accuracy in two of the three data sets. Combining multiple individual classifiers using conventional or custom ensemble learning methods can improve activity recognition accuracy from wrist-worn accelerometer data.
Sorted Index Numbers for Privacy Preserving Face Recognition

NASA Astrophysics Data System (ADS)

Wang, Yongjin; Hatzinakos, Dimitrios

2009-12-01

This paper presents a novel approach for changeable and privacy preserving face recognition. We first introduce a new method of biometric matching using the sorted index numbers (SINs) of feature vectors. Since it is impossible to recover any of the exact values of the original features, the transformation from original features to the SIN vectors is noninvertible. To address the irrevocable nature of biometric signals whilst obtaining stronger privacy protection, a random projection-based method is employed in conjunction with the SIN approach to generate changeable and privacy preserving biometric templates. The effectiveness of the proposed method is demonstrated on a large generic data set, which contains images from several well-known face databases. Extensive experimentation shows that the proposed solution may improve the recognition accuracy.
Invariant object recognition based on the generalized discrete radon transform

NASA Astrophysics Data System (ADS)

Easley, Glenn R.; Colonna, Flavia

2004-04-01

We introduce a method for classifying objects based on special cases of the generalized discrete Radon transform. We adjust the transform and the corresponding ridgelet transform by means of circular shifting and a singular value decomposition (SVD) to obtain a translation, rotation and scaling invariant set of feature vectors. We then use a back-propagation neural network to classify the input feature vectors. We conclude with experimental results and compare these with other invariant recognition methods.
Reduced isothermal feature set for long wave infrared (LWIR) face recognition

NASA Astrophysics Data System (ADS)

Donoso, Ramiro; San Martín, Cesar; Hermosilla, Gabriel

2017-06-01

In this paper, we introduce a new concept in the thermal face recognition area: isothermal features. This consists of a feature vector built from a thermal signature that depends on the emission of the skin of the person and its temperature. A thermal signature is the appearance of the face to infrared sensors and is unique to each person. The infrared face is decomposed into isothermal regions that present the thermal features of the face. Each isothermal region is modeled as circles within a center representing the pixel of the image, and the feature vector is composed of a maximum radius of the circles at the isothermal region. This feature vector corresponds to the thermal signature of a person. The face recognition process is built using a modification of the Expectation Maximization (EM) algorithm in conjunction with a proposed probabilistic index to the classification process. Results obtained using an infrared database are compared with typical state-of-the-art techniques showing better performance, especially in uncontrolled acquisition conditions scenarios.
A Set of Handwriting Features for Use in Automated Writer Identification.

PubMed

Miller, John J; Patterson, Robert Bradley; Gantz, Donald T; Saunders, Christopher P; Walch, Mark A; Buscaglia, JoAnn

2017-05-01

A writer's biometric identity can be characterized through the distribution of physical feature measurements ("writer's profile"); a graph-based system that facilitates the quantification of these features is described. To accomplish this quantification, handwriting is segmented into basic graphical forms ("graphemes"), which are "skeletonized" to yield the graphical topology of the handwritten segment. The graph-based matching algorithm compares the graphemes first by their graphical topology and then by their geometric features. Graphs derived from known writers can be compared against graphs extracted from unknown writings. The process is computationally intensive and relies heavily upon statistical pattern recognition algorithms. This article focuses on the quantification of these physical features and the construction of the associated pattern recognition methods for using the features to discriminate among writers. The graph-based system described in this article has been implemented in a highly accurate and approximately language-independent biometric recognition system of writers of cursive documents. © 2017 American Academy of Forensic Sciences.
Event Recognition Based on Deep Learning in Chinese Texts

PubMed Central

Zhang, Yajun; Liu, Zongtian; Zhou, Wen

2016-01-01

Event recognition is the most fundamental and critical task in event-based natural language processing systems. Existing event recognition methods based on rules and shallow neural networks have certain limitations. For example, extracting features using methods based on rules is difficult; methods based on shallow neural networks converge too quickly to a local minimum, resulting in low recognition precision. To address these problems, we propose the Chinese emergency event recognition model based on deep learning (CEERM). Firstly, we use a word segmentation system to segment sentences. According to event elements labeled in the CEC 2.0 corpus, we classify words into five categories: trigger words, participants, objects, time and location. Each word is vectorized according to the following six feature layers: part of speech, dependency grammar, length, location, distance between trigger word and core word and trigger word frequency. We obtain deep semantic features of words by training a feature vector set using a deep belief network (DBN), then analyze those features in order to identify trigger words by means of a back propagation neural network. Extensive testing shows that the CEERM achieves excellent recognition performance, with a maximum F-measure value of 85.17%. Moreover, we propose the dynamic-supervised DBN, which adds supervised fine-tuning to a restricted Boltzmann machine layer by monitoring its training performance. Test analysis reveals that the new DBN improves recognition performance and effectively controls the training time. Although the F-measure increases to 88.11%, the training time increases by only 25.35%. PMID:27501231
Event Recognition Based on Deep Learning in Chinese Texts.

PubMed

Zhang, Yajun; Liu, Zongtian; Zhou, Wen

2016-01-01

Event recognition is the most fundamental and critical task in event-based natural language processing systems. Existing event recognition methods based on rules and shallow neural networks have certain limitations. For example, extracting features using methods based on rules is difficult; methods based on shallow neural networks converge too quickly to a local minimum, resulting in low recognition precision. To address these problems, we propose the Chinese emergency event recognition model based on deep learning (CEERM). Firstly, we use a word segmentation system to segment sentences. According to event elements labeled in the CEC 2.0 corpus, we classify words into five categories: trigger words, participants, objects, time and location. Each word is vectorized according to the following six feature layers: part of speech, dependency grammar, length, location, distance between trigger word and core word and trigger word frequency. We obtain deep semantic features of words by training a feature vector set using a deep belief network (DBN), then analyze those features in order to identify trigger words by means of a back propagation neural network. Extensive testing shows that the CEERM achieves excellent recognition performance, with a maximum F-measure value of 85.17%. Moreover, we propose the dynamic-supervised DBN, which adds supervised fine-tuning to a restricted Boltzmann machine layer by monitoring its training performance. Test analysis reveals that the new DBN improves recognition performance and effectively controls the training time. Although the F-measure increases to 88.11%, the training time increases by only 25.35%.
Neural network classification technique and machine vision for bread crumb grain evaluation

NASA Astrophysics Data System (ADS)

Zayas, Inna Y.; Chung, O. K.; Caley, M.

1995-10-01

Bread crumb grain was studied to develop a model for pattern recognition of bread baked at Hard Winter Wheat Quality Laboratory (HWWQL), Grain Marketing and Production Research Center (GMPRC). Images of bread slices were acquired with a scanner in a 512 multiplied by 512 format. Subimages in the central part of the slices were evaluated by several features such as mean, determinant, eigen values, shape of a slice and other crumb features. Derived features were used to describe slices and loaves. Neural network programs of MATLAB package were used for data analysis. Learning vector quantization method and multivariate discriminant analysis were applied to bread slices from what of different sources. A training and test sets of different bread crumb texture classes were obtained. The ranking of subimages was well correlated with visual judgement. The performance of different models on slice recognition rate was studied to choose the best model. The recognition of classes created according to human judgement with image features was low. Recognition of arbitrarily created classes, according to porosity patterns, with several feature patterns was approximately 90%. Correlation coefficient was approximately 0.7 between slice shape features and loaf volume.
Correlation pattern recognition: optimal parameters for quality standards control of chocolate marshmallow candy

NASA Astrophysics Data System (ADS)

Flores, Jorge L.; García-Torales, G.; Ponce Ávila, Cristina

2006-08-01

This paper describes an in situ image recognition system designed to inspect the quality standards of the chocolate pops during their production. The essence of the recognition system is the localization of the events (i.e., defects) in the input images that affect the quality standards of pops. To this end, processing modules, based on correlation filter, and segmentation of images are employed with the objective of measuring the quality standards. Therefore, we designed the correlation filter and defined a set of features from the correlation plane. The desired values for these parameters are obtained by exploiting information about objects to be rejected in order to find the optimal discrimination capability of the system. Regarding this set of features, the pop can be correctly classified. The efficacy of the system has been tested thoroughly under laboratory conditions using at least 50 images, containing 3 different types of possible defects.
Control of adaptive immunity by the innate immune system.

PubMed

Iwasaki, Akiko; Medzhitov, Ruslan

2015-04-01

Microbial infections are recognized by the innate immune system both to elicit immediate defense and to generate long-lasting adaptive immunity. To detect and respond to vastly different groups of pathogens, the innate immune system uses several recognition systems that rely on sensing common structural and functional features associated with different classes of microorganisms. These recognition systems determine microbial location, viability, replication and pathogenicity. Detection of these features by recognition pathways of the innate immune system is translated into different classes of effector responses though specialized populations of dendritic cells. Multiple mechanisms for the induction of immune responses are variations on a common design principle wherein the cells that sense infections produce one set of cytokines to induce lymphocytes to produce another set of cytokines, which in turn activate effector responses. Here we discuss these emerging principles of innate control of adaptive immunity.
Appearance-based human gesture recognition using multimodal features for human computer interaction

NASA Astrophysics Data System (ADS)

Luo, Dan; Gao, Hua; Ekenel, Hazim Kemal; Ohya, Jun

2011-03-01

The use of gesture as a natural interface plays an utmost important role for achieving intelligent Human Computer Interaction (HCI). Human gestures include different components of visual actions such as motion of hands, facial expression, and torso, to convey meaning. So far, in the field of gesture recognition, most previous works have focused on the manual component of gestures. In this paper, we present an appearance-based multimodal gesture recognition framework, which combines the different groups of features such as facial expression features and hand motion features which are extracted from image frames captured by a single web camera. We refer 12 classes of human gestures with facial expression including neutral, negative and positive meanings from American Sign Languages (ASL). We combine the features in two levels by employing two fusion strategies. At the feature level, an early feature combination can be performed by concatenating and weighting different feature groups, and LDA is used to choose the most discriminative elements by projecting the feature on a discriminative expression space. The second strategy is applied on decision level. Weighted decisions from single modalities are fused in a later stage. A condensation-based algorithm is adopted for classification. We collected a data set with three to seven recording sessions and conducted experiments with the combination techniques. Experimental results showed that facial analysis improve hand gesture recognition, decision level fusion performs better than feature level fusion.
Extracting features from protein sequences to improve deep extreme learning machine for protein fold recognition.

PubMed

Ibrahim, Wisam; Abadeh, Mohammad Saniee

2017-05-21

Protein fold recognition is an important problem in bioinformatics to predict three-dimensional structure of a protein. One of the most challenging tasks in protein fold recognition problem is the extraction of efficient features from the amino-acid sequences to obtain better classifiers. In this paper, we have proposed six descriptors to extract features from protein sequences. These descriptors are applied in the first stage of a three-stage framework PCA-DELM-LDA to extract feature vectors from the amino-acid sequences. Principal Component Analysis PCA has been implemented to reduce the number of extracted features. The extracted feature vectors have been used with original features to improve the performance of the Deep Extreme Learning Machine DELM in the second stage. Four new features have been extracted from the second stage and used in the third stage by Linear Discriminant Analysis LDA to classify the instances into 27 folds. The proposed framework is implemented on the independent and combined feature sets in SCOP datasets. The experimental results show that extracted feature vectors in the first stage could improve the performance of DELM in extracting new useful features in second stage. Copyright © 2017 Elsevier Ltd. All rights reserved.
Iris Recognition Using Feature Extraction of Box Counting Fractal Dimension

NASA Astrophysics Data System (ADS)

Khotimah, C.; Juniati, D.

2018-01-01

Biometrics is a science that is now growing rapidly. Iris recognition is a biometric modality which captures a photo of the eye pattern. The markings of the iris are distinctive that it has been proposed to use as a means of identification, instead of fingerprints. Iris recognition was chosen for identification in this research because every human has a special feature that each individual is different and the iris is protected by the cornea so that it will have a fixed shape. This iris recognition consists of three step: pre-processing of data, feature extraction, and feature matching. Hough transformation is used in the process of pre-processing to locate the iris area and Daugman’s rubber sheet model to normalize the iris data set into rectangular blocks. To find the characteristics of the iris, it was used box counting method to get the fractal dimension value of the iris. Tests carried out by used k-fold cross method with k = 5. In each test used 10 different grade K of K-Nearest Neighbor (KNN). The result of iris recognition was obtained with the best accuracy was 92,63 % for K = 3 value on K-Nearest Neighbor (KNN) method.
Automatic feature design for optical character recognition using an evolutionary search procedure.

PubMed

Stentiford, F W

1985-03-01

An automatic evolutionary search is applied to the problem of feature extraction in an OCR application. A performance measure based on feature independence is used to generate features which do not appear to suffer from peaking effects [17]. Features are extracted from a training set of 30 600 machine printed 34 class alphanumeric characters derived from British mail. Classification results on the training set and a test set of 10 200 characters are reported for an increasing number of features. A 1.01 percent forced decision error rate is obtained on the test data using 316 features. The hardware implementation should be cheap and fast to operate. The performance compares favorably with current low cost OCR page readers.

Luminance sticker based facial expression recognition using discrete wavelet transform for physically disabled persons.

PubMed

Nagarajan, R; Hariharan, M; Satiyan, M

2012-08-01

Developing tools to assist physically disabled and immobilized people through facial expression is a challenging area of research and has attracted many researchers recently. In this paper, luminance stickers based facial expression recognition is proposed. Recognition of facial expression is carried out by employing Discrete Wavelet Transform (DWT) as a feature extraction method. Different wavelet families with their different orders (db1 to db20, Coif1 to Coif 5 and Sym2 to Sym8) are utilized to investigate their performance in recognizing facial expression and to evaluate their computational time. Standard deviation is computed for the coefficients of first level of wavelet decomposition for every order of wavelet family. This standard deviation is used to form a set of feature vectors for classification. In this study, conventional validation and cross validation are performed to evaluate the efficiency of the suggested feature vectors. Three different classifiers namely Artificial Neural Network (ANN), k-Nearest Neighborhood (kNN) and Linear Discriminant Analysis (LDA) are used to classify a set of eight facial expressions. The experimental results demonstrate that the proposed method gives very promising classification accuracies.
Interactive object recognition assistance: an approach to recognition starting from target objects

NASA Astrophysics Data System (ADS)

Geisler, Juergen; Littfass, Michael

1999-07-01

Recognition of target objects in remotely sensed imagery required detailed knowledge about the target object domain as well as about mapping properties of the sensing system. The art of object recognition is to combine both worlds appropriately and to provide models of target appearance with respect to sensor characteristics. Common approaches to support interactive object recognition are either driven from the sensor point of view and address the problem of displaying images in a manner adequate to the sensing system. Or they focus on target objects and provide exhaustive encyclopedic information about this domain. Our paper discusses an approach to assist interactive object recognition based on knowledge about target objects and taking into account the significance of object features with respect to characteristics of the sensed imagery, e.g. spatial and spectral resolution. An `interactive recognition assistant' takes the image analyst through the interpretation process by indicating step-by-step the respectively most significant features of objects in an actual set of candidates. The significance of object features is expressed by pregenerated trees of significance, and by the dynamic computation of decision relevance for every feature at each step of the recognition process. In the context of this approach we discuss the question of modeling and storing the multisensorial/multispectral appearances of target objects and object classes as well as the problem of an adequate dynamic human-machine-interface that takes into account various mental models of human image interpretation.
Performance Evaluation of Multimodal Multifeature Authentication System Using KNN Classification.

PubMed

Rajagopal, Gayathri; Palaniswamy, Ramamoorthy

2015-01-01

This research proposes a multimodal multifeature biometric system for human recognition using two traits, that is, palmprint and iris. The purpose of this research is to analyse integration of multimodal and multifeature biometric system using feature level fusion to achieve better performance. The main aim of the proposed system is to increase the recognition accuracy using feature level fusion. The features at the feature level fusion are raw biometric data which contains rich information when compared to decision and matching score level fusion. Hence information fused at the feature level is expected to obtain improved recognition accuracy. However, information fused at feature level has the problem of curse in dimensionality; here PCA (principal component analysis) is used to diminish the dimensionality of the feature sets as they are high dimensional. The proposed multimodal results were compared with other multimodal and monomodal approaches. Out of these comparisons, the multimodal multifeature palmprint iris fusion offers significant improvements in the accuracy of the suggested multimodal biometric system. The proposed algorithm is tested using created virtual multimodal database using UPOL iris database and PolyU palmprint database.
Performance Evaluation of Multimodal Multifeature Authentication System Using KNN Classification

PubMed Central

Rajagopal, Gayathri; Palaniswamy, Ramamoorthy

2015-01-01

This research proposes a multimodal multifeature biometric system for human recognition using two traits, that is, palmprint and iris. The purpose of this research is to analyse integration of multimodal and multifeature biometric system using feature level fusion to achieve better performance. The main aim of the proposed system is to increase the recognition accuracy using feature level fusion. The features at the feature level fusion are raw biometric data which contains rich information when compared to decision and matching score level fusion. Hence information fused at the feature level is expected to obtain improved recognition accuracy. However, information fused at feature level has the problem of curse in dimensionality; here PCA (principal component analysis) is used to diminish the dimensionality of the feature sets as they are high dimensional. The proposed multimodal results were compared with other multimodal and monomodal approaches. Out of these comparisons, the multimodal multifeature palmprint iris fusion offers significant improvements in the accuracy of the suggested multimodal biometric system. The proposed algorithm is tested using created virtual multimodal database using UPOL iris database and PolyU palmprint database. PMID:26640813
Comparing supervised learning techniques on the task of physical activity recognition.

PubMed

Dalton, A; OLaighin, G

2013-01-01

The objective of this study was to compare the performance of base-level and meta-level classifiers on the task of physical activity recognition. Five wireless kinematic sensors were attached to each subject (n = 25) while they completed a range of basic physical activities in a controlled laboratory setting. Subjects were then asked to carry out similar self-annotated physical activities in a random order and in an unsupervised environment. A combination of time-domain and frequency-domain features were extracted from the sensor data including the first four central moments, zero-crossing rate, average magnitude, sensor cross-correlation, sensor auto-correlation, spectral entropy and dominant frequency components. A reduced feature set was generated using a wrapper subset evaluation technique with a linear forward search and this feature set was employed for classifier comparison. The meta-level classifier AdaBoostM1 with C4.5 Graft as its base-level classifier achieved an overall accuracy of 95%. Equal sized datasets of subject independent data and subject dependent data were used to train this classifier and high recognition rates could be achieved without the need for user specific training. Furthermore, it was found that an accuracy of 88% could be achieved using data from the ankle and wrist sensors only.
Neural Network and Letter Recognition.

NASA Astrophysics Data System (ADS)

Lee, Hue Yeon

Neural net architectures and learning algorithms that recognize hand written 36 alphanumeric characters are studied. The thin line input patterns written in 32 x 32 binary array are used. The system is comprised of two major components, viz. a preprocessing unit and a Recognition unit. The preprocessing unit in turn consists of three layers of neurons; the U-layer, the V-layer, and the C -layer. The functions of the U-layer is to extract local features by template matching. The correlation between the detected local features are considered. Through correlating neurons in a plane with their neighboring neurons, the V-layer would thicken the on-cells or lines that are groups of on-cells of the previous layer. These two correlations would yield some deformation tolerance and some of the rotational tolerance of the system. The C-layer then compresses data through the 'Gabor' transform. Pattern dependent choice of center and wavelengths of 'Gabor' filters is the cause of shift and scale tolerance of the system. Three different learning schemes had been investigated in the recognition unit, namely; the error back propagation learning with hidden units, a simple perceptron learning, and a competitive learning. Their performances were analyzed and compared. Since sometimes the network fails to distinguish between two letters that are inherently similar, additional ambiguity resolving neural nets are introduced on top of the above main neural net. The two dimensional Fourier transform is used as the preprocessing and the perceptron is used as the recognition unit of the ambiguity resolver. One hundred different person's handwriting sets are collected. Some of these are used as the training sets and the remainders are used as the test sets. The correct recognition rate of the system increases with the number of training sets and eventually saturates at a certain value. Similar recognition rates are obtained for the above three different learning algorithms. The minimum error rate, 4.9% is achieved for alphanumeric sets when 50 sets are trained. With the ambiguity resolver, it is reduced to 2.5%. In case that only numeral sets are trained and tested, 2.0% error rate is achieved. When only alphabet sets are considered, the error rate is reduced to 1.1%.
Three-dimensional object recognition using similar triangles and decision trees

NASA Technical Reports Server (NTRS)

Spirkovska, Lilly

1993-01-01

A system, TRIDEC, that is capable of distinguishing between a set of objects despite changes in the objects' positions in the input field, their size, or their rotational orientation in 3D space is described. TRIDEC combines very simple yet effective features with the classification capabilities of inductive decision tree methods. The feature vector is a list of all similar triangles defined by connecting all combinations of three pixels in a coarse coded 127 x 127 pixel input field. The classification is accomplished by building a decision tree using the information provided from a limited number of translated, scaled, and rotated samples. Simulation results are presented which show that TRIDEC achieves 94 percent recognition accuracy in the 2D invariant object recognition domain and 98 percent recognition accuracy in the 3D invariant object recognition domain after training on only a small sample of transformed views of the objects.
Integrated system for automated financial document processing

NASA Astrophysics Data System (ADS)

Hassanein, Khaled S.; Wesolkowski, Slawo; Higgins, Ray; Crabtree, Ralph; Peng, Antai

1997-02-01

A system was developed that integrates intelligent document analysis with multiple character/numeral recognition engines in order to achieve high accuracy automated financial document processing. In this system, images are accepted in both their grayscale and binary formats. A document analysis module starts by extracting essential features from the document to help identify its type (e.g. personal check, business check, etc.). These features are also utilized to conduct a full analysis of the image to determine the location of interesting zones such as the courtesy amount and the legal amount. These fields are then made available to several recognition knowledge sources such as courtesy amount recognition engines and legal amount recognition engines through a blackboard architecture. This architecture allows all the available knowledge sources to contribute incrementally and opportunistically to the solution of the given recognition query. Performance results on a test set of machine printed business checks using the integrated system are also reported.
Evaluation of Feature Extraction and Recognition for Activity Monitoring and Fall Detection Based on Wearable sEMG Sensors.

PubMed

Xi, Xugang; Tang, Minyan; Miran, Seyed M; Luo, Zhizeng

2017-05-27

As an essential subfield of context awareness, activity awareness, especially daily activity monitoring and fall detection, plays a significant role for elderly or frail people who need assistance in their daily activities. This study investigates the feature extraction and pattern recognition of surface electromyography (sEMG), with the purpose of determining the best features and classifiers of sEMG for daily living activities monitoring and fall detection. This is done by a serial of experiments. In the experiments, four channels of sEMG signal from wireless, wearable sensors located on lower limbs are recorded from three subjects while they perform seven activities of daily living (ADL). A simulated trip fall scenario is also considered with a custom-made device attached to the ankle. With this experimental setting, 15 feature extraction methods of sEMG, including time, frequency, time/frequency domain and entropy, are analyzed based on class separability and calculation complexity, and five classification methods, each with 15 features, are estimated with respect to the accuracy rate of recognition and calculation complexity for activity monitoring and fall detection. It is shown that a high accuracy rate of recognition and a minimal calculation time for daily activity monitoring and fall detection can be achieved in the current experimental setting. Specifically, the Wilson Amplitude (WAMP) feature performs the best, and the classifier Gaussian Kernel Support Vector Machine (GK-SVM) with Permutation Entropy (PE) or WAMP results in the highest accuracy for activity monitoring with recognition rates of 97.35% and 96.43%. For fall detection, the classifier Fuzzy Min-Max Neural Network (FMMNN) has the best sensitivity and specificity at the cost of the longest calculation time, while the classifier Gaussian Kernel Fisher Linear Discriminant Analysis (GK-FDA) with the feature WAMP guarantees a high sensitivity (98.70%) and specificity (98.59%) with a short calculation time (65.586 ms), making it a possible choice for pre-impact fall detection. The thorough quantitative comparison of the features and classifiers in this study supports the feasibility of a wireless, wearable sEMG sensor system for automatic activity monitoring and fall detection.
Evaluation of Feature Extraction and Recognition for Activity Monitoring and Fall Detection Based on Wearable sEMG Sensors

PubMed Central

Xi, Xugang; Tang, Minyan; Miran, Seyed M.; Luo, Zhizeng

2017-01-01

As an essential subfield of context awareness, activity awareness, especially daily activity monitoring and fall detection, plays a significant role for elderly or frail people who need assistance in their daily activities. This study investigates the feature extraction and pattern recognition of surface electromyography (sEMG), with the purpose of determining the best features and classifiers of sEMG for daily living activities monitoring and fall detection. This is done by a serial of experiments. In the experiments, four channels of sEMG signal from wireless, wearable sensors located on lower limbs are recorded from three subjects while they perform seven activities of daily living (ADL). A simulated trip fall scenario is also considered with a custom-made device attached to the ankle. With this experimental setting, 15 feature extraction methods of sEMG, including time, frequency, time/frequency domain and entropy, are analyzed based on class separability and calculation complexity, and five classification methods, each with 15 features, are estimated with respect to the accuracy rate of recognition and calculation complexity for activity monitoring and fall detection. It is shown that a high accuracy rate of recognition and a minimal calculation time for daily activity monitoring and fall detection can be achieved in the current experimental setting. Specifically, the Wilson Amplitude (WAMP) feature performs the best, and the classifier Gaussian Kernel Support Vector Machine (GK-SVM) with Permutation Entropy (PE) or WAMP results in the highest accuracy for activity monitoring with recognition rates of 97.35% and 96.43%. For fall detection, the classifier Fuzzy Min-Max Neural Network (FMMNN) has the best sensitivity and specificity at the cost of the longest calculation time, while the classifier Gaussian Kernel Fisher Linear Discriminant Analysis (GK-FDA) with the feature WAMP guarantees a high sensitivity (98.70%) and specificity (98.59%) with a short calculation time (65.586 ms), making it a possible choice for pre-impact fall detection. The thorough quantitative comparison of the features and classifiers in this study supports the feasibility of a wireless, wearable sEMG sensor system for automatic activity monitoring and fall detection. PMID:28555016
SD-MSAEs: Promoter recognition in human genome based on deep feature extraction.

PubMed

Xu, Wenxuan; Zhang, Li; Lu, Yaping

2016-06-01

The prediction and recognition of promoter in human genome play an important role in DNA sequence analysis. Entropy, in Shannon sense, of information theory is a multiple utility in bioinformatic details analysis. The relative entropy estimator methods based on statistical divergence (SD) are used to extract meaningful features to distinguish different regions of DNA sequences. In this paper, we choose context feature and use a set of methods of SD to select the most effective n-mers distinguishing promoter regions from other DNA regions in human genome. Extracted from the total possible combinations of n-mers, we can get four sparse distributions based on promoter and non-promoters training samples. The informative n-mers are selected by optimizing the differentiating extents of these distributions. Specially, we combine the advantage of statistical divergence and multiple sparse auto-encoders (MSAEs) in deep learning to extract deep feature for promoter recognition. And then we apply multiple SVMs and a decision model to construct a human promoter recognition method called SD-MSAEs. Framework is flexible that it can integrate new feature extraction or new classification models freely. Experimental results show that our method has high sensitivity and specificity. Copyright © 2016 Elsevier Inc. All rights reserved.
Cross-View Action Recognition via Transferable Dictionary Learning.

PubMed

Zheng, Jingjing; Jiang, Zhuolin; Chellappa, Rama

2016-05-01

Discriminative appearance features are effective for recognizing actions in a fixed view, but may not generalize well to a new view. In this paper, we present two effective approaches to learn dictionaries for robust action recognition across views. In the first approach, we learn a set of view-specific dictionaries where each dictionary corresponds to one camera view. These dictionaries are learned simultaneously from the sets of correspondence videos taken at different views with the aim of encouraging each video in the set to have the same sparse representation. In the second approach, we additionally learn a common dictionary shared by different views to model view-shared features. This approach represents the videos in each view using a view-specific dictionary and the common dictionary. More importantly, it encourages the set of videos taken from the different views of the same action to have the similar sparse representations. The learned common dictionary not only has the capability to represent actions from unseen views, but also makes our approach effective in a semi-supervised setting where no correspondence videos exist and only a few labeled videos exist in the target view. The extensive experiments using three public datasets demonstrate that the proposed approach outperforms recently developed approaches for cross-view action recognition.
Statistical-techniques-based computer-aided diagnosis (CAD) using texture feature analysis: application in computed tomography (CT) imaging to fatty liver disease

NASA Astrophysics Data System (ADS)

Chung, Woon-Kwan; Park, Hyong-Hu; Im, In-Chul; Lee, Jae-Seung; Goo, Eun-Hoe; Dong, Kyung-Rae

2012-09-01

This paper proposes a computer-aided diagnosis (CAD) system based on texture feature analysis and statistical wavelet transformation technology to diagnose fatty liver disease with computed tomography (CT) imaging. In the target image, a wavelet transformation was performed for each lesion area to set the region of analysis (ROA, window size: 50 × 50 pixels) and define the texture feature of a pixel. Based on the extracted texture feature values, six parameters (average gray level, average contrast, relative smoothness, skewness, uniformity, and entropy) were determined to calculate the recognition rate for a fatty liver. In addition, a multivariate analysis of the variance (MANOVA) method was used to perform a discriminant analysis to verify the significance of the extracted texture feature values and the recognition rate for a fatty liver. According to the results, each texture feature value was significant for a comparison of the recognition rate for a fatty liver ( p < 0.05). Furthermore, the F-value, which was used as a scale for the difference in recognition rates, was highest in the average gray level, relatively high in the skewness and the entropy, and relatively low in the uniformity, the relative smoothness and the average contrast. The recognition rate for a fatty liver had the same scale as that for the F-value, showing 100% (average gray level) at the maximum and 80% (average contrast) at the minimum. Therefore, the recognition rate is believed to be a useful clinical value for the automatic detection and computer-aided diagnosis (CAD) using the texture feature value. Nevertheless, further study on various diseases and singular diseases will be needed in the future.
Image Classification Using Biomimetic Pattern Recognition with Convolutional Neural Networks Features

PubMed Central

Huo, Guanying

2017-01-01

As a typical deep-learning model, Convolutional Neural Networks (CNNs) can be exploited to automatically extract features from images using the hierarchical structure inspired by mammalian visual system. For image classification tasks, traditional CNN models employ the softmax function for classification. However, owing to the limited capacity of the softmax function, there are some shortcomings of traditional CNN models in image classification. To deal with this problem, a new method combining Biomimetic Pattern Recognition (BPR) with CNNs is proposed for image classification. BPR performs class recognition by a union of geometrical cover sets in a high-dimensional feature space and therefore can overcome some disadvantages of traditional pattern recognition. The proposed method is evaluated on three famous image classification benchmarks, that is, MNIST, AR, and CIFAR-10. The classification accuracies of the proposed method for the three datasets are 99.01%, 98.40%, and 87.11%, respectively, which are much higher in comparison with the other four methods in most cases. PMID:28316614
Artificial intelligence tools for pattern recognition

NASA Astrophysics Data System (ADS)

Acevedo, Elena; Acevedo, Antonio; Felipe, Federico; Avilés, Pedro

2017-06-01

In this work, we present a system for pattern recognition that combines the power of genetic algorithms for solving problems and the efficiency of the morphological associative memories. We use a set of 48 tire prints divided into 8 brands of tires. The images have dimensions of 200 x 200 pixels. We applied Hough transform to obtain lines as main features. The number of lines obtained is 449. The genetic algorithm reduces the number of features to ten suitable lines that give thus the 100% of recognition. Morphological associative memories were used as evaluation function. The selection algorithms were Tournament and Roulette wheel. For reproduction, we applied one-point, two-point and uniform crossover.
Membership-degree preserving discriminant analysis with applications to face recognition.

PubMed

Yang, Zhangjing; Liu, Chuancai; Huang, Pu; Qian, Jianjun

2013-01-01

In pattern recognition, feature extraction techniques have been widely employed to reduce the dimensionality of high-dimensional data. In this paper, we propose a novel feature extraction algorithm called membership-degree preserving discriminant analysis (MPDA) based on the fisher criterion and fuzzy set theory for face recognition. In the proposed algorithm, the membership degree of each sample to particular classes is firstly calculated by the fuzzy k-nearest neighbor (FKNN) algorithm to characterize the similarity between each sample and class centers, and then the membership degree is incorporated into the definition of the between-class scatter and the within-class scatter. The feature extraction criterion via maximizing the ratio of the between-class scatter to the within-class scatter is applied. Experimental results on the ORL, Yale, and FERET face databases demonstrate the effectiveness of the proposed algorithm.
Ordinal measures for iris recognition.

PubMed

Sun, Zhenan; Tan, Tieniu

2009-12-01

Images of a human iris contain rich texture information useful for identity authentication. A key and still open issue in iris recognition is how best to represent such textural information using a compact set of features (iris features). In this paper, we propose using ordinal measures for iris feature representation with the objective of characterizing qualitative relationships between iris regions rather than precise measurements of iris image structures. Such a representation may lose some image-specific information, but it achieves a good trade-off between distinctiveness and robustness. We show that ordinal measures are intrinsic features of iris patterns and largely invariant to illumination changes. Moreover, compactness and low computational complexity of ordinal measures enable highly efficient iris recognition. Ordinal measures are a general concept useful for image analysis and many variants can be derived for ordinal feature extraction. In this paper, we develop multilobe differential filters to compute ordinal measures with flexible intralobe and interlobe parameters such as location, scale, orientation, and distance. Experimental results on three public iris image databases demonstrate the effectiveness of the proposed ordinal feature models.
Fast and efficient indexing approach for object recognition

NASA Astrophysics Data System (ADS)

Hefnawy, Alaa; Mashali, Samia A.; Rashwan, Mohsen; Fikri, Magdi

1999-08-01

This paper introduces a fast and efficient indexing approach for both 2D and 3D model-based object recognition in the presence of rotation, translation, and scale variations of objects. The indexing entries are computed after preprocessing the data by Haar wavelet decomposition. The scheme is based on a unified image feature detection approach based on Zernike moments. A set of low level features, e.g. high precision edges, gray level corners, are estimated by a set of orthogonal Zernike moments, calculated locally around every image point. A high dimensional, highly descriptive indexing entries are then calculated based on the correlation of these local features and employed for fast access to the model database to generate hypotheses. A list of the most candidate models is then presented by evaluating the hypotheses. Experimental results are included to demonstrate the effectiveness of the proposed indexing approach.
Function Follows Form: Activation of Shape and Function Features during Object Identification

ERIC Educational Resources Information Center

Yee, Eiling; Huffstetler, Stacy; Thompson-Schill, Sharon L.

2011-01-01

Most theories of semantic memory characterize knowledge of a given object as comprising a set of semantic features. But how does conceptual activation of these features proceed during object identification? We present the results of a pair of experiments that demonstrate that object recognition is a dynamically unfolding process in which function…
Development of Collaborative Research Initiatives to Advance the Aerospace Sciences-via the Communications, Electronics, Information Systems Focus Group

NASA Technical Reports Server (NTRS)

Knasel, T. Michael

1996-01-01

The primary goal of the Adaptive Vision Laboratory Research project was to develop advanced computer vision systems for automatic target recognition. The approach used in this effort combined several machine learning paradigms including evolutionary learning algorithms, neural networks, and adaptive clustering techniques to develop the E-MOR.PH system. This system is capable of generating pattern recognition systems to solve a wide variety of complex recognition tasks. A series of simulation experiments were conducted using E-MORPH to solve problems in OCR, military target recognition, industrial inspection, and medical image analysis. The bulk of the funds provided through this grant were used to purchase computer hardware and software to support these computationally intensive simulations. The payoff from this effort is the reduced need for human involvement in the design and implementation of recognition systems. We have shown that the techniques used in E-MORPH are generic and readily transition to other problem domains. Specifically, E-MORPH is multi-phase evolutionary leaming system that evolves cooperative sets of features detectors and combines their response using an adaptive classifier to form a complete pattern recognition system. The system can operate on binary or grayscale images. In our most recent experiments, we used multi-resolution images that are formed by applying a Gabor wavelet transform to a set of grayscale input images. To begin the leaming process, candidate chips are extracted from the multi-resolution images to form a training set and a test set. A population of detector sets is randomly initialized to start the evolutionary process. Using a combination of evolutionary programming and genetic algorithms, the feature detectors are enhanced to solve a recognition problem. The design of E-MORPH and recognition results for a complex problem in medical image analysis are described at the end of this report. The specific task involves the identification of vertebrae in x-ray images of human spinal columns. This problem is extremely challenging because the individual vertebra exhibit variation in shape, scale, orientation, and contrast. E-MORPH generated several accurate recognition systems to solve this task. This dual use of this ATR technology clearly demonstrates the flexibility and power of our approach.

Analysis Of The IJCNN 2011 UTL Challenge

DTIC Science & Technology

2012-01-13

large datasets from various application domains: handwriting recognition, image recognition, video processing, text processing, and ecology. The goal...validation and final evaluation sets consist of 4096 examples each. Dataset Domain Features Sparsity Devel. Transf. AVICENNA Handwriting 120 0% 150205...documents [3]. Transfer learning methods could accelerate the application of handwriting recognizers to historical manuscript by reducing the need for
Separable spectro-temporal Gabor filter bank features: Reducing the complexity of robust features for automatic speech recognition.

PubMed

Schädler, Marc René; Kollmeier, Birger

2015-04-01

To test if simultaneous spectral and temporal processing is required to extract robust features for automatic speech recognition (ASR), the robust spectro-temporal two-dimensional-Gabor filter bank (GBFB) front-end from Schädler, Meyer, and Kollmeier [J. Acoust. Soc. Am. 131, 4134-4151 (2012)] was de-composed into a spectral one-dimensional-Gabor filter bank and a temporal one-dimensional-Gabor filter bank. A feature set that is extracted with these separate spectral and temporal modulation filter banks was introduced, the separate Gabor filter bank (SGBFB) features, and evaluated on the CHiME (Computational Hearing in Multisource Environments) keywords-in-noise recognition task. From the perspective of robust ASR, the results showed that spectral and temporal processing can be performed independently and are not required to interact with each other. Using SGBFB features permitted the signal-to-noise ratio (SNR) to be lowered by 1.2 dB while still performing as well as the GBFB-based reference system, which corresponds to a relative improvement of the word error rate by 12.8%. Additionally, the real time factor of the spectro-temporal processing could be reduced by more than an order of magnitude. Compared to human listeners, the SNR needed to be 13 dB higher when using Mel-frequency cepstral coefficient features, 11 dB higher when using GBFB features, and 9 dB higher when using SGBFB features to achieve the same recognition performance.
Learning Spatio-Temporal Representations for Action Recognition: A Genetic Programming Approach.

PubMed

Liu, Li; Shao, Ling; Li, Xuelong; Lu, Ke

2016-01-01

Extracting discriminative and robust features from video sequences is the first and most critical step in human action recognition. In this paper, instead of using handcrafted features, we automatically learn spatio-temporal motion features for action recognition. This is achieved via an evolutionary method, i.e., genetic programming (GP), which evolves the motion feature descriptor on a population of primitive 3D operators (e.g., 3D-Gabor and wavelet). In this way, the scale and shift invariant features can be effectively extracted from both color and optical flow sequences. We intend to learn data adaptive descriptors for different datasets with multiple layers, which makes fully use of the knowledge to mimic the physical structure of the human visual cortex for action recognition and simultaneously reduce the GP searching space to effectively accelerate the convergence of optimal solutions. In our evolutionary architecture, the average cross-validation classification error, which is calculated by an support-vector-machine classifier on the training set, is adopted as the evaluation criterion for the GP fitness function. After the entire evolution procedure finishes, the best-so-far solution selected by GP is regarded as the (near-)optimal action descriptor obtained. The GP-evolving feature extraction method is evaluated on four popular action datasets, namely KTH, HMDB51, UCF YouTube, and Hollywood2. Experimental results show that our method significantly outperforms other types of features, either hand-designed or machine-learned.
Efficient feature subset selection with probabilistic distance criteria. [pattern recognition

NASA Technical Reports Server (NTRS)

Chittineni, C. B.

1979-01-01

Recursive expressions are derived for efficiently computing the commonly used probabilistic distance measures as a change in the criteria both when a feature is added to and when a feature is deleted from the current feature subset. A combinatorial algorithm for generating all possible r feature combinations from a given set of s features in (s/r) steps with a change of a single feature at each step is presented. These expressions can also be used for both forward and backward sequential feature selection.
Military personnel recognition system using texture, colour, and SURF features

NASA Astrophysics Data System (ADS)

Irhebhude, Martins E.; Edirisinghe, Eran A.

2014-06-01

This paper presents an automatic, machine vision based, military personnel identification and classification system. Classification is done using a Support Vector Machine (SVM) on sets of Army, Air Force and Navy camouflage uniform personnel datasets. In the proposed system, the arm of service of personnel is recognised by the camouflage of a persons uniform, type of cap and the type of badge/logo. The detailed analysis done include; camouflage cap and plain cap differentiation using gray level co-occurrence matrix (GLCM) texture feature; classification on Army, Air Force and Navy camouflaged uniforms using GLCM texture and colour histogram bin features; plain cap badge classification into Army, Air Force and Navy using Speed Up Robust Feature (SURF). The proposed method recognised camouflage personnel arm of service on sets of data retrieved from google images and selected military websites. Correlation-based Feature Selection (CFS) was used to improve recognition and reduce dimensionality, thereby speeding the classification process. With this method success rates recorded during the analysis include 93.8% for camouflage appearance category, 100%, 90% and 100% rates of plain cap and camouflage cap categories for Army, Air Force and Navy categories, respectively. Accurate recognition was recorded using SURF for the plain cap badge category. Substantial analysis has been carried out and results prove that the proposed method can correctly classify military personnel into various arms of service. We show that the proposed method can be integrated into a face recognition system, which will recognise personnel in addition to determining the arm of service which the personnel belong. Such a system can be used to enhance the security of a military base or facility.
A Novel Multilayer Correlation Maximization Model for Improving CCA-Based Frequency Recognition in SSVEP Brain-Computer Interface.

PubMed

Jiao, Yong; Zhang, Yu; Wang, Yu; Wang, Bei; Jin, Jing; Wang, Xingyu

2018-05-01

Multiset canonical correlation analysis (MsetCCA) has been successfully applied to optimize the reference signals by extracting common features from multiple sets of electroencephalogram (EEG) for steady-state visual evoked potential (SSVEP) recognition in brain-computer interface application. To avoid extracting the possible noise components as common features, this study proposes a sophisticated extension of MsetCCA, called multilayer correlation maximization (MCM) model for further improving SSVEP recognition accuracy. MCM combines advantages of both CCA and MsetCCA by carrying out three layers of correlation maximization processes. The first layer is to extract the stimulus frequency-related information in using CCA between EEG samples and sine-cosine reference signals. The second layer is to learn reference signals by extracting the common features with MsetCCA. The third layer is to re-optimize the reference signals set in using CCA with sine-cosine reference signals again. Experimental study is implemented to validate effectiveness of the proposed MCM model in comparison with the standard CCA and MsetCCA algorithms. Superior performance of MCM demonstrates its promising potential for the development of an improved SSVEP-based brain-computer interface.
Statistical process control using optimized neural networks: a case study.

PubMed

Addeh, Jalil; Ebrahimzadeh, Ata; Azarbad, Milad; Ranaee, Vahid

2014-09-01

The most common statistical process control (SPC) tools employed for monitoring process changes are control charts. A control chart demonstrates that the process has altered by generating an out-of-control signal. This study investigates the design of an accurate system for the control chart patterns (CCPs) recognition in two aspects. First, an efficient system is introduced that includes two main modules: feature extraction module and classifier module. In the feature extraction module, a proper set of shape features and statistical feature are proposed as the efficient characteristics of the patterns. In the classifier module, several neural networks, such as multilayer perceptron, probabilistic neural network and radial basis function are investigated. Based on an experimental study, the best classifier is chosen in order to recognize the CCPs. Second, a hybrid heuristic recognition system is introduced based on cuckoo optimization algorithm (COA) algorithm to improve the generalization performance of the classifier. The simulation results show that the proposed algorithm has high recognition accuracy. Copyright © 2013 ISA. Published by Elsevier Ltd. All rights reserved.
Face recognition with the Karhunen-Loeve transform

NASA Astrophysics Data System (ADS)

Suarez, Pedro F.

1991-12-01

The major goal of this research was to investigate machine recognition of faces. The approach taken to achieve this goal was to investigate the use of Karhunen-Loe've Transform (KLT) by implementing flexible and practical code. The KLT utilizes the eigenvectors of the covariance matrix as a basis set. Faces were projected onto the eigenvectors, called eigenfaces, and the resulting projection coefficients were used as features. Face recognition accuracies for the KLT coefficients were superior to Fourier based techniques. Additionally, this thesis demonstrated the image compression and reconstruction capabilities of the KLT. This theses also developed the use of the KLT as a facial feature detector. The ability to differentiate between facial features provides a computer communications interface for non-vocal people with cerebral palsy. Lastly, this thesis developed a KLT based axis system for laser scanner data of human heads. The scanner data axis system provides the anthropometric community a more precise method of fitting custom helmets.
Fuel spill identification using solid-phase extraction and solid-phase microextraction. 1. Aviation turbine fuels.

PubMed

Lavine, B K; Brzozowski, D M; Ritter, J; Moores, A J; Mayfield, H T

2001-12-01

The water-soluble fraction of aviation jet fuels is examined using solid-phase extraction and solid-phase microextraction. Gas chromatographic profiles of solid-phase extracts and solid-phase microextracts of the water-soluble fraction of kerosene- and nonkerosene-based jet fuels reveal that each jet fuel possesses a unique profile. Pattern recognition analysis reveals fingerprint patterns within the data characteristic of fuel type. By using a novel genetic algorithm (GA) that emulates human pattern recognition through machine learning, it is possible to identify features characteristic of the chromatographic profile of each fuel class. The pattern recognition GA identifies a set of features that optimize the separation of the fuel classes in a plot of the two largest principal components of the data. Because principal components maximize variance, the bulk of the information encoded by the selected features is primarily about the differences between the fuel classes.
Recognition of face and non-face stimuli in autistic spectrum disorder.

PubMed

Arkush, Leo; Smith-Collins, Adam P R; Fiorentini, Chiara; Skuse, David H

2013-12-01

The ability to remember faces is critical for the development of social competence. From childhood to adulthood, we acquire a high level of expertise in the recognition of facial images, and neural processes become dedicated to sustaining competence. Many people with autism spectrum disorder (ASD) have poor face recognition memory; changes in hairstyle or other non-facial features in an otherwise familiar person affect their recollection skills. The observation implies that they may not use the configuration of the inner face to achieve memory competence, but bolster performance in other ways. We aimed to test this hypothesis by comparing the performance of a group of high-functioning unmedicated adolescents with ASD and a matched control group on a "surprise" face recognition memory task. We compared their memory for unfamiliar faces with their memory for images of houses. To evaluate the role that is played by peripheral cues in assisting recognition memory, we cropped both sets of pictures, retaining only the most salient central features. ASD adolescents had poorer recognition memory for faces than typical controls, but their recognition memory for houses was unimpaired. Cropping images of faces did not disproportionately influence their recall accuracy, relative to controls. House recognition skills (cropped and uncropped) were similar in both groups. In the ASD group only, performance on both sets of task was closely correlated, implying that memory for faces and other complex pictorial stimuli is achieved by domain-general (non-dedicated) cognitive mechanisms. Adolescents with ASD apparently do not use domain-specialized processing of inner facial cues to support face recognition memory. © 2013 International Society for Autism Research, Wiley Periodicals, Inc.
Towards a smart glove: arousal recognition based on textile Electrodermal Response.

PubMed

Valenza, Gaetano; Lanata, Antonio; Scilingo, Enzo Pasquale; De Rossi, Danilo

2010-01-01

This paper investigates the possibility of using Electrodermal Response, acquired by a sensing fabric glove with embedded textile electrodes, as reliable means for emotion recognition. Here, all the essential steps for an automatic recognition system are described, from the recording of physiological data set to a feature-based multiclass classification. Data were collected from 35 healthy volunteers during arousal elicitation by means of International Affective Picture System (IAPS) pictures. Experimental results show high discrimination after twenty steps of cross validation.
Recognition of emotional facial expressions and broad autism phenotype in parents of children diagnosed with autistic spectrum disorder.

PubMed

Kadak, Muhammed Tayyib; Demirel, Omer Faruk; Yavuz, Mesut; Demir, Türkay

2014-07-01

Research findings debate about features of broad autism phenotype. In this study, we tested whether parents of children with autism have problems recognizing emotional facial expression and the contribution of such an impairment to the broad phenotype of autism. Seventy-two parents of children with autistic spectrum disorder and 38 parents of control group participated in the study. Broad autism features was measured with Autism Quotient (AQ). Recognition of Emotional Face Expression Test was assessed with the Emotion Recognition Test, consisting a set of photographs from Ekman & Friesen's. In a two-tailed analysis of variance of AQ, there was a significant difference for social skills (F(1, 106)=6.095; p<.05). Analyses of variance revealed significant difference in the recognition of happy, surprised and neutral expressions (F(1, 106)=4.068, p=.046; F(1, 106)=4.068, p=.046; F(1, 106)=6.064, p=.016). According to our findings, social impairment could be considered a characteristic feature of BAP. ASD parents had difficulty recognizing neutral expressions, suggesting that ASD parents may have impaired recognition of ambiguous expressions as do autistic children. Copyright © 2014 Elsevier Inc. All rights reserved.
Robust Optical Recognition of Cursive Pashto Script Using Scale, Rotation and Location Invariant Approach

PubMed Central

Ahmad, Riaz; Naz, Saeeda; Afzal, Muhammad Zeshan; Amin, Sayed Hassan; Breuel, Thomas

2015-01-01

The presence of a large number of unique shapes called ligatures in cursive languages, along with variations due to scaling, orientation and location provides one of the most challenging pattern recognition problems. Recognition of the large number of ligatures is often a complicated task in oriental languages such as Pashto, Urdu, Persian and Arabic. Research on cursive script recognition often ignores the fact that scaling, orientation, location and font variations are common in printed cursive text. Therefore, these variations are not included in image databases and in experimental evaluations. This research uncovers challenges faced by Arabic cursive script recognition in a holistic framework by considering Pashto as a test case, because Pashto language has larger alphabet set than Arabic, Persian and Urdu. A database containing 8000 images of 1000 unique ligatures having scaling, orientation and location variations is introduced. In this article, a feature space based on scale invariant feature transform (SIFT) along with a segmentation framework has been proposed for overcoming the above mentioned challenges. The experimental results show a significantly improved performance of proposed scheme over traditional feature extraction techniques such as principal component analysis (PCA). PMID:26368566
Recognition of complex human behaviours using 3D imaging for intelligent surveillance applications

NASA Astrophysics Data System (ADS)

Yao, Bo; Lepley, Jason J.; Peall, Robert; Butler, Michael; Hagras, Hani

2016-10-01

We introduce a system that exploits 3-D imaging technology as an enabler for the robust recognition of the human form. We combine this with pose and feature recognition capabilities from which we can recognise high-level human behaviours. We propose a hierarchical methodology for the recognition of complex human behaviours, based on the identification of a set of atomic behaviours, individual and sequential poses (e.g. standing, sitting, walking, drinking and eating) that provides a framework from which we adopt time-based machine learning techniques to recognise complex behaviour patterns.
Fuzzy Logic-Based Audio Pattern Recognition

NASA Astrophysics Data System (ADS)

Malcangi, M.

2008-11-01

Audio and audio-pattern recognition is becoming one of the most important technologies to automatically control embedded systems. Fuzzy logic may be the most important enabling methodology due to its ability to rapidly and economically model such application. An audio and audio-pattern recognition engine based on fuzzy logic has been developed for use in very low-cost and deeply embedded systems to automate human-to-machine and machine-to-machine interaction. This engine consists of simple digital signal-processing algorithms for feature extraction and normalization, and a set of pattern-recognition rules manually tuned or automatically tuned by a self-learning process.
Facial Asymmetry-Based Age Group Estimation: Role in Recognizing Age-Separated Face Images.

PubMed

Sajid, Muhammad; Taj, Imtiaz Ahmad; Bajwa, Usama Ijaz; Ratyal, Naeem Iqbal

2018-04-23

Face recognition aims to establish the identity of a person based on facial characteristics. On the other hand, age group estimation is the automatic calculation of an individual's age range based on facial features. Recognizing age-separated face images is still a challenging research problem due to complex aging processes involving different types of facial tissues, skin, fat, muscles, and bones. Certain holistic and local facial features are used to recognize age-separated face images. However, most of the existing methods recognize face images without incorporating the knowledge learned from age group estimation. In this paper, we propose an age-assisted face recognition approach to handle aging variations. Inspired by the observation that facial asymmetry is an age-dependent intrinsic facial feature, we first use asymmetric facial dimensions to estimate the age group of a given face image. Deeply learned asymmetric facial features are then extracted for face recognition using a deep convolutional neural network (dCNN). Finally, we integrate the knowledge learned from the age group estimation into the face recognition algorithm using the same dCNN. This integration results in a significant improvement in the overall performance compared to using the face recognition algorithm alone. The experimental results on two large facial aging datasets, the MORPH and FERET sets, show that the proposed age group estimation based on the face recognition approach yields superior performance compared to some existing state-of-the-art methods. © 2018 American Academy of Forensic Sciences.
A practical approach for writer-dependent symbol recognition using a writer-independent symbol recognizer.

PubMed

LaViola, Joseph J; Zeleznik, Robert C

2007-11-01

We present a practical technique for using a writer-independent recognition engine to improve the accuracy and speed while reducing the training requirements of a writer-dependent symbol recognizer. Our writer-dependent recognizer uses a set of binary classifiers based on the AdaBoost learning algorithm, one for each possible pairwise symbol comparison. Each classifier consists of a set of weak learners, one of which is based on a writer-independent handwriting recognizer. During online recognition, we also use the n-best list of the writer-independent recognizer to prune the set of possible symbols and thus reduce the number of required binary classifications. In this paper, we describe the geometric and statistical features used in our recognizer and our all-pairs classification algorithm. We also present the results of experiments that quantify the effect incorporating a writer-independent recognition engine into a writer-dependent recognizer has on accuracy, speed, and user training time.
Pattern recognition and feature extraction with an optical Hough transform

NASA Astrophysics Data System (ADS)

Fernández, Ariel

2016-09-01

Pattern recognition and localization along with feature extraction are image processing applications of great interest in defect inspection and robot vision among others. In comparison to purely digital methods, the attractiveness of optical processors for pattern recognition lies in their highly parallel operation and real-time processing capability. This work presents an optical implementation of the generalized Hough transform (GHT), a well-established technique for the recognition of geometrical features in binary images. Detection of a geometric feature under the GHT is accomplished by mapping the original image to an accumulator space; the large computational requirements for this mapping make the optical implementation an attractive alternative to digital- only methods. Starting from the integral representation of the GHT, it is possible to device an optical setup where the transformation is obtained, and the size and orientation parameters can be controlled, allowing for dynamic scale and orientation-variant pattern recognition. A compact system for the above purposes results from the use of an electrically tunable lens for scale control and a rotating pupil mask for orientation variation, implemented on a high-contrast spatial light modulator (SLM). Real-time (as limited by the frame rate of the device used to capture the GHT) can also be achieved, allowing for the processing of video sequences. Besides, by thresholding of the GHT (with the aid of another SLM) and inverse transforming (which is optically achieved in the incoherent system under appropriate focusing setting), the previously detected features of interest can be extracted.
Driver face recognition as a security and safety feature

NASA Astrophysics Data System (ADS)

Vetter, Volker; Giefing, Gerd-Juergen; Mai, Rudolf; Weisser, Hubert

1995-09-01

We present a driver face recognition system for comfortable access control and individual settings of automobiles. The primary goals are the prevention of car thefts and heavy accidents caused by unauthorized use (joy-riders), as well as the increase of safety through optimal settings, e.g. of the mirrors and the seat position. The person sitting on the driver's seat is observed automatically by a small video camera in the dashboard. All he has to do is to behave cooperatively, i.e. to look into the camera. A classification system validates his access. Only after a positive identification, the car can be used and the driver-specific environment (e.g. seat position, mirrors, etc.) may be set up to ensure the driver's comfort and safety. The driver identification system has been integrated in a Volkswagen research car. Recognition results are presented.
A keyword spotting model using perceptually significant energy features

NASA Astrophysics Data System (ADS)

Umakanthan, Padmalochini

The task of a keyword recognition system is to detect the presence of certain words in a conversation based on the linguistic information present in human speech. Such keyword spotting systems have applications in homeland security, telephone surveillance and human-computer interfacing. General procedure of a keyword spotting system involves feature generation and matching. In this work, new set of features that are based on the psycho-acoustic masking nature of human speech are proposed. After developing these features a time aligned pattern matching process was implemented to locate the words in a set of unknown words. A word boundary detection technique based on frame classification using the nonlinear characteristics of speech is also addressed in this work. Validation of this keyword spotting model was done using widely acclaimed Cepstral features. The experimental results indicate the viability of using these perceptually significant features as an augmented feature set in keyword spotting.

Feature-based RNN target recognition

NASA Astrophysics Data System (ADS)

Bakircioglu, Hakan; Gelenbe, Erol

1998-09-01

Detection and recognition of target signatures in sensory data obtained by synthetic aperture radar (SAR), forward- looking infrared, or laser radar, have received considerable attention in the literature. In this paper, we propose a feature based target classification methodology to detect and classify targets in cluttered SAR images, that makes use of selective signature data from sensory data, together with a neural network technique which uses a set of trained networks based on the Random Neural Network (RNN) model (Gelenbe 89, 90, 91, 93) which is trained to act as a matched filter. We propose and investigate radial features of target shapes that are invariant to rotation, translation, and scale, to characterize target and clutter signatures. These features are then used to train a set of learning RNNs which can be used to detect targets within clutter with high accuracy, and to classify the targets or man-made objects from natural clutter. Experimental data from SAR imagery is used to illustrate and validate the proposed method, and to calculate Receiver Operating Characteristics which illustrate the performance of the proposed algorithm.
Hierarchical Recognition Scheme for Human Facial Expression Recognition Systems

PubMed Central

Siddiqi, Muhammad Hameed; Lee, Sungyoung; Lee, Young-Koo; Khan, Adil Mehmood; Truc, Phan Tran Ho

2013-01-01

Over the last decade, human facial expressions recognition (FER) has emerged as an important research area. Several factors make FER a challenging research problem. These include varying light conditions in training and test images; need for automatic and accurate face detection before feature extraction; and high similarity among different expressions that makes it difficult to distinguish these expressions with a high accuracy. This work implements a hierarchical linear discriminant analysis-based facial expressions recognition (HL-FER) system to tackle these problems. Unlike the previous systems, the HL-FER uses a pre-processing step to eliminate light effects, incorporates a new automatic face detection scheme, employs methods to extract both global and local features, and utilizes a HL-FER to overcome the problem of high similarity among different expressions. Unlike most of the previous works that were evaluated using a single dataset, the performance of the HL-FER is assessed using three publicly available datasets under three different experimental settings: n-fold cross validation based on subjects for each dataset separately; n-fold cross validation rule based on datasets; and, finally, a last set of experiments to assess the effectiveness of each module of the HL-FER separately. Weighted average recognition accuracy of 98.7% across three different datasets, using three classifiers, indicates the success of employing the HL-FER for human FER. PMID:24316568
Multiresolution analysis (discrete wavelet transform) through Daubechies family for emotion recognition in speech.

NASA Astrophysics Data System (ADS)

Campo, D.; Quintero, O. L.; Bastidas, M.

2016-04-01

We propose a study of the mathematical properties of voice as an audio signal. This work includes signals in which the channel conditions are not ideal for emotion recognition. Multiresolution analysis- discrete wavelet transform - was performed through the use of Daubechies Wavelet Family (Db1-Haar, Db6, Db8, Db10) allowing the decomposition of the initial audio signal into sets of coefficients on which a set of features was extracted and analyzed statistically in order to differentiate emotional states. ANNs proved to be a system that allows an appropriate classification of such states. This study shows that the extracted features using wavelet decomposition are enough to analyze and extract emotional content in audio signals presenting a high accuracy rate in classification of emotional states without the need to use other kinds of classical frequency-time features. Accordingly, this paper seeks to characterize mathematically the six basic emotions in humans: boredom, disgust, happiness, anxiety, anger and sadness, also included the neutrality, for a total of seven states to identify.
Temporal distance and person memory: thinking about the future changes memory for the past.

PubMed

Wyer, Natalie A; Perfect, Timothy J; Pahl, Sabine

2010-06-01

Psychological distance has been shown to influence how people construe an event such that greater distance produces high-level construal (characterized by global or holistic processing) and lesser distance produces low-level construal (characterized by detailed or feature-based processing). The present research tested the hypothesis that construal level has carryover effects on how information about an event is retrieved from memory. Two experiments manipulated temporal distance and found that greater distance (high-level construal) improves face recognition and increases retrieval of the abstract features of an event, whereas lesser distance (low-level construal) impairs face recognition and increases retrieval of the concrete details of an event. The findings have implications for transfer-inappropriate processing accounts of face recognition and event memory, and suggest potential applications in forensic settings.
Artificial Neural Network for Probabilistic Feature Recognition in Liquid Chromatography Coupled to High-Resolution Mass Spectrometry.

PubMed

Woldegebriel, Michael; Derks, Eduard

2017-01-17

In this work, a novel probabilistic untargeted feature detection algorithm for liquid chromatography coupled to high-resolution mass spectrometry (LC-HRMS) using artificial neural network (ANN) is presented. The feature detection process is approached as a pattern recognition problem, and thus, ANN was utilized as an efficient feature recognition tool. Unlike most existing feature detection algorithms, with this approach, any suspected chromatographic profile (i.e., shape of a peak) can easily be incorporated by training the network, avoiding the need to perform computationally expensive regression methods with specific mathematical models. In addition, with this method, we have shown that the high-resolution raw data can be fully utilized without applying any arbitrary thresholds or data reduction, therefore improving the sensitivity of the method for compound identification purposes. Furthermore, opposed to existing deterministic (binary) approaches, this method rather estimates the probability of a feature being present/absent at a given point of interest, thus giving chance for all data points to be propagated down the data analysis pipeline, weighed with their probability. The algorithm was tested with data sets generated from spiked samples in forensic and food safety context and has shown promising results by detecting features for all compounds in a computationally reasonable time.
Learning Compact Binary Face Descriptor for Face Recognition.

PubMed

Lu, Jiwen; Liong, Venice Erin; Zhou, Xiuzhuang; Zhou, Jie

2015-10-01

Binary feature descriptors such as local binary patterns (LBP) and its variations have been widely used in many face recognition systems due to their excellent robustness and strong discriminative power. However, most existing binary face descriptors are hand-crafted, which require strong prior knowledge to engineer them by hand. In this paper, we propose a compact binary face descriptor (CBFD) feature learning method for face representation and recognition. Given each face image, we first extract pixel difference vectors (PDVs) in local patches by computing the difference between each pixel and its neighboring pixels. Then, we learn a feature mapping to project these pixel difference vectors into low-dimensional binary vectors in an unsupervised manner, where 1) the variance of all binary codes in the training set is maximized, 2) the loss between the original real-valued codes and the learned binary codes is minimized, and 3) binary codes evenly distribute at each learned bin, so that the redundancy information in PDVs is removed and compact binary codes are obtained. Lastly, we cluster and pool these binary codes into a histogram feature as the final representation for each face image. Moreover, we propose a coupled CBFD (C-CBFD) method by reducing the modality gap of heterogeneous faces at the feature level to make our method applicable to heterogeneous face recognition. Extensive experimental results on five widely used face datasets show that our methods outperform state-of-the-art face descriptors.
Circular blurred shape model for multiclass symbol recognition.

PubMed

Escalera, Sergio; Fornés, Alicia; Pujol, Oriol; Lladós, Josep; Radeva, Petia

2011-04-01

In this paper, we propose a circular blurred shape model descriptor to deal with the problem of symbol detection and classification as a particular case of object recognition. The feature extraction is performed by capturing the spatial arrangement of significant object characteristics in a correlogram structure. The shape information from objects is shared among correlogram regions, where a prior blurring degree defines the level of distortion allowed in the symbol, making the descriptor tolerant to irregular deformations. Moreover, the descriptor is rotation invariant by definition. We validate the effectiveness of the proposed descriptor in both the multiclass symbol recognition and symbol detection domains. In order to perform the symbol detection, the descriptors are learned using a cascade of classifiers. In the case of multiclass categorization, the new feature space is learned using a set of binary classifiers which are embedded in an error-correcting output code design. The results over four symbol data sets show the significant improvements of the proposed descriptor compared to the state-of-the-art descriptors. In particular, the results are even more significant in those cases where the symbols suffer from elastic deformations.
Mexican sign language recognition using normalized moments and artificial neural networks

NASA Astrophysics Data System (ADS)

Solís-V., J.-Francisco; Toxqui-Quitl, Carina; Martínez-Martínez, David; H.-G., Margarita

2014-09-01

This work presents a framework designed for the Mexican Sign Language (MSL) recognition. A data set was recorded with 24 static signs from the MSL using 5 different versions, this MSL dataset was captured using a digital camera in incoherent light conditions. Digital Image Processing was used to segment hand gestures, a uniform background was selected to avoid using gloved hands or some special markers. Feature extraction was performed by calculating normalized geometric moments of gray scaled signs, then an Artificial Neural Network performs the recognition using a 10-fold cross validation tested in weka, the best result achieved 95.83% of recognition rate.
CNN-SVM for Microvascular Morphological Type Recognition with Data Augmentation.

PubMed

Xue, Di-Xiu; Zhang, Rong; Feng, Hui; Wang, Ya-Lei

2016-01-01

This paper focuses on the problem of feature extraction and the classification of microvascular morphological types to aid esophageal cancer detection. We present a patch-based system with a hybrid SVM model with data augmentation for intraepithelial papillary capillary loop recognition. A greedy patch-generating algorithm and a specialized CNN named NBI-Net are designed to extract hierarchical features from patches. We investigate a series of data augmentation techniques to progressively improve the prediction invariance of image scaling and rotation. For classifier boosting, SVM is used as an alternative to softmax to enhance generalization ability. The effectiveness of CNN feature representation ability is discussed for a set of widely used CNN models, including AlexNet, VGG-16, and GoogLeNet. Experiments are conducted on the NBI-ME dataset. The recognition rate is up to 92.74% on the patch level with data augmentation and classifier boosting. The results show that the combined CNN-SVM model beats models of traditional features with SVM as well as the original CNN with softmax. The synthesis results indicate that our system is able to assist clinical diagnosis to a certain extent.
Ensemble methods with simple features for document zone classification

NASA Astrophysics Data System (ADS)

Obafemi-Ajayi, Tayo; Agam, Gady; Xie, Bingqing

2012-01-01

Document layout analysis is of fundamental importance for document image understanding and information retrieval. It requires the identification of blocks extracted from a document image via features extraction and block classification. In this paper, we focus on the classification of the extracted blocks into five classes: text (machine printed), handwriting, graphics, images, and noise. We propose a new set of features for efficient classifications of these blocks. We present a comparative evaluation of three ensemble based classification algorithms (boosting, bagging, and combined model trees) in addition to other known learning algorithms. Experimental results are demonstrated for a set of 36503 zones extracted from 416 document images which were randomly selected from the tobacco legacy document collection. The results obtained verify the robustness and effectiveness of the proposed set of features in comparison to the commonly used Ocropus recognition features. When used in conjunction with the Ocropus feature set, we further improve the performance of the block classification system to obtain a classification accuracy of 99.21%.
Facial expression recognition under partial occlusion based on fusion of global and local features

NASA Astrophysics Data System (ADS)

Wang, Xiaohua; Xia, Chen; Hu, Min; Ren, Fuji

2018-04-01

Facial expression recognition under partial occlusion is a challenging research. This paper proposes a novel framework for facial expression recognition under occlusion by fusing the global and local features. In global aspect, first, information entropy are employed to locate the occluded region. Second, principal Component Analysis (PCA) method is adopted to reconstruct the occlusion region of image. After that, a replace strategy is applied to reconstruct image by replacing the occluded region with the corresponding region of the best matched image in training set, Pyramid Weber Local Descriptor (PWLD) feature is then extracted. At last, the outputs of SVM are fitted to the probabilities of the target class by using sigmoid function. For the local aspect, an overlapping block-based method is adopted to extract WLD features, and each block is weighted adaptively by information entropy, Chi-square distance and similar block summation methods are then applied to obtain the probabilities which emotion belongs to. Finally, fusion at the decision level is employed for the data fusion of the global and local features based on Dempster-Shafer theory of evidence. Experimental results on the Cohn-Kanade and JAFFE databases demonstrate the effectiveness and fault tolerance of this method.
Classification Influence of Features on Given Emotions and Its Application in Feature Selection

NASA Astrophysics Data System (ADS)

Xing, Yin; Chen, Chuang; Liu, Li-Long

2018-04-01

In order to solve the problem that there is a large amount of redundant data in high-dimensional speech emotion features, we analyze deeply the extracted speech emotion features and select better features. Firstly, a given emotion is classified by each feature. Secondly, the recognition rate is ranked in descending order. Then, the optimal threshold of features is determined by rate criterion. Finally, the better features are obtained. When applied in Berlin and Chinese emotional data set, the experimental results show that the feature selection method outperforms the other traditional methods.
Influence of quality of images recorded in far infrared on pattern recognition based on neural networks and Eigenfaces algorithm

NASA Astrophysics Data System (ADS)

Jelen, Lukasz; Kobel, Joanna; Podbielska, Halina

2003-11-01

This paper discusses the possibility of exploiting of the tennovision registration and artificial neural networks for facial recognition systems. A biometric system that is able to identify people from thermograms is presented. To identify a person we used the Eigenfaces algorithm. For the face detection in the picture the backpropagation neural network was designed. For this purpose thermograms of 10 people in various external conditions were studies. The Eigenfaces algorithm calculated an average face and then the set of characteristic features for each studied person was produced. The neural network has to detect the face in the image before it actually can be identified. We used five hidden layers for that purpose. It was shown that the errors in recognition depend on the feature extraction, for low quality pictures the error was so high as 30%. However, for pictures with a good feature extraction the results of proper identification higher then 90%, were obtained.
Evaluation of accelerometer based multi-sensor versus single-sensor activity recognition systems.

PubMed

Gao, Lei; Bourke, A K; Nelson, John

2014-06-01

Physical activity has a positive impact on people's well-being and it had been shown to decrease the occurrence of chronic diseases in the older adult population. To date, a substantial amount of research studies exist, which focus on activity recognition using inertial sensors. Many of these studies adopt a single sensor approach and focus on proposing novel features combined with complex classifiers to improve the overall recognition accuracy. In addition, the implementation of the advanced feature extraction algorithms and the complex classifiers exceed the computing ability of most current wearable sensor platforms. This paper proposes a method to adopt multiple sensors on distributed body locations to overcome this problem. The objective of the proposed system is to achieve higher recognition accuracy with "light-weight" signal processing algorithms, which run on a distributed computing based sensor system comprised of computationally efficient nodes. For analysing and evaluating the multi-sensor system, eight subjects were recruited to perform eight normal scripted activities in different life scenarios, each repeated three times. Thus a total of 192 activities were recorded resulting in 864 separate annotated activity states. The methods for designing such a multi-sensor system required consideration of the following: signal pre-processing algorithms, sampling rate, feature selection and classifier selection. Each has been investigated and the most appropriate approach is selected to achieve a trade-off between recognition accuracy and computing execution time. A comparison of six different systems, which employ single or multiple sensors, is presented. The experimental results illustrate that the proposed multi-sensor system can achieve an overall recognition accuracy of 96.4% by adopting the mean and variance features, using the Decision Tree classifier. The results demonstrate that elaborate classifiers and feature sets are not required to achieve high recognition accuracies on a multi-sensor system. Copyright © 2014 IPEM. Published by Elsevier Ltd. All rights reserved.
A novel approach for fire recognition using hybrid features and manifold learning-based classifier

NASA Astrophysics Data System (ADS)

Zhu, Rong; Hu, Xueying; Tang, Jiajun; Hu, Sheng

2018-03-01

Although image/video based fire recognition has received growing attention, an efficient and robust fire detection strategy is rarely explored. In this paper, we propose a novel approach to automatically identify the flame or smoke regions in an image. It is composed to three stages: (1) a block processing is applied to divide an image into several nonoverlapping image blocks, and these image blocks are identified as suspicious fire regions or not by using two color models and a color histogram-based similarity matching method in the HSV color space, (2) considering that compared to other information, the flame and smoke regions have significant visual characteristics, so that two kinds of image features are extracted for fire recognition, where local features are obtained based on the Scale Invariant Feature Transform (SIFT) descriptor and the Bags of Keypoints (BOK) technique, and texture features are extracted based on the Gray Level Co-occurrence Matrices (GLCM) and the Wavelet-based Analysis (WA) methods, and (3) a manifold learning-based classifier is constructed based on two image manifolds, which is designed via an improve Globular Neighborhood Locally Linear Embedding (GNLLE) algorithm, and the extracted hybrid features are used as input feature vectors to train the classifier, which is used to make decision for fire images or non fire images. Experiments and comparative analyses with four approaches are conducted on the collected image sets. The results show that the proposed approach is superior to the other ones in detecting fire and achieving a high recognition accuracy and a low error rate.
Domain Regeneration for Cross-Database Micro-Expression Recognition

NASA Astrophysics Data System (ADS)

Zong, Yuan; Zheng, Wenming; Huang, Xiaohua; Shi, Jingang; Cui, Zhen; Zhao, Guoying

2018-05-01

In this paper, we investigate the cross-database micro-expression recognition problem, where the training and testing samples are from two different micro-expression databases. Under this setting, the training and testing samples would have different feature distributions and hence the performance of most existing micro-expression recognition methods may decrease greatly. To solve this problem, we propose a simple yet effective method called Target Sample Re-Generator (TSRG) in this paper. By using TSRG, we are able to re-generate the samples from target micro-expression database and the re-generated target samples would share same or similar feature distributions with the original source samples. For this reason, we can then use the classifier learned based on the labeled source samples to accurately predict the micro-expression categories of the unlabeled target samples. To evaluate the performance of the proposed TSRG method, extensive cross-database micro-expression recognition experiments designed based on SMIC and CASME II databases are conducted. Compared with recent state-of-the-art cross-database emotion recognition methods, the proposed TSRG achieves more promising results.
Emotion recognition based on physiological changes in music listening.

PubMed

Kim, Jonghwa; André, Elisabeth

2008-12-01

Little attention has been paid so far to physiological signals for emotion recognition compared to audiovisual emotion channels such as facial expression or speech. This paper investigates the potential of physiological signals as reliable channels for emotion recognition. All essential stages of an automatic recognition system are discussed, from the recording of a physiological dataset to a feature-based multiclass classification. In order to collect a physiological dataset from multiple subjects over many weeks, we used a musical induction method which spontaneously leads subjects to real emotional states, without any deliberate lab setting. Four-channel biosensors were used to measure electromyogram, electrocardiogram, skin conductivity and respiration changes. A wide range of physiological features from various analysis domains, including time/frequency, entropy, geometric analysis, subband spectra, multiscale entropy, etc., is proposed in order to find the best emotion-relevant features and to correlate them with emotional states. The best features extracted are specified in detail and their effectiveness is proven by classification results. Classification of four musical emotions (positive/high arousal, negative/high arousal, negative/low arousal, positive/low arousal) is performed by using an extended linear discriminant analysis (pLDA). Furthermore, by exploiting a dichotomic property of the 2D emotion model, we develop a novel scheme of emotion-specific multilevel dichotomous classification (EMDC) and compare its performance with direct multiclass classification using the pLDA. Improved recognition accuracy of 95\\% and 70\\% for subject-dependent and subject-independent classification, respectively, is achieved by using the EMDC scheme.
Automatic voice recognition using traditional and artificial neural network approaches

NASA Technical Reports Server (NTRS)

Botros, Nazeih M.

1989-01-01

The main objective of this research is to develop an algorithm for isolated-word recognition. This research is focused on digital signal analysis rather than linguistic analysis of speech. Features extraction is carried out by applying a Linear Predictive Coding (LPC) algorithm with order of 10. Continuous-word and speaker independent recognition will be considered in future study after accomplishing this isolated word research. To examine the similarity between the reference and the training sets, two approaches are explored. The first is implementing traditional pattern recognition techniques where a dynamic time warping algorithm is applied to align the two sets and calculate the probability of matching by measuring the Euclidean distance between the two sets. The second is implementing a backpropagation artificial neural net model with three layers as the pattern classifier. The adaptation rule implemented in this network is the generalized least mean square (LMS) rule. The first approach has been accomplished. A vocabulary of 50 words was selected and tested. The accuracy of the algorithm was found to be around 85 percent. The second approach is in progress at the present time.
Natural scene logo recognition by joint boosting feature selection in salient regions

NASA Astrophysics Data System (ADS)

Fan, Wei; Sun, Jun; Naoi, Satoshi; Minagawa, Akihiro; Hotta, Yoshinobu

2011-01-01

Logos are considered valuable intellectual properties and a key component of the goodwill of a business. In this paper, we propose a natural scene logo recognition method which is segmentation-free and capable of processing images extremely rapidly and achieving high recognition rates. The classifiers for each logo are trained jointly, rather than independently. In this way, common features can be shared across multiple classes for better generalization. To deal with large range of aspect ratio of different logos, a set of salient regions of interest (ROI) are extracted to describe each class. We ensure the selected ROIs to be both individually informative and two-by-two weakly dependant by a Class Conditional Entropy Maximization criteria. Experimental results on a large logo database demonstrate the effectiveness and efficiency of our proposed method.
Fuzzy Intervals for Designing Structural Signature: An Application to Graphic Symbol Recognition

NASA Astrophysics Data System (ADS)

Luqman, Muhammad Muzzamil; Delalandre, Mathieu; Brouard, Thierry; Ramel, Jean-Yves; Lladós, Josep

The motivation behind our work is to present a new methodology for symbol recognition. The proposed method employs a structural approach for representing visual associations in symbols and a statistical classifier for recognition. We vectorize a graphic symbol, encode its topological and geometrical information by an attributed relational graph and compute a signature from this structural graph. We have addressed the sensitivity of structural representations to noise, by using data adapted fuzzy intervals. The joint probability distribution of signatures is encoded by a Bayesian network, which serves as a mechanism for pruning irrelevant features and choosing a subset of interesting features from structural signatures of underlying symbol set. The Bayesian network is deployed in a supervised learning scenario for recognizing query symbols. The method has been evaluated for robustness against degradations & deformations on pre-segmented 2D linear architectural & electronic symbols from GREC databases, and for its recognition abilities on symbols with context noise i.e. cropped symbols.

A simulation framework for auditory discrimination experiments: Revealing the importance of across-frequency processing in speech perception.

PubMed

Schädler, Marc René; Warzybok, Anna; Ewert, Stephan D; Kollmeier, Birger

2016-05-01

A framework for simulating auditory discrimination experiments, based on an approach from Schädler, Warzybok, Hochmuth, and Kollmeier [(2015). Int. J. Audiol. 54, 100-107] which was originally designed to predict speech recognition thresholds, is extended to also predict psychoacoustic thresholds. The proposed framework is used to assess the suitability of different auditory-inspired feature sets for a range of auditory discrimination experiments that included psychoacoustic as well as speech recognition experiments in noise. The considered experiments were 2 kHz tone-in-broadband-noise simultaneous masking depending on the tone length, spectral masking with simultaneously presented tone signals and narrow-band noise maskers, and German Matrix sentence test reception threshold in stationary and modulated noise. The employed feature sets included spectro-temporal Gabor filter bank features, Mel-frequency cepstral coefficients, logarithmically scaled Mel-spectrograms, and the internal representation of the Perception Model from Dau, Kollmeier, and Kohlrausch [(1997). J. Acoust. Soc. Am. 102(5), 2892-2905]. The proposed framework was successfully employed to simulate all experiments with a common parameter set and obtain objective thresholds with less assumptions compared to traditional modeling approaches. Depending on the feature set, the simulated reference-free thresholds were found to agree with-and hence to predict-empirical data from the literature. Across-frequency processing was found to be crucial to accurately model the lower speech reception threshold in modulated noise conditions than in stationary noise conditions.
The effects of digital signal processing features on children's speech recognition and loudness perception.

PubMed

Crukley, Jeffery; Scollie, Susan D

2014-03-01

The purpose of this study was to determine the effects of hearing instruments set to Desired Sensation Level version 5 (DSL v5) hearing instrument prescription algorithm targets and equipped with directional microphones and digital noise reduction (DNR) on children's sentence recognition in noise performance and loudness perception in a classroom environment. Ten children (ages 8-17 years) with stable, congenital sensorineural hearing losses participated in the study. Participants were fitted bilaterally with behind-the-ear hearing instruments set to DSL v5 prescriptive targets. Sentence recognition in noise was evaluated using the Bamford-Kowal-Bench Speech in Noise Test (Niquette et al., 2003). Loudness perception was evaluated using a modified version of the Contour Test of Loudness Perception (Cox, Alexander, Taylor, & Gray, 1997). Children's sentence recognition in noise performance was significantly better when using directional microphones alone or in combination with DNR than when using omnidirectional microphones alone or in combination with DNR. Children's loudness ratings for sounds above 72 dB SPL were lowest when fitted with the DSL v5 Noise prescription combined with directional microphones. DNR use showed no effect on loudness ratings. Use of the DSL v5 Noise prescription with a directional microphone improved sentence recognition in noise performance and reduced loudness perception ratings for loud sounds relative to a typical clinical reference fitting with the DSL v5 Quiet prescription with no digital signal processing features enabled. Potential clinical strategies are discussed.
Towards limb position invariant myoelectric pattern recognition using time-dependent spectral features.

PubMed

Khushaba, Rami N; Takruri, Maen; Miro, Jaime Valls; Kodagoda, Sarath

2014-07-01

Recent studies in Electromyogram (EMG) pattern recognition reveal a gap between research findings and a viable clinical implementation of myoelectric control strategies. One of the important factors contributing to the limited performance of such controllers in practice is the variation in the limb position associated with normal use as it results in different EMG patterns for the same movements when carried out at different positions. However, the end goal of the myoelectric control scheme is to allow amputees to control their prosthetics in an intuitive and accurate manner regardless of the limb position at which the movement is initiated. In an attempt to reduce the impact of limb position on EMG pattern recognition, this paper proposes a new feature extraction method that extracts a set of power spectrum characteristics directly from the time-domain. The end goal is to form a set of features invariant to limb position. Specifically, the proposed method estimates the spectral moments, spectral sparsity, spectral flux, irregularity factor, and signals power spectrum correlation. This is achieved through using Fourier transform properties to form invariants to amplification, translation and signal scaling, providing an efficient and accurate representation of the underlying EMG activity. Additionally, due to the inherent temporal structure of the EMG signal, the proposed method is applied on the global segments of EMG data as well as the sliced segments using multiple overlapped windows. The performance of the proposed features is tested on EMG data collected from eleven subjects, while implementing eight classes of movements, each at five different limb positions. Practical results indicate that the proposed feature set can achieve significant reduction in classification error rates, in comparison to other methods, with ≈8% error on average across all subjects and limb positions. A real-time implementation and demonstration is also provided and made available as a video supplement (see Appendix A). Copyright © 2014 Elsevier Ltd. All rights reserved.
Classifying performance impairment in response to sleep loss using pattern recognition algorithms on single session testing

PubMed Central

St. Hilaire, Melissa A.; Sullivan, Jason P.; Anderson, Clare; Cohen, Daniel A.; Barger, Laura K.; Lockley, Steven W.; Klerman, Elizabeth B.

2012-01-01

There is currently no “gold standard” marker of cognitive performance impairment resulting from sleep loss. We utilized pattern recognition algorithms to determine which features of data collected under controlled laboratory conditions could most reliably identify cognitive performance impairment in response to sleep loss using data from only one testing session, such as would occur in the “real world” or field conditions. A training set for testing the pattern recognition algorithms was developed using objective Psychomotor Vigilance Task (PVT) and subjective Karolinska Sleepiness Scale (KSS) data collected from laboratory studies during which subjects were sleep deprived for 26 – 52 hours. The algorithm was then tested in data from both laboratory and field experiments. The pattern recognition algorithm was able to identify performance impairment with a single testing session in individuals studied under laboratory conditions using PVT, KSS, length of time awake and time of day information with sensitivity and specificity as high as 82%. When this algorithm was tested on data collected under real-world conditions from individuals whose data were not in the training set, accuracy of predictions for individuals categorized with low performance impairment were as high as 98%. Predictions for medium and severe performance impairment were less accurate. We conclude that pattern recognition algorithms may be a promising method for identifying performance impairment in individuals using only current information about the individual’s behavior. Single testing features (e.g., number of PVT lapses) with high correlation with performance impairment in the laboratory setting may not be the best indicators of performance impairment under real-world conditions. Pattern recognition algorithms should be further tested for their ability to be used in conjunction with other assessments of sleepiness in real-world conditions to quantify performance impairment in response to sleep loss. PMID:22959616
A human performance evaluation of graphic symbol-design features.

PubMed

Samet, M G; Geiselman, R E; Landee, B M

1982-06-01

16 subjects learned each of two tactical display symbol sets (conventional symbols and iconic symbols) in turn and were then shown a series of graphic displays containing various symbol configurations. For each display, the subject was asked questions corresponding to different behavioral processes relating to symbol use (identification, search, comparison, pattern recognition). The results indicated that: (a) conventional symbols yielded faster pattern-recognition performance than iconic symbols, and iconic symbols did not yield faster identification than conventional symbols, and (b) the portrayal of additional feature information (through the use of perimeter density or vector projection coding) slowed processing of the core symbol information in four tasks, but certain symbol-design features created less perceptual interference and had greater correspondence with the portrayal of specific tactical concepts than others. The results were discussed in terms of the complexities involved in the selection of symbol design features for use in graphic tactical displays.
Experimental study on GMM-based speaker recognition

NASA Astrophysics Data System (ADS)

Ye, Wenxing; Wu, Dapeng; Nucci, Antonio

2010-04-01

Speaker recognition plays a very important role in the field of biometric security. In order to improve the recognition performance, many pattern recognition techniques have be explored in the literature. Among these techniques, the Gaussian Mixture Model (GMM) is proved to be an effective statistic model for speaker recognition and is used in most state-of-the-art speaker recognition systems. The GMM is used to represent the 'voice print' of a speaker through modeling the spectral characteristic of speech signals of the speaker. In this paper, we implement a speaker recognition system, which consists of preprocessing, Mel-Frequency Cepstrum Coefficients (MFCCs) based feature extraction, and GMM based classification. We test our system with TIDIGITS data set (325 speakers) and our own recordings of more than 200 speakers; our system achieves 100% correct recognition rate. Moreover, we also test our system under the scenario that training samples are from one language but test samples are from a different language; our system also achieves 100% correct recognition rate, which indicates that our system is language independent.
Time-frequency feature representation using multi-resolution texture analysis and acoustic activity detector for real-life speech emotion recognition.

PubMed

Wang, Kun-Ching

2015-01-14

The classification of emotional speech is mostly considered in speech-related research on human-computer interaction (HCI). In this paper, the purpose is to present a novel feature extraction based on multi-resolutions texture image information (MRTII). The MRTII feature set is derived from multi-resolution texture analysis for characterization and classification of different emotions in a speech signal. The motivation is that we have to consider emotions have different intensity values in different frequency bands. In terms of human visual perceptual, the texture property on multi-resolution of emotional speech spectrogram should be a good feature set for emotion classification in speech. Furthermore, the multi-resolution analysis on texture can give a clearer discrimination between each emotion than uniform-resolution analysis on texture. In order to provide high accuracy of emotional discrimination especially in real-life, an acoustic activity detection (AAD) algorithm must be applied into the MRTII-based feature extraction. Considering the presence of many blended emotions in real life, in this paper make use of two corpora of naturally-occurring dialogs recorded in real-life call centers. Compared with the traditional Mel-scale Frequency Cepstral Coefficients (MFCC) and the state-of-the-art features, the MRTII features also can improve the correct classification rates of proposed systems among different language databases. Experimental results show that the proposed MRTII-based feature information inspired by human visual perception of the spectrogram image can provide significant classification for real-life emotional recognition in speech.
Holistic processing, contact, and the other-race effect in face recognition.

PubMed

Zhao, Mintao; Hayward, William G; Bülthoff, Isabelle

2014-12-01

Face recognition, holistic processing, and processing of configural and featural facial information are known to be influenced by face race, with better performance for own- than other-race faces. However, whether these various other-race effects (OREs) arise from the same underlying mechanisms or from different processes remains unclear. The present study addressed this question by measuring the OREs in a set of face recognition tasks, and testing whether these OREs are correlated with each other. Participants performed different tasks probing (1) face recognition, (2) holistic processing, (3) processing of configural information, and (4) processing of featural information for both own- and other-race faces. Their contact with other-race people was also assessed with a questionnaire. The results show significant OREs in tasks testing face memory and processing of configural information, but not in tasks testing either holistic processing or processing of featural information. Importantly, there was no cross-task correlation between any of the measured OREs. Moreover, the level of other-race contact predicted only the OREs obtained in tasks testing face memory and processing of configural information. These results indicate that these various cross-race differences originate from different aspects of face processing, in contrary to the view that the ORE in face recognition is due to cross-race differences in terms of holistic processing. Copyright © 2014 The Authors. Published by Elsevier Ltd.. All rights reserved.
Comparison Analysis of Recognition Algorithms of Forest-Cover Objects on Hyperspectral Air-Borne and Space-Borne Images

NASA Astrophysics Data System (ADS)

Kozoderov, V. V.; Kondranin, T. V.; Dmitriev, E. V.

2017-12-01

The basic model for the recognition of natural and anthropogenic objects using their spectral and textural features is described in the problem of hyperspectral air-borne and space-borne imagery processing. The model is based on improvements of the Bayesian classifier that is a computational procedure of statistical decision making in machine-learning methods of pattern recognition. The principal component method is implemented to decompose the hyperspectral measurements on the basis of empirical orthogonal functions. Application examples are shown of various modifications of the Bayesian classifier and Support Vector Machine method. Examples are provided of comparing these classifiers and a metrical classifier that operates on finding the minimal Euclidean distance between different points and sets in the multidimensional feature space. A comparison is also carried out with the " K-weighted neighbors" method that is close to the nonparametric Bayesian classifier.
Extraction of edge-based and region-based features for object recognition

NASA Astrophysics Data System (ADS)

Coutts, Benjamin; Ravi, Srinivas; Hu, Gongzhu; Shrikhande, Neelima

1993-08-01

One of the central problems of computer vision is object recognition. A catalogue of model objects is described as a set of features such as edges and surfaces. The same features are extracted from the scene and matched against the models for object recognition. Edges and surfaces extracted from the scenes are often noisy and imperfect. In this paper algorithms are described for improving low level edge and surface features. Existing edge extraction algorithms are applied to the intensity image to obtain edge features. Initial edges are traced by following directions of the current contour. These are improved by using corresponding depth and intensity information for decision making at branch points. Surface fitting routines are applied to the range image to obtain planar surface patches. An algorithm of region growing is developed that starts with a coarse segmentation and uses quadric surface fitting to iteratively merge adjacent regions into quadric surfaces based on approximate orthogonal distance regression. Surface information obtained is returned to the edge extraction routine to detect and remove fake edges. This process repeats until no more merging or edge improvement can take place. Both synthetic (with Gaussian noise) and real images containing multiple object scenes have been tested using the merging criteria. Results appeared quite encouraging.
Employing wavelet-based texture features in ammunition classification

NASA Astrophysics Data System (ADS)

Borzino, Ángelo M. C. R.; Maher, Robert C.; Apolinário, José A.; de Campos, Marcello L. R.

2017-05-01

Pattern recognition, a branch of machine learning, involves classification of information in images, sounds, and other digital representations. This paper uses pattern recognition to identify which kind of ammunition was used when a bullet was fired based on a carefully constructed set of gunshot sound recordings. To do this task, we show that texture features obtained from the wavelet transform of a component of the gunshot signal, treated as an image, and quantized in gray levels, are good ammunition discriminators. We test the technique with eight different calibers and achieve a classification rate better than 95%. We also compare the performance of the proposed method with results obtained by standard temporal and spectrographic techniques
Combining feature extraction and classification for fNIRS BCIs by regularized least squares optimization.

PubMed

Heger, Dominic; Herff, Christian; Schultz, Tanja

2014-01-01

In this paper, we show that multiple operations of the typical pattern recognition chain of an fNIRS-based BCI, including feature extraction and classification, can be unified by solving a convex optimization problem. We formulate a regularized least squares problem that learns a single affine transformation of raw HbO(2) and HbR signals. We show that this transformation can achieve competitive results in an fNIRS BCI classification task, as it significantly improves recognition of different levels of workload over previously published results on a publicly available n-back data set. Furthermore, we visualize the learned models and analyze their spatio-temporal characteristics.
Combining point context and dynamic time warping for online gesture recognition

NASA Astrophysics Data System (ADS)

Mao, Xia; Li, Chen

2017-05-01

Previous gesture recognition methods usually focused on recognizing gestures after the entire gesture sequences were obtained. However, in many practical applications, a system has to identify gestures before they end to give instant feedback. We present an online gesture recognition approach that can realize early recognition of unfinished gestures with low latency. First, a curvature buffer-based point context (CBPC) descriptor is proposed to extract the shape feature of a gesture trajectory. The CBPC descriptor is a complete descriptor with a simple computation, and thus has its superiority in online scenarios. Then, we introduce an online windowed dynamic time warping algorithm to realize online matching between the ongoing gesture and the template gestures. In the algorithm, computational complexity is effectively decreased by adding a sliding window to the accumulative distance matrix. Lastly, the experiments are conducted on the Australian sign language data set and the Kinect hand gesture (KHG) data set. Results show that the proposed method outperforms other state-of-the-art methods especially when gesture information is incomplete.
Enhancement of face recognition learning in patients with brain injury using three cognitive training procedures.

PubMed

Powell, Jane; Letson, Susan; Davidoff, Jules; Valentine, Tim; Greenwood, Richard

2008-04-01

Twenty patients with impairments of face recognition, in the context of a broader pattern of cognitive deficits, were administered three new training procedures derived from contemporary theories of face processing to enhance their learning of new faces: semantic association (being given additional verbal information about the to-be-learned faces); caricaturing (presentation of caricatured versions of the faces during training and veridical versions at recognition testing); and part recognition (focusing patients on distinctive features during the training phase). Using a within-subjects design, each training procedure was applied to a different set of 10 previously unfamiliar faces and entailed six presentations of each face. In a "simple exposure" control procedure (SE), participants were given six presentations of another set of faces using the same basic protocol but with no further elaboration. Order of the four procedures was counterbalanced, and each condition was administered on a different day. A control group of 12 patients with similar levels of face recognition impairment were trained on all four sets of faces under SE conditions. Compared to the SE condition, all three training procedures resulted in more accurate discrimination between the 10 studied faces and 10 distractor faces in a post-training recognition test. This did not reflect any intrinsic lesser memorability of the faces used in the SE condition, as evidenced by the comparable performance across face sets by the control group. At the group level, the three experimental procedures were of similar efficacy, and associated cognitive deficits did not predict which technique would be most beneficial to individual patients; however, there was limited power to detect such associations. Interestingly, a pure prosopagnosic patient who was tested separately showed benefit only from the part recognition technique. Possible mechanisms for the observed effects, and implications for rehabilitation, are discussed.
Adaptive weighted local textural features for illumination, expression, and occlusion invariant face recognition

NASA Astrophysics Data System (ADS)

Cui, Chen; Asari, Vijayan K.

2014-03-01

Biometric features such as fingerprints, iris patterns, and face features help to identify people and restrict access to secure areas by performing advanced pattern analysis and matching. Face recognition is one of the most promising biometric methodologies for human identification in a non-cooperative security environment. However, the recognition results obtained by face recognition systems are a affected by several variations that may happen to the patterns in an unrestricted environment. As a result, several algorithms have been developed for extracting different facial features for face recognition. Due to the various possible challenges of data captured at different lighting conditions, viewing angles, facial expressions, and partial occlusions in natural environmental conditions, automatic facial recognition still remains as a difficult issue that needs to be resolved. In this paper, we propose a novel approach to tackling some of these issues by analyzing the local textural descriptions for facial feature representation. The textural information is extracted by an enhanced local binary pattern (ELBP) description of all the local regions of the face. The relationship of each pixel with respect to its neighborhood is extracted and employed to calculate the new representation. ELBP reconstructs a much better textural feature extraction vector from an original gray level image in different lighting conditions. The dimensionality of the texture image is reduced by principal component analysis performed on each local face region. Each low dimensional vector representing a local region is now weighted based on the significance of the sub-region. The weight of each sub-region is determined by employing the local variance estimate of the respective region, which represents the significance of the region. The final facial textural feature vector is obtained by concatenating the reduced dimensional weight sets of all the modules (sub-regions) of the face image. Experiments conducted on various popular face databases show promising performance of the proposed algorithm in varying lighting, expression, and partial occlusion conditions. Four databases were used for testing the performance of the proposed system: Yale Face database, Extended Yale Face database B, Japanese Female Facial Expression database, and CMU AMP Facial Expression database. The experimental results in all four databases show the effectiveness of the proposed system. Also, the computation cost is lower because of the simplified calculation steps. Research work is progressing to investigate the effectiveness of the proposed face recognition method on pose-varying conditions as well. It is envisaged that a multilane approach of trained frameworks at different pose bins and an appropriate voting strategy would lead to a good recognition rate in such situation.
Uniform Local Binary Pattern Based Texture-Edge Feature for 3D Human Behavior Recognition.

PubMed

Ming, Yue; Wang, Guangchao; Fan, Chunxiao

2015-01-01

With the rapid development of 3D somatosensory technology, human behavior recognition has become an important research field. Human behavior feature analysis has evolved from traditional 2D features to 3D features. In order to improve the performance of human activity recognition, a human behavior recognition method is proposed, which is based on a hybrid texture-edge local pattern coding feature extraction and integration of RGB and depth videos information. The paper mainly focuses on background subtraction on RGB and depth video sequences of behaviors, extracting and integrating historical images of the behavior outlines, feature extraction and classification. The new method of 3D human behavior recognition has achieved the rapid and efficient recognition of behavior videos. A large number of experiments show that the proposed method has faster speed and higher recognition rate. The recognition method has good robustness for different environmental colors, lightings and other factors. Meanwhile, the feature of mixed texture-edge uniform local binary pattern can be used in most 3D behavior recognition.
Illumination-invariant and deformation-tolerant inner knuckle print recognition using portable devices.

PubMed

Xu, Xuemiao; Jin, Qiang; Zhou, Le; Qin, Jing; Wong, Tien-Tsin; Han, Guoqiang

2015-02-12

We propose a novel biometric recognition method that identifies the inner knuckle print (IKP). It is robust enough to confront uncontrolled lighting conditions, pose variations and low imaging quality. Such robustness is crucial for its application on portable devices equipped with consumer-level cameras. We achieve this robustness by two means. First, we propose a novel feature extraction scheme that highlights the salient structure and suppresses incorrect and/or unwanted features. The extracted IKP features retain simple geometry and morphology and reduce the interference of illumination. Second, to counteract the deformation induced by different hand orientations, we propose a novel structure-context descriptor based on local statistics. To our best knowledge, we are the first to simultaneously consider the illumination invariance and deformation tolerance for appearance-based low-resolution hand biometrics. Settings in previous works are more restrictive. They made strong assumptions either about the illumination condition or the restrictive hand orientation. Extensive experiments demonstrate that our method outperforms the state-of-the-art methods in terms of recognition accuracy, especially under uncontrolled lighting conditions and the flexible hand orientation requirement.
Illumination-Invariant and Deformation-Tolerant Inner Knuckle Print Recognition Using Portable Devices

PubMed Central

Xu, Xuemiao; Jin, Qiang; Zhou, Le; Qin, Jing; Wong, Tien-Tsin; Han, Guoqiang

2015-01-01

We propose a novel biometric recognition method that identifies the inner knuckle print (IKP). It is robust enough to confront uncontrolled lighting conditions, pose variations and low imaging quality. Such robustness is crucial for its application on portable devices equipped with consumer-level cameras. We achieve this robustness by two means. First, we propose a novel feature extraction scheme that highlights the salient structure and suppresses incorrect and/or unwanted features. The extracted IKP features retain simple geometry and morphology and reduce the interference of illumination. Second, to counteract the deformation induced by different hand orientations, we propose a novel structure-context descriptor based on local statistics. To our best knowledge, we are the first to simultaneously consider the illumination invariance and deformation tolerance for appearance-based low-resolution hand biometrics. Settings in previous works are more restrictive. They made strong assumptions either about the illumination condition or the restrictive hand orientation. Extensive experiments demonstrate that our method outperforms the state-of-the-art methods in terms of recognition accuracy, especially under uncontrolled lighting conditions and the flexible hand orientation requirement. PMID:25686317
Efficient iris recognition by characterizing key local variations.

PubMed

Ma, Li; Tan, Tieniu; Wang, Yunhong; Zhang, Dexin

2004-06-01

Unlike other biometrics such as fingerprints and face, the distinct aspect of iris comes from randomly distributed features. This leads to its high reliability for personal identification, and at the same time, the difficulty in effectively representing such details in an image. This paper describes an efficient algorithm for iris recognition by characterizing key local variations. The basic idea is that local sharp variation points, denoting the appearing or vanishing of an important image structure, are utilized to represent the characteristics of the iris. The whole procedure of feature extraction includes two steps: 1) a set of one-dimensional intensity signals is constructed to effectively characterize the most important information of the original two-dimensional image; 2) using a particular class of wavelets, a position sequence of local sharp variation points in such signals is recorded as features. We also present a fast matching scheme based on exclusive OR operation to compute the similarity between a pair of position sequences. Experimental results on 2255 iris images show that the performance of the proposed method is encouraging and comparable to the best iris recognition algorithm found in the current literature.
Efficient Spatio-Temporal Local Binary Patterns for Spontaneous Facial Micro-Expression Recognition

PubMed Central

Wang, Yandan; See, John; Phan, Raphael C.-W.; Oh, Yee-Hui

2015-01-01

Micro-expression recognition is still in the preliminary stage, owing much to the numerous difficulties faced in the development of datasets. Since micro-expression is an important affective clue for clinical diagnosis and deceit analysis, much effort has gone into the creation of these datasets for research purposes. There are currently two publicly available spontaneous micro-expression datasets—SMIC and CASME II, both with baseline results released using the widely used dynamic texture descriptor LBP-TOP for feature extraction. Although LBP-TOP is popular and widely used, it is still not compact enough. In this paper, we draw further inspiration from the concept of LBP-TOP that considers three orthogonal planes by proposing two efficient approaches for feature extraction. The compact robust form described by the proposed LBP-Six Intersection Points (SIP) and a super-compact LBP-Three Mean Orthogonal Planes (MOP) not only preserves the essential patterns, but also reduces the redundancy that affects the discriminality of the encoded features. Through a comprehensive set of experiments, we demonstrate the strengths of our approaches in terms of recognition accuracy and efficiency. PMID:25993498

Educators' insights in using chronicle diabetes: a data management system for diabetes education.

PubMed

Wang, Jing; Siminerio, Linda M

2013-01-01

Diabetes educators lack data systems to monitor diabetes self-management education processes and programs. The purpose of the study is to explore diabetes educator's insights in using a diabetes education data management program: the Chronicle Diabetes system. We conducted 1 focus group with 8 diabetes educators who use the Chronicle system in western Pennsylvania. The focus group was audiotaped and transcribed verbatim. Themes were categorized according to system facilitators and barriers in using Chronicle. Educators report 4 system facilitators and 4 barrier features. System facilitators include (1) ability to extract data from Chronicle for education program recognition, (2) central location for collecting and documenting all patient and education data, (3) capability to monitor behavioral goal setting and clinical outcomes, and (4) use of a patient snapshot report that automatically summarizes behavioral goal setting and an education plan. Barriers reported are (1) initially time-consuming for data entry, (2) Health Insurance Portability and Accountability Act privacy concerns for e-mailing or downloading report, (3) need for special features (e.g., ability to attach a food diary), and (4) need to enhance existing features to standardize goal-setting process and incorporate psychosocial content. Educators favor capabilities for documenting program requirements, goal setting, and patient summaries. Barriers that need to be overcome are the amount of time needed for data entry, privacy, and special features. Diabetes educators conclude that a data management system such as Chronicle facilitates the education process and affords ease in documentation of meeting diabetes self-management education standards and recognition requirements.
A Random Forest-based ensemble method for activity recognition.

PubMed

Feng, Zengtao; Mo, Lingfei; Li, Meng

2015-01-01

This paper presents a multi-sensor ensemble approach to human physical activity (PA) recognition, using random forest. We designed an ensemble learning algorithm, which integrates several independent Random Forest classifiers based on different sensor feature sets to build a more stable, more accurate and faster classifier for human activity recognition. To evaluate the algorithm, PA data collected from the PAMAP (Physical Activity Monitoring for Aging People), which is a standard, publicly available database, was utilized to train and test. The experimental results show that the algorithm is able to correctly recognize 19 PA types with an accuracy of 93.44%, while the training is faster than others. The ensemble classifier system based on the RF (Random Forest) algorithm can achieve high recognition accuracy and fast calculation.
Gimli: open source and high-performance biomedical name recognition

PubMed Central

2013-01-01

Background Automatic recognition of biomedical names is an essential task in biomedical information extraction, presenting several complex and unsolved challenges. In recent years, various solutions have been implemented to tackle this problem. However, limitations regarding system characteristics, customization and usability still hinder their wider application outside text mining research. Results We present Gimli, an open-source, state-of-the-art tool for automatic recognition of biomedical names. Gimli includes an extended set of implemented and user-selectable features, such as orthographic, morphological, linguistic-based, conjunctions and dictionary-based. A simple and fast method to combine different trained models is also provided. Gimli achieves an F-measure of 87.17% on GENETAG and 72.23% on JNLPBA corpus, significantly outperforming existing open-source solutions. Conclusions Gimli is an off-the-shelf, ready to use tool for named-entity recognition, providing trained and optimized models for recognition of biomedical entities from scientific text. It can be used as a command line tool, offering full functionality, including training of new models and customization of the feature set and model parameters through a configuration file. Advanced users can integrate Gimli in their text mining workflows through the provided library, and extend or adapt its functionalities. Based on the underlying system characteristics and functionality, both for final users and developers, and on the reported performance results, we believe that Gimli is a state-of-the-art solution for biomedical NER, contributing to faster and better research in the field. Gimli is freely available at http://bioinformatics.ua.pt/gimli. PMID:23413997
Improving EMG based classification of basic hand movements using EMD.

PubMed

Sapsanis, Christos; Georgoulas, George; Tzes, Anthony; Lymberopoulos, Dimitrios

2013-01-01

This paper presents a pattern recognition approach for the identification of basic hand movements using surface electromyographic (EMG) data. The EMG signal is decomposed using Empirical Mode Decomposition (EMD) into Intrinsic Mode Functions (IMFs) and subsequently a feature extraction stage takes place. Various combinations of feature subsets are tested using a simple linear classifier for the detection task. Our results suggest that the use of EMD can increase the discrimination ability of the conventional feature sets extracted from the raw EMG signal.
Selection of Norway spruce somatic embryos by computer vision

NASA Astrophysics Data System (ADS)

Hamalainen, Jari J.; Jokinen, Kari J.

1993-05-01

A computer vision system was developed for the classification of plant somatic embryos. The embryos are in a Petri dish that is transferred with constant speed and they are recognized as they pass a line scan camera. A classification algorithm needs to be installed for every plant species. This paper describes an algorithm for the recognition of Norway spruce (Picea abies) embryos. A short review of conifer micropropagation by somatic embryogenesis is also given. The recognition algorithm is based on features calculated from the boundary of the object. Only part of the boundary corresponding to the developing cotyledons (2 - 15) and the straight sides of the embryo are used for recognition. An index of the length of the cotyledons describes the developmental stage of the embryo. The testing set for classifier performance consisted of 118 embryos and 478 nonembryos. With the classification tolerances chosen 69% of the objects classified as embryos by a human classifier were selected and 31$% rejected. Less than 1% of the nonembryos were classified as embryos. The basic features developed can probably be easily adapted for the recognition of other conifer somatic embryos.
Unconstrained handwritten numeral recognition based on radial basis competitive and cooperative networks with spatio-temporal feature representation.

PubMed

Lee, S; Pan, J J

1996-01-01

This paper presents a new approach to representation and recognition of handwritten numerals. The approach first transforms a two-dimensional (2-D) spatial representation of a numeral into a three-dimensional (3-D) spatio-temporal representation by identifying the tracing sequence based on a set of heuristic rules acting as transformation operators. A multiresolution critical-point segmentation method is then proposed to extract local feature points, at varying degrees of scale and coarseness. A new neural network architecture, referred to as radial-basis competitive and cooperative network (RCCN), is presented especially for handwritten numeral recognition. RCCN is a globally competitive and locally cooperative network with the capability of self-organizing hidden units to progressively achieve desired network performance, and functions as a universal approximator of arbitrary input-output mappings. Three types of RCCNs are explored: input-space RCCN (IRCCN), output-space RCCN (ORCCN), and bidirectional RCCN (BRCCN). Experiments against handwritten zip code numerals acquired by the U.S. Postal Service indicated that the proposed method is robust in terms of variations, deformations, transformations, and corruption, achieving about 97% recognition rate.
Higher-order neural network software for distortion invariant object recognition

NASA Technical Reports Server (NTRS)

Reid, Max B.; Spirkovska, Lilly

1991-01-01

The state-of-the-art in pattern recognition for such applications as automatic target recognition and industrial robotic vision relies on digital image processing. We present a higher-order neural network model and software which performs the complete feature extraction-pattern classification paradigm required for automatic pattern recognition. Using a third-order neural network, we demonstrate complete, 100 percent accurate invariance to distortions of scale, position, and in-plate rotation. In a higher-order neural network, feature extraction is built into the network, and does not have to be learned. Only the relatively simple classification step must be learned. This is key to achieving very rapid training. The training set is much smaller than with standard neural network software because the higher-order network only has to be shown one view of each object to be learned, not every possible view. The software and graphical user interface run on any Sun workstation. Results of the use of the neural software in autonomous robotic vision systems are presented. Such a system could have extensive application in robotic manufacturing.
Learning Hierarchical Feature Extractors for Image Recognition

DTIC Science & Technology

2012-09-01

space as a natural criterion for devising better pools. Finally, we propose ways to make coding faster and more powerful through fast convolutional...parameter is the set of pools over which the summary statistic is computed. We propose locality in feature configuration space as a natural criterion for...pooling (dotted lines) is consistently higher than average pooling (solid lines), but the gap is much less signif - icant with intersection kernel (closed
Authorship Attribution of Short Messages Using Multimodal Features

DTIC Science & Technology

2011-03-01

demodulation algorithm, but does say that it has to be able to handle two multipath 27 signals of equal power received at up to 16 µs apart. This...possible with appropriate normalization of the data. The fields of biometrics, image analysis, and handwriting analysis also use diverse feature sets...Methods of Combining Multiple Classifiers and Their Applications to Handwriting Recognition,” IEEE Transactions on Systems, Man, and Cybernetics
Generic decoding of seen and imagined objects using hierarchical visual features.

PubMed

Horikawa, Tomoyasu; Kamitani, Yukiyasu

2017-05-22

Object recognition is a key function in both human and machine vision. While brain decoding of seen and imagined objects has been achieved, the prediction is limited to training examples. We present a decoding approach for arbitrary objects using the machine vision principle that an object category is represented by a set of features rendered invariant through hierarchical processing. We show that visual features, including those derived from a deep convolutional neural network, can be predicted from fMRI patterns, and that greater accuracy is achieved for low-/high-level features with lower-/higher-level visual areas, respectively. Predicted features are used to identify seen/imagined object categories (extending beyond decoder training) from a set of computed features for numerous object images. Furthermore, decoding of imagined objects reveals progressive recruitment of higher-to-lower visual representations. Our results demonstrate a homology between human and machine vision and its utility for brain-based information retrieval.
Iris recognition using possibilistic fuzzy matching on local features.

PubMed

Tsai, Chung-Chih; Lin, Heng-Yi; Taur, Jinshiuh; Tao, Chin-Wang

2012-02-01

In this paper, we propose a novel possibilistic fuzzy matching strategy with invariant properties, which can provide a robust and effective matching scheme for two sets of iris feature points. In addition, the nonlinear normalization model is adopted to provide more accurate position before matching. Moreover, an effective iris segmentation method is proposed to refine the detected inner and outer boundaries to smooth curves. For feature extraction, the Gabor filters are adopted to detect the local feature points from the segmented iris image in the Cartesian coordinate system and to generate a rotation-invariant descriptor for each detected point. After that, the proposed matching algorithm is used to compute a similarity score for two sets of feature points from a pair of iris images. The experimental results show that the performance of our system is better than those of the systems based on the local features and is comparable to those of the typical systems.
Two-dimensional wavelet analysis based classification of gas chromatogram differential mobility spectrometry signals.

PubMed

Zhao, Weixiang; Sankaran, Shankar; Ibáñez, Ana M; Dandekar, Abhaya M; Davis, Cristina E

2009-08-04

This study introduces two-dimensional (2-D) wavelet analysis to the classification of gas chromatogram differential mobility spectrometry (GC/DMS) data which are composed of retention time, compensation voltage, and corresponding intensities. One reported method to process such large data sets is to convert 2-D signals to 1-D signals by summing intensities either across retention time or compensation voltage, but it can lose important signal information in one data dimension. A 2-D wavelet analysis approach keeps the 2-D structure of original signals, while significantly reducing data size. We applied this feature extraction method to 2-D GC/DMS signals measured from control and disordered fruit and then employed two typical classification algorithms to testify the effects of the resultant features on chemical pattern recognition. Yielding a 93.3% accuracy of separating data from control and disordered fruit samples, 2-D wavelet analysis not only proves its feasibility to extract feature from original 2-D signals but also shows its superiority over the conventional feature extraction methods including converting 2-D to 1-D and selecting distinguishable pixels from training set. Furthermore, this process does not require coupling with specific pattern recognition methods, which may help ensure wide applications of this method to 2-D spectrometry data.
Multiscale moment-based technique for object matching and recognition

NASA Astrophysics Data System (ADS)

Thio, HweeLi; Chen, Liya; Teoh, Eam-Khwang

2000-03-01

A new method is proposed to extract features from an object for matching and recognition. The features proposed are a combination of local and global characteristics -- local characteristics from the 1-D signature function that is defined to each pixel on the object boundary, global characteristics from the moments that are generated from the signature function. The boundary of the object is first extracted, then the signature function is generated by computing the angle between two lines from every point on the boundary as a function of position along the boundary. This signature function is position, scale and rotation invariant (PSRI). The shape of the signature function is then described quantitatively by using moments. The moments of the signature function are the global characters of a local feature set. Using moments as the eventual features instead of the signature function reduces the time and complexity of an object matching application. Multiscale moments are implemented to produce several sets of moments that will generate more accurate matching. Basically multiscale technique is a coarse to fine procedure and makes the proposed method more robust to noise. This method is proposed to match and recognize objects under simple transformation, such as translation, scale changes, rotation and skewing. A simple logo indexing system is implemented to illustrate the performance of the proposed method.
Time-Frequency Feature Representation Using Multi-Resolution Texture Analysis and Acoustic Activity Detector for Real-Life Speech Emotion Recognition

PubMed Central

Wang, Kun-Ching

2015-01-01

The classification of emotional speech is mostly considered in speech-related research on human-computer interaction (HCI). In this paper, the purpose is to present a novel feature extraction based on multi-resolutions texture image information (MRTII). The MRTII feature set is derived from multi-resolution texture analysis for characterization and classification of different emotions in a speech signal. The motivation is that we have to consider emotions have different intensity values in different frequency bands. In terms of human visual perceptual, the texture property on multi-resolution of emotional speech spectrogram should be a good feature set for emotion classification in speech. Furthermore, the multi-resolution analysis on texture can give a clearer discrimination between each emotion than uniform-resolution analysis on texture. In order to provide high accuracy of emotional discrimination especially in real-life, an acoustic activity detection (AAD) algorithm must be applied into the MRTII-based feature extraction. Considering the presence of many blended emotions in real life, in this paper make use of two corpora of naturally-occurring dialogs recorded in real-life call centers. Compared with the traditional Mel-scale Frequency Cepstral Coefficients (MFCC) and the state-of-the-art features, the MRTII features also can improve the correct classification rates of proposed systems among different language databases. Experimental results show that the proposed MRTII-based feature information inspired by human visual perception of the spectrogram image can provide significant classification for real-life emotional recognition in speech. PMID:25594590
Neural network and letter recognition

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lee, Hue Yeon.

Neural net architectures and learning algorithms that recognize hand written 36 alphanumeric characters are studied. The thin line input patterns written in 32 x 32 binary array are used. The system is comprised of two major components, viz. a preprocessing unit and a Recognition unit. The preprocessing unit in turn consists of three layers of neurons; the U-layer, the V-layer, and the C-layer. The functions of the U-layer is to extract local features by template matching. The correlation between the detected local features are considered. Through correlating neurons in a plane with their neighboring neurons, the V-layer would thicken themore » on-cells or lines that are groups of on-cells of the previous layer. These two correlations would yield some deformation tolerance and some of the rotational tolerance of the system. The C-layer then compresses data through the Gabor transform. Pattern dependent choice of center and wavelengths of Gabor filters is the cause of shift and scale tolerance of the system. Three different learning schemes had been investigated in the recognition unit, namely; the error back propagation learning with hidden units, a simple perceptron learning, and a competitive learning. Their performances were analyzed and compared. Since sometimes the network fails to distinguish between two letters that are inherently similar, additional ambiguity resolving neural nets are introduced on top of the above main neural net. The two dimensional Fourier transform is used as the preprocessing and the perceptron is used as the recognition unit of the ambiguity resolver. One hundred different person's handwriting sets are collected. Some of these are used as the training sets and the remainders are used as the test sets.« less
DCT-based iris recognition.

PubMed

Monro, Donald M; Rakshit, Soumyadip; Zhang, Dexin

2007-04-01

This paper presents a novel iris coding method based on differences of discrete cosine transform (DCT) coefficients of overlapped angular patches from normalized iris images. The feature extraction capabilities of the DCT are optimized on the two largest publicly available iris image data sets, 2,156 images of 308 eyes from the CASIA database and 2,955 images of 150 eyes from the Bath database. On this data, we achieve 100 percent Correct Recognition Rate (CRR) and perfect Receiver-Operating Characteristic (ROC) Curves with no registered false accepts or rejects. Individual feature bit and patch position parameters are optimized for matching through a product-of-sum approach to Hamming distance calculation. For verification, a variable threshold is applied to the distance metric and the False Acceptance Rate (FAR) and False Rejection Rate (FRR) are recorded. A new worst-case metric is proposed for predicting practical system performance in the absence of matching failures, and the worst case theoretical Equal Error Rate (EER) is predicted to be as low as 2.59 x 10(-4) on the available data sets.
Role of PFC during retrieval of recognition memory in rodents.

PubMed

Bekinschtein, Pedro; Weisstaub, Noelia

2014-01-01

One of the challenges for memory researches is the study of the neurobiology of episodic memory which is defined by the integration of all the different components of experiences that support the conscious recollection of events. The features of episodic memory includes a particular object or person ("what"), the context in which the experience took place ("where") and the particular time at which the event occurred ("when"). Although episodic memory has been mainly studied in humans, there are many studies that demonstrate these features in non-human animals. Here, we summarize a set of studies that employ different versions of recognition memory tasks in animals to study the role of the medial prefrontal cortex in episodic memory. Copyright © 2014 Elsevier Ltd. All rights reserved.
Human body as a set of biometric features identified by means of optoelectronics

NASA Astrophysics Data System (ADS)

Podbielska, Halina; Bauer, Joanna

2005-09-01

Human body posses many unique, singular features that are impossible to copy or forge. Nowadays, to establish and to ensure the public security requires specially designed devices and systems. Biometrics is a field of science and technology, exploiting human body characteristics for people recognition. It identifies the most characteristic and unique ones in order to design and construct systems capable to recognize people. In this paper some overview is given, presenting the achievements in biometrics. The verification and identification process is explained, along with the way of evaluation of biometric recognition systems. The most frequently human biometrics used in practice are shortly presented, including fingerprints, facial imaging (including thermal characteristic), hand geometry and iris patterns.
Gender-specific automatic valence recognition of affective olfactory stimulation through the analysis of the electrodermal activity.

PubMed

Greco, Alberto; Lanata, Antonio; Valenza, Gaetano; Di Francesco, Fabio; Scilingo, Enzo Pasquale

2016-08-01

This study reports on the development of a gender-specific classification system able to discern between two valence levels of smell, through information gathered from electrodermal activity (EDA) dynamics. Specifically, two odorants were administered to 32 healthy volunteers (16 males) while monitoring EDA. CvxEDA model was used to process the EDA signal and extract features from both tonic and phasic components. The feature set was used as input to a K-NN classifier implementing a leave-one-subject-out procedure. Results show strong differences in the accuracy of valence recognition between men (62.5%) and women (78%). We can conclude that affective olfactory stimulation significantly affect EDA dynamics with a highly specific gender dependency.
How well does multiple OCR error correction generalize?

NASA Astrophysics Data System (ADS)

Lund, William B.; Ringger, Eric K.; Walker, Daniel D.

2013-12-01

As the digitization of historical documents, such as newspapers, becomes more common, the need of the archive patron for accurate digital text from those documents increases. Building on our earlier work, the contributions of this paper are: 1. in demonstrating the applicability of novel methods for correcting optical character recognition (OCR) on disparate data sets, including a new synthetic training set, 2. enhancing the correction algorithm with novel features, and 3. assessing the data requirements of the correction learning method. First, we correct errors using conditional random fields (CRF) trained on synthetic training data sets in order to demonstrate the applicability of the methodology to unrelated test sets. Second, we show the strength of lexical features from the training sets on two unrelated test sets, yielding a relative reduction in word error rate on the test sets of 6.52%. New features capture the recurrence of hypothesis tokens and yield an additional relative reduction in WER of 2.30%. Further, we show that only 2.0% of the full training corpus of over 500,000 feature cases is needed to achieve correction results comparable to those using the entire training corpus, effectively reducing both the complexity of the training process and the learned correction model.

Effective Fingerprint Quality Estimation for Diverse Capture Sensors

PubMed Central

Xie, Shan Juan; Yoon, Sook; Shin, Jinwook; Park, Dong Sun

2010-01-01

Recognizing the quality of fingerprints in advance can be beneficial for improving the performance of fingerprint recognition systems. The representative features to assess the quality of fingerprint images from different types of capture sensors are known to vary. In this paper, an effective quality estimation system that can be adapted for different types of capture sensors is designed by modifying and combining a set of features including orientation certainty, local orientation quality and consistency. The proposed system extracts basic features, and generates next level features which are applicable for various types of capture sensors. The system then uses the Support Vector Machine (SVM) classifier to determine whether or not an image should be accepted as input to the recognition system. The experimental results show that the proposed method can perform better than previous methods in terms of accuracy. In the meanwhile, the proposed method has an ability to eliminate residue images from the optical and capacitive sensors, and the coarse images from thermal sensors. PMID:22163632
Pen-chant: Acoustic emissions of handwriting and drawing

NASA Astrophysics Data System (ADS)

Seniuk, Andrew G.

The sounds generated by a writing instrument ('pen-chant') provide a rich and underutilized source of information for pattern recognition. We examine the feasibility of recognition of handwritten cursive text, exclusively through an analysis of acoustic emissions. We design and implement a family of recognizers using a template matching approach, with templates and similarity measures derived variously from: smoothed amplitude signal with fixed resolution, discrete sequence of magnitudes obtained from peaks in the smoothed amplitude signal, and ordered tree obtained from a scale space signal representation. Test results are presented for recognition of isolated lowercase cursive characters and for whole words. We also present qualitative results for recognizing gestures such as circling, scratch-out, check-marks, and hatching. Our first set of results, using samples provided by the author, yield recognition rates of over 70% (alphabet) and 90% (26 words), with a confidence of +/-8%, based solely on acoustic emissions. Our second set of results uses data gathered from nine writers. These results demonstrate that acoustic emissions are a rich source of information, usable---on their own or in conjunction with image-based features---to solve pattern recognition problems. In future work, this approach can be applied to writer identification, handwriting and gesture-based computer input technology, emotion recognition, and temporal analysis of sketches.
Finessing filter scarcity problem in face recognition via multi-fold filter convolution

NASA Astrophysics Data System (ADS)

Low, Cheng-Yaw; Teoh, Andrew Beng-Jin

2017-06-01

The deep convolutional neural networks for face recognition, from DeepFace to the recent FaceNet, demand a sufficiently large volume of filters for feature extraction, in addition to being deep. The shallow filter-bank approaches, e.g., principal component analysis network (PCANet), binarized statistical image features (BSIF), and other analogous variants, endure the filter scarcity problem that not all PCA and ICA filters available are discriminative to abstract noise-free features. This paper extends our previous work on multi-fold filter convolution (ℳ-FFC), where the pre-learned PCA and ICA filter sets are exponentially diversified by ℳ folds to instantiate PCA, ICA, and PCA-ICA offspring. The experimental results unveil that the 2-FFC operation solves the filter scarcity state. The 2-FFC descriptors are also evidenced to be superior to that of PCANet, BSIF, and other face descriptors, in terms of rank-1 identification rate (%).
An Energy-Based Approach for Detection and Characterization of Subtle Entities Within Laser Scanning Point-Clouds

NASA Astrophysics Data System (ADS)

Arav, Reuma; Filin, Sagi

2016-06-01

Airborne laser scans present an optimal tool to describe geomorphological features in natural environments. However, a challenge arises in the detection of such phenomena, as they are embedded in the topography, tend to blend into their surroundings and leave only a subtle signature within the data. Most object-recognition studies address mainly urban environments and follow a general pipeline where the data are partitioned into segments with uniform properties. These approaches are restricted to man-made domain and are capable to handle limited features that answer a well-defined geometric form. As natural environments present a more complex set of features, the common interpretation of the data is still manual at large. In this paper, we propose a data-aware detection scheme, unbound to specific domains or shapes. We define the recognition question as an energy optimization problem, solved by variational means. Our approach, based on the level-set method, characterizes geometrically local surfaces within the data, and uses these characteristics as potential field for minimization. The main advantage here is that it allows topological changes of the evolving curves, such as merging and breaking. We demonstrate the proposed methodology on the detection of collapse sinkholes.
The Fisher-Markov selector: fast selecting maximally separable feature subset for multiclass classification with applications to high-dimensional data.

PubMed

Cheng, Qiang; Zhou, Hongbo; Cheng, Jie

2011-06-01

Selecting features for multiclass classification is a critically important task for pattern recognition and machine learning applications. Especially challenging is selecting an optimal subset of features from high-dimensional data, which typically have many more variables than observations and contain significant noise, missing components, or outliers. Existing methods either cannot handle high-dimensional data efficiently or scalably, or can only obtain local optimum instead of global optimum. Toward the selection of the globally optimal subset of features efficiently, we introduce a new selector--which we call the Fisher-Markov selector--to identify those features that are the most useful in describing essential differences among the possible groups. In particular, in this paper we present a way to represent essential discriminating characteristics together with the sparsity as an optimization objective. With properly identified measures for the sparseness and discriminativeness in possibly high-dimensional settings, we take a systematic approach for optimizing the measures to choose the best feature subset. We use Markov random field optimization techniques to solve the formulated objective functions for simultaneous feature selection. Our results are noncombinatorial, and they can achieve the exact global optimum of the objective function for some special kernels. The method is fast; in particular, it can be linear in the number of features and quadratic in the number of observations. We apply our procedure to a variety of real-world data, including mid--dimensional optical handwritten digit data set and high-dimensional microarray gene expression data sets. The effectiveness of our method is confirmed by experimental results. In pattern recognition and from a model selection viewpoint, our procedure says that it is possible to select the most discriminating subset of variables by solving a very simple unconstrained objective function which in fact can be obtained with an explicit expression.
Vehicle license plate recognition based on geometry restraints and multi-feature decision

NASA Astrophysics Data System (ADS)

Wu, Jianwei; Wang, Zongyue

2005-10-01

Vehicle license plate (VLP) recognition is of great importance to many traffic applications. Though researchers have paid much attention to VLP recognition there has not been a fully operational VLP recognition system yet for many reasons. This paper discusses a valid and practical method for vehicle license plate recognition based on geometry restraints and multi-feature decision including statistical and structural features. In general, the VLP recognition includes the following steps: the location of VLP, character segmentation, and character recognition. This paper discusses the three steps in detail. The characters of VLP are always declining caused by many factors, which makes it more difficult to recognize the characters of VLP, therefore geometry restraints such as the general ratio of length and width, the adjacent edges being perpendicular are used for incline correction. Image Moment has been proved to be invariant to translation, rotation and scaling therefore image moment is used as one feature for character recognition. Stroke is the basic element for writing and hence taking it as a feature is helpful to character recognition. Finally we take the image moment, the strokes and the numbers of each stroke for each character image and some other structural features and statistical features as the multi-feature to match each character image with sample character images so that each character image can be recognized by BP neural net. The proposed method combines statistical and structural features for VLP recognition, and the result shows its validity and efficiency.
Odor Recognition vs. Classification in Artificial Olfaction

NASA Astrophysics Data System (ADS)

Raman, Baranidharan; Hertz, Joshua; Benkstein, Kurt; Semancik, Steve

2011-09-01

Most studies in chemical sensing have focused on the problem of precise identification of chemical species that were exposed during the training phase (the recognition problem). However, generalization of training to predict the chemical composition of untrained gases based on their similarity with analytes in the training set (the classification problem) has received very limited attention. These two analytical tasks pose conflicting constraints on the system. While correct recognition requires detection of molecular features that are unique to an analyte, generalization to untrained chemicals requires detection of features that are common across a desired class of analytes. A simple solution that addresses both issues simultaneously can be obtained from biological olfaction, where the odor class and identity information are decoupled and extracted individually over time. Mimicking this approach, we proposed a hierarchical scheme that allowed initial discrimination between broad chemical classes (e.g. contains oxygen) followed by finer refinements using additional data into sub-classes (e.g. ketones vs. alcohols) and, eventually, specific compositions (e.g. ethanol vs. methanol) [1]. We validated this approach using an array of temperature-controlled chemiresistors. We demonstrated that a small set of training analytes is sufficient to allow generalization to novel chemicals and that the scheme provides robust categorization despite aging. Here, we provide further characterization of this approach.
Retina vascular network recognition

NASA Astrophysics Data System (ADS)

Tascini, Guido; Passerini, Giorgio; Puliti, Paolo; Zingaretti, Primo

1993-09-01

The analysis of morphological and structural modifications of the retina vascular network is an interesting investigation method in the study of diabetes and hypertension. Normally this analysis is carried out by qualitative evaluations, according to standardized criteria, though medical research attaches great importance to quantitative analysis of vessel color, shape and dimensions. The paper describes a system which automatically segments and recognizes the ocular fundus circulation and micro circulation network, and extracts a set of features related to morphometric aspects of vessels. For this class of images the classical segmentation methods seem weak. We propose a computer vision system in which segmentation and recognition phases are strictly connected. The system is hierarchically organized in four modules. Firstly the Image Enhancement Module (IEM) operates a set of custom image enhancements to remove blur and to prepare data for subsequent segmentation and recognition processes. Secondly the Papilla Border Analysis Module (PBAM) automatically recognizes number, position and local diameter of blood vessels departing from optical papilla. Then the Vessel Tracking Module (VTM) analyses vessels comparing the results of body and edge tracking and detects branches and crossings. Finally the Feature Extraction Module evaluates PBAM and VTM output data and extracts some numerical indexes. Used algorithms appear to be robust and have been successfully tested on various ocular fundus images.
Research on the feature extraction and pattern recognition of the distributed optical fiber sensing signal

NASA Astrophysics Data System (ADS)

Wang, Bingjie; Sun, Qi; Pi, Shaohua; Wu, Hongyan

2014-09-01

In this paper, feature extraction and pattern recognition of the distributed optical fiber sensing signal have been studied. We adopt Mel-Frequency Cepstral Coefficient (MFCC) feature extraction, wavelet packet energy feature extraction and wavelet packet Shannon entropy feature extraction methods to obtain sensing signals (such as speak, wind, thunder and rain signals, etc.) characteristic vectors respectively, and then perform pattern recognition via RBF neural network. Performances of these three feature extraction methods are compared according to the results. We choose MFCC characteristic vector to be 12-dimensional. For wavelet packet feature extraction, signals are decomposed into six layers by Daubechies wavelet packet transform, in which 64 frequency constituents as characteristic vector are respectively extracted. In the process of pattern recognition, the value of diffusion coefficient is introduced to increase the recognition accuracy, while keeping the samples for testing algorithm the same. Recognition results show that wavelet packet Shannon entropy feature extraction method yields the best recognition accuracy which is up to 97%; the performance of 12-dimensional MFCC feature extraction method is less satisfactory; the performance of wavelet packet energy feature extraction method is the worst.
The Precise and Efficient Identification of Medical Order Forms Using Shape Trees

NASA Astrophysics Data System (ADS)

Henker, Uwe; Petersohn, Uwe; Ultsch, Alfred

A powerful and flexible technique to identify, classify and process documents using images from a scanning process is presented. The types of documents can be described to the system as a set of differentiating features in a case base using shape trees. The features are filtered and abstracted from an extremely reduced scanner image of the document. Classification rules are stored with the cases to enable precise recognition and further mark reading and Optical Character Recognition (OCR) process. The method is implemented in a system which actually processes the majority of requests for medical lab procedures in Germany. A large practical experiment with data from practitioners was performed. An average of 97% of the forms were correctly identified; none were identified incorrectly. This meets the quality requirements for most medical applications. The modular description of the recognition process allows for a flexible adaptation of future changes to the form and content of the document’s structures.
The N400 as a snapshot of interactive processing: evidence from regression analyses of orthographic neighbor and lexical associate effects

PubMed Central

Laszlo, Sarah; Federmeier, Kara D.

2010-01-01

Linking print with meaning tends to be divided into subprocesses, such as recognition of an input's lexical entry and subsequent access of semantics. However, recent results suggest that the set of semantic features activated by an input is broader than implied by a view wherein access serially follows recognition. EEG was collected from participants who viewed items varying in number and frequency of both orthographic neighbors and lexical associates. Regression analysis of single item ERPs replicated past findings, showing that N400 amplitudes are greater for items with more neighbors, and further revealed that N400 amplitudes increase for items with more lexical associates and with higher frequency neighbors or associates. Together, the data suggest that in the N400 time window semantic features of items broadly related to inputs are active, consistent with models in which semantic access takes place in parallel with stimulus recognition. PMID:20624252
Open set recognition of aircraft in aerial imagery using synthetic template models

NASA Astrophysics Data System (ADS)

Bapst, Aleksander B.; Tran, Jonathan; Koch, Mark W.; Moya, Mary M.; Swahn, Robert

2017-05-01

Fast, accurate and robust automatic target recognition (ATR) in optical aerial imagery can provide game-changing advantages to military commanders and personnel. ATR algorithms must reject non-targets with a high degree of confidence in a world with an infinite number of possible input images. Furthermore, they must learn to recognize new targets without requiring massive data collections. Whereas most machine learning algorithms classify data in a closed set manner by mapping inputs to a fixed set of training classes, open set recognizers incorporate constraints that allow for inputs to be labelled as unknown. We have adapted two template-based open set recognizers to use computer generated synthetic images of military aircraft as training data, to provide a baseline for military-grade ATR: (1) a frequentist approach based on probabilistic fusion of extracted image features, and (2) an open set extension to the one-class support vector machine (SVM). These algorithms both use histograms of oriented gradients (HOG) as features as well as artificial augmentation of both real and synthetic image chips to take advantage of minimal training data. Our results show that open set recognizers trained with synthetic data and tested with real data can successfully discriminate real target inputs from non-targets. However, there is still a requirement for some knowledge of the real target in order to calibrate the relationship between synthetic template and target score distributions. We conclude by proposing algorithm modifications that may improve the ability of synthetic data to represent real data.
Linear and Non-Linear Visual Feature Learning in Rat and Humans

PubMed Central

Bossens, Christophe; Op de Beeck, Hans P.

2016-01-01

The visual system processes visual input in a hierarchical manner in order to extract relevant features that can be used in tasks such as invariant object recognition. Although typically investigated in primates, recent work has shown that rats can be trained in a variety of visual object and shape recognition tasks. These studies did not pinpoint the complexity of the features used by these animals. Many tasks might be solved by using a combination of relatively simple features which tend to be correlated. Alternatively, rats might extract complex features or feature combinations which are nonlinear with respect to those simple features. In the present study, we address this question by starting from a small stimulus set for which one stimulus-response mapping involves a simple linear feature to solve the task while another mapping needs a well-defined nonlinear combination of simpler features related to shape symmetry. We verified computationally that the nonlinear task cannot be trivially solved by a simple V1-model. We show how rats are able to solve the linear feature task but are unable to acquire the nonlinear feature. In contrast, humans are able to use the nonlinear feature and are even faster in uncovering this solution as compared to the linear feature. The implications for the computational capabilities of the rat visual system are discussed. PMID:28066201
Emotion recognition from speech: tools and challenges

NASA Astrophysics Data System (ADS)

Al-Talabani, Abdulbasit; Sellahewa, Harin; Jassim, Sabah A.

2015-05-01

Human emotion recognition from speech is studied frequently for its importance in many applications, e.g. human-computer interaction. There is a wide diversity and non-agreement about the basic emotion or emotion-related states on one hand and about where the emotion related information lies in the speech signal on the other side. These diversities motivate our investigations into extracting Meta-features using the PCA approach, or using a non-adaptive random projection RP, which significantly reduce the large dimensional speech feature vectors that may contain a wide range of emotion related information. Subsets of Meta-features are fused to increase the performance of the recognition model that adopts the score-based LDC classifier. We shall demonstrate that our scheme outperform the state of the art results when tested on non-prompted databases or acted databases (i.e. when subjects act specific emotions while uttering a sentence). However, the huge gap between accuracy rates achieved on the different types of datasets of speech raises questions about the way emotions modulate the speech. In particular we shall argue that emotion recognition from speech should not be dealt with as a classification problem. We shall demonstrate the presence of a spectrum of different emotions in the same speech portion especially in the non-prompted data sets, which tends to be more "natural" than the acted datasets where the subjects attempt to suppress all but one emotion.
I Hear You Eat and Speak: Automatic Recognition of Eating Condition and Food Type, Use-Cases, and Impact on ASR Performance

PubMed Central

Hantke, Simone; Weninger, Felix; Kurle, Richard; Ringeval, Fabien; Batliner, Anton; Mousa, Amr El-Desoky; Schuller, Björn

2016-01-01

We propose a new recognition task in the area of computational paralinguistics: automatic recognition of eating conditions in speech, i. e., whether people are eating while speaking, and what they are eating. To this end, we introduce the audio-visual iHEARu-EAT database featuring 1.6 k utterances of 30 subjects (mean age: 26.1 years, standard deviation: 2.66 years, gender balanced, German speakers), six types of food (Apple, Nectarine, Banana, Haribo Smurfs, Biscuit, and Crisps), and read as well as spontaneous speech, which is made publicly available for research purposes. We start with demonstrating that for automatic speech recognition (ASR), it pays off to know whether speakers are eating or not. We also propose automatic classification both by brute-forcing of low-level acoustic features as well as higher-level features related to intelligibility, obtained from an Automatic Speech Recogniser. Prediction of the eating condition was performed with a Support Vector Machine (SVM) classifier employed in a leave-one-speaker-out evaluation framework. Results show that the binary prediction of eating condition (i. e., eating or not eating) can be easily solved independently of the speaking condition; the obtained average recalls are all above 90%. Low-level acoustic features provide the best performance on spontaneous speech, which reaches up to 62.3% average recall for multi-way classification of the eating condition, i. e., discriminating the six types of food, as well as not eating. The early fusion of features related to intelligibility with the brute-forced acoustic feature set improves the performance on read speech, reaching a 66.4% average recall for the multi-way classification task. Analysing features and classifier errors leads to a suitable ordinal scale for eating conditions, on which automatic regression can be performed with up to 56.2% determination coefficient. PMID:27176486
Information based universal feature extraction

NASA Astrophysics Data System (ADS)

Amiri, Mohammad; Brause, Rüdiger

2015-02-01

In many real world image based pattern recognition tasks, the extraction and usage of task-relevant features are the most crucial part of the diagnosis. In the standard approach, they mostly remain task-specific, although humans who perform such a task always use the same image features, trained in early childhood. It seems that universal feature sets exist, but they are not yet systematically found. In our contribution, we tried to find those universal image feature sets that are valuable for most image related tasks. In our approach, we trained a neural network by natural and non-natural images of objects and background, using a Shannon information-based algorithm and learning constraints. The goal was to extract those features that give the most valuable information for classification of visual objects hand-written digits. This will give a good start and performance increase for all other image learning tasks, implementing a transfer learning approach. As result, in our case we found that we could indeed extract features which are valid in all three kinds of tasks.
Methodological Note: Analyzing Signs for Recognition & Feature Salience.

ERIC Educational Resources Information Center

Shyan, Melissa R.

1985-01-01

Presents a method to determine how signs in American Sign Language are recognized by signers. The method uses natural settings and avoids common artificialities found in prior work. A pilot study is described involving language research with Atlantic Bottlenose Dolphins in which the method was successfully used. (SED)
Automatic recognition of ship types from infrared images using superstructure moment invariants

NASA Astrophysics Data System (ADS)

Li, Heng; Wang, Xinyu

2007-11-01

Automatic object recognition is an active area of interest for military and commercial applications. In this paper, a system addressing autonomous recognition of ship types in infrared images is proposed. Firstly, an approach of segmentation based on detection of salient features of the target with subsequent shadow removing is proposed, as is the base of the subsequent object recognition. Considering the differences between the shapes of various ships mainly lie in their superstructures, we then use superstructure moment functions invariant to translation, rotation and scale differences in input patterns and develop a robust algorithm of obtaining ship superstructure. Subsequently a back-propagation neural network is used as a classifier in the recognition stage and projection images of simulated three-dimensional ship models are used as the training sets. Our recognition model was implemented and experimentally validated using both simulated three-dimensional ship model images and real images derived from video of an AN/AAS-44V Forward Looking Infrared(FLIR) sensor.
Near infrared and visible face recognition based on decision fusion of LBP and DCT features

NASA Astrophysics Data System (ADS)

Xie, Zhihua; Zhang, Shuai; Liu, Guodong; Xiong, Jinquan

2018-03-01

Visible face recognition systems, being vulnerable to illumination, expression, and pose, can not achieve robust performance in unconstrained situations. Meanwhile, near infrared face images, being light- independent, can avoid or limit the drawbacks of face recognition in visible light, but its main challenges are low resolution and signal noise ratio (SNR). Therefore, near infrared and visible fusion face recognition has become an important direction in the field of unconstrained face recognition research. In order to extract the discriminative complementary features between near infrared and visible images, in this paper, we proposed a novel near infrared and visible face fusion recognition algorithm based on DCT and LBP features. Firstly, the effective features in near-infrared face image are extracted by the low frequency part of DCT coefficients and the partition histograms of LBP operator. Secondly, the LBP features of visible-light face image are extracted to compensate for the lacking detail features of the near-infrared face image. Then, the LBP features of visible-light face image, the DCT and LBP features of near-infrared face image are sent to each classifier for labeling. Finally, decision level fusion strategy is used to obtain the final recognition result. The visible and near infrared face recognition is tested on HITSZ Lab2 visible and near infrared face database. The experiment results show that the proposed method extracts the complementary features of near-infrared and visible face images and improves the robustness of unconstrained face recognition. Especially for the circumstance of small training samples, the recognition rate of proposed method can reach 96.13%, which has improved significantly than 92.75 % of the method based on statistical feature fusion.
High-resolution face verification using pore-scale facial features.

PubMed

Li, Dong; Zhou, Huiling; Lam, Kin-Man

2015-08-01

Face recognition methods, which usually represent face images using holistic or local facial features, rely heavily on alignment. Their performances also suffer a severe degradation under variations in expressions or poses, especially when there is one gallery per subject only. With the easy access to high-resolution (HR) face images nowadays, some HR face databases have recently been developed. However, few studies have tackled the use of HR information for face recognition or verification. In this paper, we propose a pose-invariant face-verification method, which is robust to alignment errors, using the HR information based on pore-scale facial features. A new keypoint descriptor, namely, pore-Principal Component Analysis (PCA)-Scale Invariant Feature Transform (PPCASIFT)-adapted from PCA-SIFT-is devised for the extraction of a compact set of distinctive pore-scale facial features. Having matched the pore-scale features of two-face regions, an effective robust-fitting scheme is proposed for the face-verification task. Experiments show that, with one frontal-view gallery only per subject, our proposed method outperforms a number of standard verification methods, and can achieve excellent accuracy even the faces are under large variations in expression and pose.

Automated detection of pulmonary nodules in CT images with support vector machines

NASA Astrophysics Data System (ADS)

Liu, Lu; Liu, Wanyu; Sun, Xiaoming

2008-10-01

Many methods have been proposed to avoid radiologists fail to diagnose small pulmonary nodules. Recently, support vector machines (SVMs) had received an increasing attention for pattern recognition. In this paper, we present a computerized system aimed at pulmonary nodules detection; it identifies the lung field, extracts a set of candidate regions with a high sensitivity ratio and then classifies candidates by the use of SVMs. The Computer Aided Diagnosis (CAD) system presented in this paper supports the diagnosis of pulmonary nodules from Computed Tomography (CT) images as inflammation, tuberculoma, granuloma..sclerosing hemangioma, and malignant tumor. Five texture feature sets were extracted for each lesion, while a genetic algorithm based feature selection method was applied to identify the most robust features. The selected feature set was fed into an ensemble of SVMs classifiers. The achieved classification performance was 100%, 92.75% and 90.23% in the training, validation and testing set, respectively. It is concluded that computerized analysis of medical images in combination with artificial intelligence can be used in clinical practice and may contribute to more efficient diagnosis.
Self-organizing neural integration of pose-motion features for human action recognition

PubMed Central

Parisi, German I.; Weber, Cornelius; Wermter, Stefan

2015-01-01

The visual recognition of complex, articulated human movements is fundamental for a wide range of artificial systems oriented toward human-robot communication, action classification, and action-driven perception. These challenging tasks may generally involve the processing of a huge amount of visual information and learning-based mechanisms for generalizing a set of training actions and classifying new samples. To operate in natural environments, a crucial property is the efficient and robust recognition of actions, also under noisy conditions caused by, for instance, systematic sensor errors and temporarily occluded persons. Studies of the mammalian visual system and its outperforming ability to process biological motion information suggest separate neural pathways for the distinct processing of pose and motion features at multiple levels and the subsequent integration of these visual cues for action perception. We present a neurobiologically-motivated approach to achieve noise-tolerant action recognition in real time. Our model consists of self-organizing Growing When Required (GWR) networks that obtain progressively generalized representations of sensory inputs and learn inherent spatio-temporal dependencies. During the training, the GWR networks dynamically change their topological structure to better match the input space. We first extract pose and motion features from video sequences and then cluster actions in terms of prototypical pose-motion trajectories. Multi-cue trajectories from matching action frames are subsequently combined to provide action dynamics in the joint feature space. Reported experiments show that our approach outperforms previous results on a dataset of full-body actions captured with a depth sensor, and ranks among the best results for a public benchmark of domestic daily actions. PMID:26106323
An Effective 3D Shape Descriptor for Object Recognition with RGB-D Sensors

PubMed Central

Liu, Zhong; Zhao, Changchen; Wu, Xingming; Chen, Weihai

2017-01-01

RGB-D sensors have been widely used in various areas of computer vision and graphics. A good descriptor will effectively improve the performance of operation. This article further analyzes the recognition performance of shape features extracted from multi-modality source data using RGB-D sensors. A hybrid shape descriptor is proposed as a representation of objects for recognition. We first extracted five 2D shape features from contour-based images and five 3D shape features over point cloud data to capture the global and local shape characteristics of an object. The recognition performance was tested for category recognition and instance recognition. Experimental results show that the proposed shape descriptor outperforms several common global-to-global shape descriptors and is comparable to some partial-to-global shape descriptors that achieved the best accuracies in category and instance recognition. Contribution of partial features and computational complexity were also analyzed. The results indicate that the proposed shape features are strong cues for object recognition and can be combined with other features to boost accuracy. PMID:28245553
Conditional random fields for pattern recognition applied to structured data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Burr, Tom; Skurikhin, Alexei

In order to predict labels from an output domain, Y, pattern recognition is used to gather measurements from an input domain, X. Image analysis is one setting where one might want to infer whether a pixel patch contains an object that is “manmade” (such as a building) or “natural” (such as a tree). Suppose the label for a pixel patch is “manmade”; if the label for a nearby pixel patch is then more likely to be “manmade” there is structure in the output domain that can be exploited to improve pattern recognition performance. Modeling P(X) is difficult because features betweenmore » parts of the model are often correlated. Thus, conditional random fields (CRFs) model structured data using the conditional distribution P(Y|X = x), without specifying a model for P(X), and are well suited for applications with dependent features. Our paper has two parts. First, we overview CRFs and their application to pattern recognition in structured problems. Our primary examples are image analysis applications in which there is dependence among samples (pixel patches) in the output domain. Second, we identify research topics and present numerical examples.« less
Conditional random fields for pattern recognition applied to structured data

DOE PAGES

Burr, Tom; Skurikhin, Alexei

2015-07-14

In order to predict labels from an output domain, Y, pattern recognition is used to gather measurements from an input domain, X. Image analysis is one setting where one might want to infer whether a pixel patch contains an object that is “manmade” (such as a building) or “natural” (such as a tree). Suppose the label for a pixel patch is “manmade”; if the label for a nearby pixel patch is then more likely to be “manmade” there is structure in the output domain that can be exploited to improve pattern recognition performance. Modeling P(X) is difficult because features betweenmore » parts of the model are often correlated. Thus, conditional random fields (CRFs) model structured data using the conditional distribution P(Y|X = x), without specifying a model for P(X), and are well suited for applications with dependent features. Our paper has two parts. First, we overview CRFs and their application to pattern recognition in structured problems. Our primary examples are image analysis applications in which there is dependence among samples (pixel patches) in the output domain. Second, we identify research topics and present numerical examples.« less
Dissimilarity representations in lung parenchyma classification

NASA Astrophysics Data System (ADS)

Sørensen, Lauge; de Bruijne, Marleen

2009-02-01

A good problem representation is important for a pattern recognition system to be successful. The traditional approach to statistical pattern recognition is feature representation. More specifically, objects are represented by a number of features in a feature vector space, and classifiers are built in this representation. This is also the general trend in lung parenchyma classification in computed tomography (CT) images, where the features often are measures on feature histograms. Instead, we propose to build normal density based classifiers in dissimilarity representations for lung parenchyma classification. This allows for the classifiers to work on dissimilarities between objects, which might be a more natural way of representing lung parenchyma. In this context, dissimilarity is defined between CT regions of interest (ROI)s. ROIs are represented by their CT attenuation histogram and ROI dissimilarity is defined as a histogram dissimilarity measure between the attenuation histograms. In this setting, the full histograms are utilized according to the chosen histogram dissimilarity measure. We apply this idea to classification of different emphysema patterns as well as normal, healthy tissue. Two dissimilarity representation approaches as well as different histogram dissimilarity measures are considered. The approaches are evaluated on a set of 168 CT ROIs using normal density based classifiers all showing good performance. Compared to using histogram dissimilarity directly as distance in a emph{k} nearest neighbor classifier, which achieves a classification accuracy of 92.9%, the best dissimilarity representation based classifier is significantly better with a classification accuracy of 97.0% (text{emph{p" border="0" class="imgtopleft"> = 0.046).
Control chart pattern recognition using RBF neural network with new training algorithm and practical features.

PubMed

Addeh, Abdoljalil; Khormali, Aminollah; Golilarz, Noorbakhsh Amiri

2018-05-04

The control chart patterns are the most commonly used statistical process control (SPC) tools to monitor process changes. When a control chart produces an out-of-control signal, this means that the process has been changed. In this study, a new method based on optimized radial basis function neural network (RBFNN) is proposed for control chart patterns (CCPs) recognition. The proposed method consists of four main modules: feature extraction, feature selection, classification and learning algorithm. In the feature extraction module, shape and statistical features are used. Recently, various shape and statistical features have been presented for the CCPs recognition. In the feature selection module, the association rules (AR) method has been employed to select the best set of the shape and statistical features. In the classifier section, RBFNN is used and finally, in RBFNN, learning algorithm has a high impact on the network performance. Therefore, a new learning algorithm based on the bees algorithm has been used in the learning module. Most studies have considered only six patterns: Normal, Cyclic, Increasing Trend, Decreasing Trend, Upward Shift and Downward Shift. Since three patterns namely Normal, Stratification, and Systematic are very similar to each other and distinguishing them is very difficult, in most studies Stratification and Systematic have not been considered. Regarding to the continuous monitoring and control over the production process and the exact type detection of the problem encountered during the production process, eight patterns have been investigated in this study. The proposed method is tested on a dataset containing 1600 samples (200 samples from each pattern) and the results showed that the proposed method has a very good performance. Copyright © 2018 ISA. Published by Elsevier Ltd. All rights reserved.
a Fully Automated Pipeline for Classification Tasks with AN Application to Remote Sensing

NASA Astrophysics Data System (ADS)

Suzuki, K.; Claesen, M.; Takeda, H.; De Moor, B.

2016-06-01

Nowadays deep learning has been intensively in spotlight owing to its great victories at major competitions, which undeservedly pushed `shallow' machine learning methods, relatively naive/handy algorithms commonly used by industrial engineers, to the background in spite of their facilities such as small requisite amount of time/dataset for training. We, with a practical point of view, utilized shallow learning algorithms to construct a learning pipeline such that operators can utilize machine learning without any special knowledge, expensive computation environment, and a large amount of labelled data. The proposed pipeline automates a whole classification process, namely feature-selection, weighting features and the selection of the most suitable classifier with optimized hyperparameters. The configuration facilitates particle swarm optimization, one of well-known metaheuristic algorithms for the sake of generally fast and fine optimization, which enables us not only to optimize (hyper)parameters but also to determine appropriate features/classifier to the problem, which has conventionally been a priori based on domain knowledge and remained untouched or dealt with naïve algorithms such as grid search. Through experiments with the MNIST and CIFAR-10 datasets, common datasets in computer vision field for character recognition and object recognition problems respectively, our automated learning approach provides high performance considering its simple setting (i.e. non-specialized setting depending on dataset), small amount of training data, and practical learning time. Moreover, compared to deep learning the performance stays robust without almost any modification even with a remote sensing object recognition problem, which in turn indicates that there is a high possibility that our approach contributes to general classification problems.
Super-resolution method for face recognition using nonlinear mappings on coherent features.

PubMed

Huang, Hua; He, Huiting

2011-01-01

Low-resolution (LR) of face images significantly decreases the performance of face recognition. To address this problem, we present a super-resolution method that uses nonlinear mappings to infer coherent features that favor higher recognition of the nearest neighbor (NN) classifiers for recognition of single LR face image. Canonical correlation analysis is applied to establish the coherent subspaces between the principal component analysis (PCA) based features of high-resolution (HR) and LR face images. Then, a nonlinear mapping between HR/LR features can be built by radial basis functions (RBFs) with lower regression errors in the coherent feature space than in the PCA feature space. Thus, we can compute super-resolved coherent features corresponding to an input LR image according to the trained RBF model efficiently and accurately. And, face identity can be obtained by feeding these super-resolved features to a simple NN classifier. Extensive experiments on the Facial Recognition Technology, University of Manchester Institute of Science and Technology, and Olivetti Research Laboratory databases show that the proposed method outperforms the state-of-the-art face recognition algorithms for single LR image in terms of both recognition rate and robustness to facial variations of pose and expression.
Fusion of smartphone motion sensors for physical activity recognition.

PubMed

Shoaib, Muhammad; Bosch, Stephan; Incel, Ozlem Durmaz; Scholten, Hans; Havinga, Paul J M

2014-06-10

For physical activity recognition, smartphone sensors, such as an accelerometer and a gyroscope, are being utilized in many research studies. So far, particularly, the accelerometer has been extensively studied. In a few recent studies, a combination of a gyroscope, a magnetometer (in a supporting role) and an accelerometer (in a lead role) has been used with the aim to improve the recognition performance. How and when are various motion sensors, which are available on a smartphone, best used for better recognition performance, either individually or in combination? This is yet to be explored. In order to investigate this question, in this paper, we explore how these various motion sensors behave in different situations in the activity recognition process. For this purpose, we designed a data collection experiment where ten participants performed seven different activities carrying smart phones at different positions. Based on the analysis of this data set, we show that these sensors, except the magnetometer, are each capable of taking the lead roles individually, depending on the type of activity being recognized, the body position, the used data features and the classification method employed (personalized or generalized). We also show that their combination only improves the overall recognition performance when their individual performances are not very high, so that there is room for performance improvement. We have made our data set and our data collection application publicly available, thereby making our experiments reproducible.
Recognition of edible oil by using BP neural network and laser induced fluorescence spectrum

NASA Astrophysics Data System (ADS)

Mu, Tao-tao; Chen, Si-ying; Zhang, Yin-chao; Guo, Pan; Chen, He; Zhang, Hong-yan; Liu, Xiao-hua; Wang, Yuan; Bu, Zhi-chao

2013-09-01

In order to accomplish recognition of the different edible oil we set up a laser induced fluorescence spectrum system in the laboratory based on Laser induced fluorescence spectrum technology, and then collect the fluorescence spectrum of different edible oil by using that system. Based on this, we set up a fluorescence spectrum database of different cooking oil. It is clear that there are three main peak position of different edible oil from fluorescence spectrum chart. Although the peak positions of all cooking oil were almost the same, the relative intensity of different edible oils was totally different. So it could easily accomplish that oil recognition could take advantage of the difference of relative intensity. Feature invariants were extracted from the spectrum data, which were chosen from the fluorescence spectrum database randomly, before distinguishing different cooking oil. Then back propagation (BP) neural network was established and trained by the chosen data from the spectrum database. On that basis real experiment data was identified by BP neural network. It was found that the overall recognition rate could reach as high as 83.2%. Experiments showed that the laser induced fluorescence spectrum of different cooking oil was very different from each other, which could be used to accomplish the oil recognition. Laser induced fluorescence spectrum technology, combined BP neural network，was fast, high sensitivity, non-contact, and high recognition rate. It could become a new technique to accomplish the edible oil recognition and quality detection.
Concurrent evolution of feature extractors and modular artificial neural networks

NASA Astrophysics Data System (ADS)

Hannak, Victor; Savakis, Andreas; Yang, Shanchieh Jay; Anderson, Peter

2009-05-01

This paper presents a new approach for the design of feature-extracting recognition networks that do not require expert knowledge in the application domain. Feature-Extracting Recognition Networks (FERNs) are composed of interconnected functional nodes (feurons), which serve as feature extractors, and are followed by a subnetwork of traditional neural nodes (neurons) that act as classifiers. A concurrent evolutionary process (CEP) is used to search the space of feature extractors and neural networks in order to obtain an optimal recognition network that simultaneously performs feature extraction and recognition. By constraining the hill-climbing search functionality of the CEP on specific parts of the solution space, i.e., individually limiting the evolution of feature extractors and neural networks, it was demonstrated that concurrent evolution is a necessary component of the system. Application of this approach to a handwritten digit recognition task illustrates that the proposed methodology is capable of producing recognition networks that perform in-line with other methods without the need for expert knowledge in image processing.
Infrared vehicle recognition using unsupervised feature learning based on K-feature

NASA Astrophysics Data System (ADS)

Lin, Jin; Tan, Yihua; Xia, Haijiao; Tian, Jinwen

2018-02-01

Subject to the complex battlefield environment, it is difficult to establish a complete knowledge base in practical application of vehicle recognition algorithms. The infrared vehicle recognition is always difficult and challenging, which plays an important role in remote sensing. In this paper we propose a new unsupervised feature learning method based on K-feature to recognize vehicle in infrared images. First, we use the target detection algorithm which is based on the saliency to detect the initial image. Then, the unsupervised feature learning based on K-feature, which is generated by Kmeans clustering algorithm that extracted features by learning a visual dictionary from a large number of samples without label, is calculated to suppress the false alarm and improve the accuracy. Finally, the vehicle target recognition image is finished by some post-processing. Large numbers of experiments demonstrate that the proposed method has satisfy recognition effectiveness and robustness for vehicle recognition in infrared images under complex backgrounds, and it also improve the reliability of it.
Cross-Modal Retrieval With CNN Visual Features: A New Baseline.

PubMed

Wei, Yunchao; Zhao, Yao; Lu, Canyi; Wei, Shikui; Liu, Luoqi; Zhu, Zhenfeng; Yan, Shuicheng

2017-02-01

Recently, convolutional neural network (CNN) visual features have demonstrated their powerful ability as a universal representation for various recognition tasks. In this paper, cross-modal retrieval with CNN visual features is implemented with several classic methods. Specifically, off-the-shelf CNN visual features are extracted from the CNN model, which is pretrained on ImageNet with more than one million images from 1000 object categories, as a generic image representation to tackle cross-modal retrieval. To further enhance the representational ability of CNN visual features, based on the pretrained CNN model on ImageNet, a fine-tuning step is performed by using the open source Caffe CNN library for each target data set. Besides, we propose a deep semantic matching method to address the cross-modal retrieval problem with respect to samples which are annotated with one or multiple labels. Extensive experiments on five popular publicly available data sets well demonstrate the superiority of CNN visual features for cross-modal retrieval.
An acidic microenvironment sets the humoral pattern recognition molecule PTX3 in a tissue repair mode

PubMed Central

Doni, Andrea; Musso, Tiziana; Morone, Diego; Bastone, Antonio; Zambelli, Vanessa; Sironi, Marina; Castagnoli, Carlotta; Cambieri, Irene; Stravalaci, Matteo; Pasqualini, Fabio; Laface, Ilaria; Valentino, Sonia; Tartari, Silvia; Ponzetta, Andrea; Maina, Virginia; Barbieri, Silvia S.; Tremoli, Elena; Catapano, Alberico L.; Norata, Giuseppe D.; Bottazzi, Barbara; Garlanda, Cecilia

2015-01-01

Pentraxin 3 (PTX3) is a fluid-phase pattern recognition molecule and a key component of the humoral arm of innate immunity. In four different models of tissue damage in mice, PTX3 deficiency was associated with increased fibrin deposition and persistence, and thicker clots, followed by increased collagen deposition, when compared with controls. Ptx3-deficient macrophages showed defective pericellular fibrinolysis in vitro. PTX3-bound fibrinogen/fibrin and plasminogen at acidic pH and increased plasmin-mediated fibrinolysis. The second exon-encoded N-terminal domain of PTX3 recapitulated the activity of the intact molecule. Thus, a prototypic component of humoral innate immunity, PTX3, plays a nonredundant role in the orchestration of tissue repair and remodeling. Tissue acidification resulting from metabolic adaptation during tissue repair sets PTX3 in a tissue remodeling and repair mode, suggesting that matrix and microbial recognition are common, ancestral features of the humoral arm of innate immunity. PMID:25964372
An Exemplar-Based Multi-View Domain Generalization Framework for Visual Recognition.

PubMed

Niu, Li; Li, Wen; Xu, Dong; Cai, Jianfei

2018-02-01

In this paper, we propose a new exemplar-based multi-view domain generalization (EMVDG) framework for visual recognition by learning robust classifier that are able to generalize well to arbitrary target domain based on the training samples with multiple types of features (i.e., multi-view features). In this framework, we aim to address two issues simultaneously. First, the distribution of training samples (i.e., the source domain) is often considerably different from that of testing samples (i.e., the target domain), so the performance of the classifiers learnt on the source domain may drop significantly on the target domain. Moreover, the testing data are often unseen during the training procedure. Second, when the training data are associated with multi-view features, the recognition performance can be further improved by exploiting the relation among multiple types of features. To address the first issue, considering that it has been shown that fusing multiple SVM classifiers can enhance the domain generalization ability, we build our EMVDG framework upon exemplar SVMs (ESVMs), in which a set of ESVM classifiers are learnt with each one trained based on one positive training sample and all the negative training samples. When the source domain contains multiple latent domains, the learnt ESVM classifiers are expected to be grouped into multiple clusters. To address the second issue, we propose two approaches under the EMVDG framework based on the consensus principle and the complementary principle, respectively. Specifically, we propose an EMVDG_CO method by adding a co-regularizer to enforce the cluster structures of ESVM classifiers on different views to be consistent based on the consensus principle. Inspired by multiple kernel learning, we also propose another EMVDG_MK method by fusing the ESVM classifiers from different views based on the complementary principle. In addition, we further extend our EMVDG framework to exemplar-based multi-view domain adaptation (EMVDA) framework when the unlabeled target domain data are available during the training procedure. The effectiveness of our EMVDG and EMVDA frameworks for visual recognition is clearly demonstrated by comprehensive experiments on three benchmark data sets.
Recognizing Materials using Perceptually Inspired Features

PubMed Central

Sharan, Lavanya; Liu, Ce; Rosenholtz, Ruth; Adelson, Edward H.

2013-01-01

Our world consists not only of objects and scenes but also of materials of various kinds. Being able to recognize the materials that surround us (e.g., plastic, glass, concrete) is important for humans as well as for computer vision systems. Unfortunately, materials have received little attention in the visual recognition literature, and very few computer vision systems have been designed specifically to recognize materials. In this paper, we present a system for recognizing material categories from single images. We propose a set of low and mid-level image features that are based on studies of human material recognition, and we combine these features using an SVM classifier. Our system outperforms a state-of-the-art system [Varma and Zisserman, 2009] on a challenging database of real-world material categories [Sharan et al., 2009]. When the performance of our system is compared directly to that of human observers, humans outperform our system quite easily. However, when we account for the local nature of our image features and the surface properties they measure (e.g., color, texture, local shape), our system rivals human performance. We suggest that future progress in material recognition will come from: (1) a deeper understanding of the role of non-local surface properties (e.g., extended highlights, object identity); and (2) efforts to model such non-local surface properties in images. PMID:23914070
PCANet: A Simple Deep Learning Baseline for Image Classification?

PubMed

Chan, Tsung-Han; Jia, Kui; Gao, Shenghua; Lu, Jiwen; Zeng, Zinan; Ma, Yi

2015-12-01

In this paper, we propose a very simple deep learning network for image classification that is based on very basic data processing components: 1) cascaded principal component analysis (PCA); 2) binary hashing; and 3) blockwise histograms. In the proposed architecture, the PCA is employed to learn multistage filter banks. This is followed by simple binary hashing and block histograms for indexing and pooling. This architecture is thus called the PCA network (PCANet) and can be extremely easily and efficiently designed and learned. For comparison and to provide a better understanding, we also introduce and study two simple variations of PCANet: 1) RandNet and 2) LDANet. They share the same topology as PCANet, but their cascaded filters are either randomly selected or learned from linear discriminant analysis. We have extensively tested these basic networks on many benchmark visual data sets for different tasks, including Labeled Faces in the Wild (LFW) for face verification; the MultiPIE, Extended Yale B, AR, Facial Recognition Technology (FERET) data sets for face recognition; and MNIST for hand-written digit recognition. Surprisingly, for all tasks, such a seemingly naive PCANet model is on par with the state-of-the-art features either prefixed, highly hand-crafted, or carefully learned [by deep neural networks (DNNs)]. Even more surprisingly, the model sets new records for many classification tasks on the Extended Yale B, AR, and FERET data sets and on MNIST variations. Additional experiments on other public data sets also demonstrate the potential of PCANet to serve as a simple but highly competitive baseline for texture classification and object recognition.
Action Recognition Using 3D Histograms of Texture and A Multi-Class Boosting Classifier.

PubMed

Zhang, Baochang; Yang, Yun; Chen, Chen; Yang, Linlin; Han, Jungong; Shao, Ling

2017-10-01

Human action recognition is an important yet challenging task. This paper presents a low-cost descriptor called 3D histograms of texture (3DHoTs) to extract discriminant features from a sequence of depth maps. 3DHoTs are derived from projecting depth frames onto three orthogonal Cartesian planes, i.e., the frontal, side, and top planes, and thus compactly characterize the salient information of a specific action, on which texture features are calculated to represent the action. Besides this fast feature descriptor, a new multi-class boosting classifier (MBC) is also proposed to efficiently exploit different kinds of features in a unified framework for action classification. Compared with the existing boosting frameworks, we add a new multi-class constraint into the objective function, which helps to maintain a better margin distribution by maximizing the mean of margin, whereas still minimizing the variance of margin. Experiments on the MSRAction3D, MSRGesture3D, MSRActivity3D, and UTD-MHAD data sets demonstrate that the proposed system combining 3DHoTs and MBC is superior to the state of the art.
Input Decimated Ensembles

NASA Technical Reports Server (NTRS)

Tumer, Kagan; Oza, Nikunj C.; Clancy, Daniel (Technical Monitor)

2001-01-01

Using an ensemble of classifiers instead of a single classifier has been shown to improve generalization performance in many pattern recognition problems. However, the extent of such improvement depends greatly on the amount of correlation among the errors of the base classifiers. Therefore, reducing those correlations while keeping the classifiers' performance levels high is an important area of research. In this article, we explore input decimation (ID), a method which selects feature subsets for their ability to discriminate among the classes and uses them to decouple the base classifiers. We provide a summary of the theoretical benefits of correlation reduction, along with results of our method on two underwater sonar data sets, three benchmarks from the Probenl/UCI repositories, and two synthetic data sets. The results indicate that input decimated ensembles (IDEs) outperform ensembles whose base classifiers use all the input features; randomly selected subsets of features; and features created using principal components analysis, on a wide range of domains.

Feature Selection in Classification of Eye Movements Using Electrooculography for Activity Recognition

PubMed Central

Mala, S.; Latha, K.

2014-01-01

Activity recognition is needed in different requisition, for example, reconnaissance system, patient monitoring, and human-computer interfaces. Feature selection plays an important role in activity recognition, data mining, and machine learning. In selecting subset of features, an efficient evolutionary algorithm Differential Evolution (DE), a very efficient optimizer, is used for finding informative features from eye movements using electrooculography (EOG). Many researchers use EOG signals in human-computer interactions with various computational intelligence methods to analyze eye movements. The proposed system involves analysis of EOG signals using clearness based features, minimum redundancy maximum relevance features, and Differential Evolution based features. This work concentrates more on the feature selection algorithm based on DE in order to improve the classification for faultless activity recognition. PMID:25574185
Feature selection in classification of eye movements using electrooculography for activity recognition.

PubMed

Mala, S; Latha, K

2014-01-01

Activity recognition is needed in different requisition, for example, reconnaissance system, patient monitoring, and human-computer interfaces. Feature selection plays an important role in activity recognition, data mining, and machine learning. In selecting subset of features, an efficient evolutionary algorithm Differential Evolution (DE), a very efficient optimizer, is used for finding informative features from eye movements using electrooculography (EOG). Many researchers use EOG signals in human-computer interactions with various computational intelligence methods to analyze eye movements. The proposed system involves analysis of EOG signals using clearness based features, minimum redundancy maximum relevance features, and Differential Evolution based features. This work concentrates more on the feature selection algorithm based on DE in order to improve the classification for faultless activity recognition.
Horror Image Recognition Based on Context-Aware Multi-Instance Learning.

PubMed

Li, Bing; Xiong, Weihua; Wu, Ou; Hu, Weiming; Maybank, Stephen; Yan, Shuicheng

2015-12-01

Horror content sharing on the Web is a growing phenomenon that can interfere with our daily life and affect the mental health of those involved. As an important form of expression, horror images have their own characteristics that can evoke extreme emotions. In this paper, we present a novel context-aware multi-instance learning (CMIL) algorithm for horror image recognition. The CMIL algorithm identifies horror images and picks out the regions that cause the sensation of horror in these horror images. It obtains contextual cues among adjacent regions in an image using a random walk on a contextual graph. Borrowing the strength of the fuzzy support vector machine (FSVM), we define a heuristic optimization procedure based on the FSVM to search for the optimal classifier for the CMIL. To improve the initialization of the CMIL, we propose a novel visual saliency model based on the tensor analysis. The average saliency value of each segmented region is set as its initial fuzzy membership in the CMIL. The advantage of the tensor-based visual saliency model is that it not only adaptively selects features, but also dynamically determines fusion weights for saliency value combination from different feature subspaces. The effectiveness of the proposed CMIL model is demonstrated by its use in horror image recognition on two large-scale image sets collected from the Internet.
Natural image statistics and low-complexity feature selection.

PubMed

Vasconcelos, Manuela; Vasconcelos, Nuno

2009-02-01

Low-complexity feature selection is analyzed in the context of visual recognition. It is hypothesized that high-order dependences of bandpass features contain little information for discrimination of natural images. This hypothesis is characterized formally by the introduction of the concepts of conjunctive interference and decomposability order of a feature set. Necessary and sufficient conditions for the feasibility of low-complexity feature selection are then derived in terms of these concepts. It is shown that the intrinsic complexity of feature selection is determined by the decomposability order of the feature set and not its dimension. Feature selection algorithms are then derived for all levels of complexity and are shown to be approximated by existing information-theoretic methods, which they consistently outperform. The new algorithms are also used to objectively test the hypothesis of low decomposability order through comparison of classification performance. It is shown that, for image classification, the gain of modeling feature dependencies has strongly diminishing returns: best results are obtained under the assumption of decomposability order 1. This suggests a generic law for bandpass features extracted from natural images: that the effect, on the dependence of any two features, of observing any other feature is constant across image classes.
Attention-based image similarity measure with application to content-based information retrieval

NASA Astrophysics Data System (ADS)

Stentiford, Fred W. M.

2003-01-01

Whilst storage and capture technologies are able to cope with huge numbers of images, image retrieval is in danger of rendering many repositories valueless because of the difficulty of access. This paper proposes a similarity measure that imposes only very weak assumptions on the nature of the features used in the recognition process. This approach does not make use of a pre-defined set of feature measurements which are extracted from a query image and used to match those from database images, but instead generates features on a trial and error basis during the calculation of the similarity measure. This has the significant advantage that features that determine similarity can match whatever image property is important in a particular region whether it be a shape, a texture, a colour or a combination of all three. It means that effort is expended searching for the best feature for the region rather than expecting that a fixed feature set will perform optimally over the whole area of an image and over every image in a database. The similarity measure is evaluated on a problem of distinguishing similar shapes in sets of black and white symbols.
WND-CHARM: Multi-purpose image classification using compound image transforms

PubMed Central

Orlov, Nikita; Shamir, Lior; Macura, Tomasz; Johnston, Josiah; Eckley, D. Mark; Goldberg, Ilya G.

2008-01-01

We describe a multi-purpose image classifier that can be applied to a wide variety of image classification tasks without modifications or fine-tuning, and yet provide classification accuracy comparable to state-of-the-art task-specific image classifiers. The proposed image classifier first extracts a large set of 1025 image features including polynomial decompositions, high contrast features, pixel statistics, and textures. These features are computed on the raw image, transforms of the image, and transforms of transforms of the image. The feature values are then used to classify test images into a set of pre-defined image classes. This classifier was tested on several different problems including biological image classification and face recognition. Although we cannot make a claim of universality, our experimental results show that this classifier performs as well or better than classifiers developed specifically for these image classification tasks. Our classifier’s high performance on a variety of classification problems is attributed to (i) a large set of features extracted from images; and (ii) an effective feature selection and weighting algorithm sensitive to specific image classification problems. The algorithms are available for free download from openmicroscopy.org. PMID:18958301
Emotion recognition based on multiple order features using fractional Fourier transform

NASA Astrophysics Data System (ADS)

Ren, Bo; Liu, Deyin; Qi, Lin

2017-07-01

In order to deal with the insufficiency of recently algorithms based on Two Dimensions Fractional Fourier Transform (2D-FrFT), this paper proposes a multiple order features based method for emotion recognition. Most existing methods utilize the feature of single order or a couple of orders of 2D-FrFT. However, different orders of 2D-FrFT have different contributions on the feature extraction of emotion recognition. Combination of these features can enhance the performance of an emotion recognition system. The proposed approach obtains numerous features that extracted in different orders of 2D-FrFT in the directions of x-axis and y-axis, and uses the statistical magnitudes as the final feature vectors for recognition. The Support Vector Machine (SVM) is utilized for the classification and RML Emotion database and Cohn-Kanade (CK) database are used for the experiment. The experimental results demonstrate the effectiveness of the proposed method.
Visual Saliency Detection Based on Multiscale Deep CNN Features.

PubMed

Guanbin Li; Yizhou Yu

2016-11-01

Visual saliency is a fundamental problem in both cognitive and computational sciences, including computer vision. In this paper, we discover that a high-quality visual saliency model can be learned from multiscale features extracted using deep convolutional neural networks (CNNs), which have had many successes in visual recognition tasks. For learning such saliency models, we introduce a neural network architecture, which has fully connected layers on top of CNNs responsible for feature extraction at three different scales. The penultimate layer of our neural network has been confirmed to be a discriminative high-level feature vector for saliency detection, which we call deep contrast feature. To generate a more robust feature, we integrate handcrafted low-level features with our deep contrast feature. To promote further research and evaluation of visual saliency models, we also construct a new large database of 4447 challenging images and their pixelwise saliency annotations. Experimental results demonstrate that our proposed method is capable of achieving the state-of-the-art performance on all public benchmarks, improving the F-measure by 6.12% and 10%, respectively, on the DUT-OMRON data set and our new data set (HKU-IS), and lowering the mean absolute error by 9% and 35.3%, respectively, on these two data sets.
A Novel Locally Linear KNN Method With Applications to Visual Recognition.

PubMed

Liu, Qingfeng; Liu, Chengjun

2017-09-01

A locally linear K Nearest Neighbor (LLK) method is presented in this paper with applications to robust visual recognition. Specifically, the concept of an ideal representation is first presented, which improves upon the traditional sparse representation in many ways. The objective function based on a host of criteria for sparsity, locality, and reconstruction is then optimized to derive a novel representation, which is an approximation to the ideal representation. The novel representation is further processed by two classifiers, namely, an LLK-based classifier and a locally linear nearest mean-based classifier, for visual recognition. The proposed classifiers are shown to connect to the Bayes decision rule for minimum error. Additional new theoretical analysis is presented, such as the nonnegative constraint, the group regularization, and the computational efficiency of the proposed LLK method. New methods such as a shifted power transformation for improving reliability, a coefficients' truncating method for enhancing generalization, and an improved marginal Fisher analysis method for feature extraction are proposed to further improve visual recognition performance. Extensive experiments are implemented to evaluate the proposed LLK method for robust visual recognition. In particular, eight representative data sets are applied for assessing the performance of the LLK method for various visual recognition applications, such as action recognition, scene recognition, object recognition, and face recognition.
Detection of facilities in satellite imagery using semi-supervised image classification and auxiliary contextual observables

DOE Office of Scientific and Technical Information (OSTI.GOV)

Harvey, Neal R; Ruggiero, Christy E; Pawley, Norma H

2009-01-01

Detecting complex targets, such as facilities, in commercially available satellite imagery is a difficult problem that human analysts try to solve by applying world knowledge. Often there are known observables that can be extracted by pixel-level feature detectors that can assist in the facility detection process. Individually, each of these observables is not sufficient for an accurate and reliable detection, but in combination, these auxiliary observables may provide sufficient context for detection by a machine learning algorithm. We describe an approach for automatic detection of facilities that uses an automated feature extraction algorithm to extract auxiliary observables, and a semi-supervisedmore » assisted target recognition algorithm to then identify facilities of interest. We illustrate the approach using an example of finding schools in Quickbird image data of Albuquerque, New Mexico. We use Los Alamos National Laboratory's Genie Pro automated feature extraction algorithm to find a set of auxiliary features that should be useful in the search for schools, such as parking lots, large buildings, sports fields and residential areas and then combine these features using Genie Pro's assisted target recognition algorithm to learn a classifier that finds schools in the image data.« less
Environmental modeling and recognition for an autonomous land vehicle

NASA Technical Reports Server (NTRS)

Lawton, D. T.; Levitt, T. S.; Mcconnell, C. C.; Nelson, P. C.

1987-01-01

An architecture for object modeling and recognition for an autonomous land vehicle is presented. Examples of objects of interest include terrain features, fields, roads, horizon features, trees, etc. The architecture is organized around a set of data bases for generic object models and perceptual structures, temporary memory for the instantiation of object and relational hypotheses, and a long term memory for storing stable hypotheses that are affixed to the terrain representation. Multiple inference processes operate over these databases. Researchers describe these particular components: the perceptual structure database, the grouping processes that operate over this, schemas, and the long term terrain database. A processing example that matches predictions from the long term terrain model to imagery, extracts significant perceptual structures for consideration as potential landmarks, and extracts a relational structure to update the long term terrain database is given.
Enhancing speech recognition using improved particle swarm optimization based hidden Markov model.

PubMed

Selvaraj, Lokesh; Ganesan, Balakrishnan

2014-01-01

Enhancing speech recognition is the primary intention of this work. In this paper a novel speech recognition method based on vector quantization and improved particle swarm optimization (IPSO) is suggested. The suggested methodology contains four stages, namely, (i) denoising, (ii) feature mining (iii), vector quantization, and (iv) IPSO based hidden Markov model (HMM) technique (IP-HMM). At first, the speech signals are denoised using median filter. Next, characteristics such as peak, pitch spectrum, Mel frequency Cepstral coefficients (MFCC), mean, standard deviation, and minimum and maximum of the signal are extorted from the denoised signal. Following that, to accomplish the training process, the extracted characteristics are given to genetic algorithm based codebook generation in vector quantization. The initial populations are created by selecting random code vectors from the training set for the codebooks for the genetic algorithm process and IP-HMM helps in doing the recognition. At this point the creativeness will be done in terms of one of the genetic operation crossovers. The proposed speech recognition technique offers 97.14% accuracy.
3D abnormal behavior recognition in power generation

NASA Astrophysics Data System (ADS)

Wei, Zhenhua; Li, Xuesen; Su, Jie; Lin, Jie

2011-06-01

So far most research of human behavior recognition focus on simple individual behavior, such as wave, crouch, jump and bend. This paper will focus on abnormal behavior with objects carrying in power generation. Such as using mobile communication device in main control room, taking helmet off during working and lying down in high place. Taking account of the color and shape are fixed, we adopted edge detecting by color tracking to recognize object in worker. This paper introduces a method, which using geometric character of skeleton and its angle to express sequence of three-dimensional human behavior data. Then adopting Semi-join critical step Hidden Markov Model, weighing probability of critical steps' output to reduce the computational complexity. Training model for every behavior, mean while select some skeleton frames from 3D behavior sample to form a critical step set. This set is a bridge linking 2D observation behavior with 3D human joints feature. The 3D reconstruction is not required during the 2D behavior recognition phase. In the beginning of recognition progress, finding the best match for every frame of 2D observed sample in 3D skeleton set. After that, 2D observed skeleton frames sample will be identified as a specifically 3D behavior by behavior-classifier. The effectiveness of the proposed algorithm is demonstrated with experiments in similar power generation environment.
Picture object recognition in an American black bear (Ursus americanus).

PubMed

Johnson-Ulrich, Zoe; Vonk, Jennifer; Humbyrd, Mary; Crowley, Marilyn; Wojtkowski, Ela; Yates, Florence; Allard, Stephanie

2016-11-01

Many animals have been tested for conceptual discriminations using two-dimensional images as stimuli, and many of these species appear to transfer knowledge from 2D images to analogous real life objects. We tested an American black bear for picture-object recognition using a two alternative forced choice task. She was presented with four unique sets of objects and corresponding pictures. The bear showed generalization from both objects to pictures and pictures to objects; however, her transfer was superior when transferring from real objects to pictures, suggesting that bears can recognize visual features from real objects within photographic images during discriminations.
Achromatical Optical Correlator

NASA Technical Reports Server (NTRS)

Chao, Tien-Hsin; Liu, Hua-Kuang

1989-01-01

Signal-to-noise ratio exceeds that of monochromatic correlator. Achromatical optical correlator uses multiple-pinhole diffraction of dispersed white light to form superposed multiple correlations of input and reference images in output plane. Set of matched spatial filters made by multiple-exposure holographic process, each exposure using suitably-scaled input image and suitable angle of reference beam. Recording-aperture mask translated to appropriate horizontal position for each exposure. Noncoherent illumination suitable for applications involving recognition of color and determination of scale. When fully developed achromatical correlators will be useful for recognition of patterns; for example, in industrial inspection and search for selected features in aerial photographs.
Gabor-based kernel PCA with fractional power polynomial models for face recognition.

PubMed

Liu, Chengjun

2004-05-01

This paper presents a novel Gabor-based kernel Principal Component Analysis (PCA) method by integrating the Gabor wavelet representation of face images and the kernel PCA method for face recognition. Gabor wavelets first derive desirable facial features characterized by spatial frequency, spatial locality, and orientation selectivity to cope with the variations due to illumination and facial expression changes. The kernel PCA method is then extended to include fractional power polynomial models for enhanced face recognition performance. A fractional power polynomial, however, does not necessarily define a kernel function, as it might not define a positive semidefinite Gram matrix. Note that the sigmoid kernels, one of the three classes of widely used kernel functions (polynomial kernels, Gaussian kernels, and sigmoid kernels), do not actually define a positive semidefinite Gram matrix either. Nevertheless, the sigmoid kernels have been successfully used in practice, such as in building support vector machines. In order to derive real kernel PCA features, we apply only those kernel PCA eigenvectors that are associated with positive eigenvalues. The feasibility of the Gabor-based kernel PCA method with fractional power polynomial models has been successfully tested on both frontal and pose-angled face recognition, using two data sets from the FERET database and the CMU PIE database, respectively. The FERET data set contains 600 frontal face images of 200 subjects, while the PIE data set consists of 680 images across five poses (left and right profiles, left and right half profiles, and frontal view) with two different facial expressions (neutral and smiling) of 68 subjects. The effectiveness of the Gabor-based kernel PCA method with fractional power polynomial models is shown in terms of both absolute performance indices and comparative performance against the PCA method, the kernel PCA method with polynomial kernels, the kernel PCA method with fractional power polynomial models, the Gabor wavelet-based PCA method, and the Gabor wavelet-based kernel PCA method with polynomial kernels.
Fuzzy Computing Model of Activity Recognition on WSN Movement Data for Ubiquitous Healthcare Measurement.

PubMed

Chiang, Shu-Yin; Kan, Yao-Chiang; Chen, Yun-Shan; Tu, Ying-Ching; Lin, Hsueh-Chun

2016-12-03

Ubiquitous health care (UHC) is beneficial for patients to ensure they complete therapeutic exercises by self-management at home. We designed a fuzzy computing model that enables recognizing assigned movements in UHC with privacy. The movements are measured by the self-developed body motion sensor, which combines both accelerometer and gyroscope chips to make an inertial sensing node compliant with a wireless sensor network (WSN). The fuzzy logic process was studied to calculate the sensor signals that would entail necessary features of static postures and dynamic motions. Combinations of the features were studied and the proper feature sets were chosen with compatible fuzzy rules. Then, a fuzzy inference system (FIS) can be generated to recognize the assigned movements based on the rules. We thus implemented both fuzzy and adaptive neuro-fuzzy inference systems in the model to distinguish static and dynamic movements. The proposed model can effectively reach the recognition scope of the assigned activity. Furthermore, two exercises of upper-limb flexion in physical therapy were applied for the model in which the recognition rate can stand for the passing rate of the assigned motions. Finally, a web-based interface was developed to help remotely measure movement in physical therapy for UHC.
Fuzzy Computing Model of Activity Recognition on WSN Movement Data for Ubiquitous Healthcare Measurement

PubMed Central

Chiang, Shu-Yin; Kan, Yao-Chiang; Chen, Yun-Shan; Tu, Ying-Ching; Lin, Hsueh-Chun

2016-01-01

Ubiquitous health care (UHC) is beneficial for patients to ensure they complete therapeutic exercises by self-management at home. We designed a fuzzy computing model that enables recognizing assigned movements in UHC with privacy. The movements are measured by the self-developed body motion sensor, which combines both accelerometer and gyroscope chips to make an inertial sensing node compliant with a wireless sensor network (WSN). The fuzzy logic process was studied to calculate the sensor signals that would entail necessary features of static postures and dynamic motions. Combinations of the features were studied and the proper feature sets were chosen with compatible fuzzy rules. Then, a fuzzy inference system (FIS) can be generated to recognize the assigned movements based on the rules. We thus implemented both fuzzy and adaptive neuro-fuzzy inference systems in the model to distinguish static and dynamic movements. The proposed model can effectively reach the recognition scope of the assigned activity. Furthermore, two exercises of upper-limb flexion in physical therapy were applied for the model in which the recognition rate can stand for the passing rate of the assigned motions. Finally, a web-based interface was developed to help remotely measure movement in physical therapy for UHC. PMID:27918482
Face recognition algorithm using extended vector quantization histogram features.

PubMed

Yan, Yan; Lee, Feifei; Wu, Xueqian; Chen, Qiu

2018-01-01

In this paper, we propose a face recognition algorithm based on a combination of vector quantization (VQ) and Markov stationary features (MSF). The VQ algorithm has been shown to be an effective method for generating features; it extracts a codevector histogram as a facial feature representation for face recognition. Still, the VQ histogram features are unable to convey spatial structural information, which to some extent limits their usefulness in discrimination. To alleviate this limitation of VQ histograms, we utilize Markov stationary features (MSF) to extend the VQ histogram-based features so as to add spatial structural information. We demonstrate the effectiveness of our proposed algorithm by achieving recognition results superior to those of several state-of-the-art methods on publicly available face databases.
Improving Speaker Recognition by Biometric Voice Deconstruction

PubMed Central

Mazaira-Fernandez, Luis Miguel; Álvarez-Marquina, Agustín; Gómez-Vilda, Pedro

2015-01-01

Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g., YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have been forcedly replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. The present study benefits from the advances achieved during last years in understanding and modeling voice production. The paper hypothesizes that a gender-dependent characterization of speakers combined with the use of a set of features derived from the components, resulting from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description about the main hypothesis and the methodology followed to extract the gender-dependent extended biometric parameters is given. Experimental validation is carried out both on a highly controlled acoustic condition database, and on a mobile phone network recorded under non-controlled acoustic conditions. PMID:26442245

Improving Speaker Recognition by Biometric Voice Deconstruction.

PubMed

Mazaira-Fernandez, Luis Miguel; Álvarez-Marquina, Agustín; Gómez-Vilda, Pedro

2015-01-01

Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g., YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have been forcedly replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. The present study benefits from the advances achieved during last years in understanding and modeling voice production. The paper hypothesizes that a gender-dependent characterization of speakers combined with the use of a set of features derived from the components, resulting from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description about the main hypothesis and the methodology followed to extract the gender-dependent extended biometric parameters is given. Experimental validation is carried out both on a highly controlled acoustic condition database, and on a mobile phone network recorded under non-controlled acoustic conditions.
The recognition of graphical patterns invariant to geometrical transformation of the models

NASA Astrophysics Data System (ADS)

Ileană, Ioan; Rotar, Corina; Muntean, Maria; Ceuca, Emilian

2010-11-01

In case that a pattern recognition system is used for images recognition (in robot vision, handwritten recognition etc.), the system must have the capacity to identify an object indifferently of its size or position in the image. The problem of the invariance of recognition can be approached in some fundamental modes. One may apply the similarity criterion used in associative recall. The original pattern is replaced by a mathematical transform that assures some invariance (e.g. the value of two-dimensional Fourier transformation is translation invariant, the value of Mellin transformation is scale invariant). In a different approach the original pattern is represented through a set of features, each of them being coded indifferently of the position, orientation or position of the pattern. Generally speaking, it is easy to obtain invariance in relation with one transformation group, but is difficult to obtain simultaneous invariance at rotation, translation and scale. In this paper we analyze some methods to achieve invariant recognition of images, particularly for digit images. A great number of experiments are due and the conclusions are underplayed in the paper.
Multi-source feature extraction and target recognition in wireless sensor networks based on adaptive distributed wavelet compression algorithms

NASA Astrophysics Data System (ADS)

Hortos, William S.

2008-04-01

Proposed distributed wavelet-based algorithms are a means to compress sensor data received at the nodes forming a wireless sensor network (WSN) by exchanging information between neighboring sensor nodes. Local collaboration among nodes compacts the measurements, yielding a reduced fused set with equivalent information at far fewer nodes. Nodes may be equipped with multiple sensor types, each capable of sensing distinct phenomena: thermal, humidity, chemical, voltage, or image signals with low or no frequency content as well as audio, seismic or video signals within defined frequency ranges. Compression of the multi-source data through wavelet-based methods, distributed at active nodes, reduces downstream processing and storage requirements along the paths to sink nodes; it also enables noise suppression and more energy-efficient query routing within the WSN. Targets are first detected by the multiple sensors; then wavelet compression and data fusion are applied to the target returns, followed by feature extraction from the reduced data; feature data are input to target recognition/classification routines; targets are tracked during their sojourns through the area monitored by the WSN. Algorithms to perform these tasks are implemented in a distributed manner, based on a partition of the WSN into clusters of nodes. In this work, a scheme of collaborative processing is applied for hierarchical data aggregation and decorrelation, based on the sensor data itself and any redundant information, enabled by a distributed, in-cluster wavelet transform with lifting that allows multiple levels of resolution. The wavelet-based compression algorithm significantly decreases RF bandwidth and other resource use in target processing tasks. Following wavelet compression, features are extracted. The objective of feature extraction is to maximize the probabilities of correct target classification based on multi-source sensor measurements, while minimizing the resource expenditures at participating nodes. Therefore, the feature-extraction method based on the Haar DWT is presented that employs a maximum-entropy measure to determine significant wavelet coefficients. Features are formed by calculating the energy of coefficients grouped around the competing clusters. A DWT-based feature extraction algorithm used for vehicle classification in WSNs can be enhanced by an added rule for selecting the optimal number of resolution levels to improve the correct classification rate and reduce energy consumption expended in local algorithm computations. Published field trial data for vehicular ground targets, measured with multiple sensor types, are used to evaluate the wavelet-assisted algorithms. Extracted features are used in established target recognition routines, e.g., the Bayesian minimum-error-rate classifier, to compare the effects on the classification performance of the wavelet compression. Simulations of feature sets and recognition routines at different resolution levels in target scenarios indicate the impact on classification rates, while formulas are provided to estimate reduction in resource use due to distributed compression.
Single-Word Recognition Need Not Depend on Single-Word Features: Narrative Coherence Counteracts Effects of Single-Word Features that Lexical Decision Emphasizes.

PubMed

Teng, Dan W; Wallot, Sebastian; Kelty-Stephen, Damian G

2016-12-01

Research on reading comprehension of connected text emphasizes reliance on single-word features that organize a stable, mental lexicon of words and that speed or slow the recognition of each new word. However, the time needed to recognize a word might not actually be as fixed as previous research indicates, and the stability of the mental lexicon may change with task demands. The present study explores the effects of narrative coherence in self-paced story reading to single-word feature effects in lexical decision. We presented single strings of letters to 24 participants, in both lexical decision and self-paced story reading. Both tasks included the same words composing a set of adjective-noun pairs. Reading times revealed that the tasks, and the order of the presentation of the tasks, changed and/or eliminated familiar effects of single-word features. Specifically, experiencing the lexical-decision task first gradually emphasized the role of single-word features, and experiencing the self-paced story-reading task afterwards counteracted the effect of single-word features. We discuss the implications that task-dependence and narrative coherence might have for the organization of the mental lexicon. Future work will need to consider what architectures suit the apparent flexibility with which task can accentuate or diminish effects of single-word features.
Weighted Feature Gaussian Kernel SVM for Emotion Recognition

PubMed Central

Jia, Qingxuan

2016-01-01

Emotion recognition with weighted feature based on facial expression is a challenging research topic and has attracted great attention in the past few years. This paper presents a novel method, utilizing subregion recognition rate to weight kernel function. First, we divide the facial expression image into some uniform subregions and calculate corresponding recognition rate and weight. Then, we get a weighted feature Gaussian kernel function and construct a classifier based on Support Vector Machine (SVM). At last, the experimental results suggest that the approach based on weighted feature Gaussian kernel function has good performance on the correct rate in emotion recognition. The experiments on the extended Cohn-Kanade (CK+) dataset show that our method has achieved encouraging recognition results compared to the state-of-the-art methods. PMID:27807443
Application of Pattern Recognition Techniques to the Classification of Full-Term and Preterm Infant Cry.

PubMed

Orlandi, Silvia; Reyes Garcia, Carlos Alberto; Bandini, Andrea; Donzelli, Gianpaolo; Manfredi, Claudia

2016-11-01

Scientific and clinical advances in perinatology and neonatology have enhanced the chances of survival of preterm and very low weight neonates. Infant cry analysis is a suitable noninvasive complementary tool to assess the neurologic state of infants particularly important in the case of preterm neonates. This article aims at exploiting differences between full-term and preterm infant cry with robust automatic acoustical analysis and data mining techniques. Twenty-two acoustical parameters are estimated in more than 3000 cry units from cry recordings of 28 full-term and 10 preterm newborns. Feature extraction is performed through the BioVoice dedicated software tool, developed at the Biomedical Engineering Lab, University of Firenze, Italy. Classification and pattern recognition is based on genetic algorithms for the selection of the best attributes. Training is performed comparing four classifiers: Logistic Curve, Multilayer Perceptron, Support Vector Machine, and Random Forest and three different testing options: full training set, 10-fold cross-validation, and 66% split. Results show that the best feature set is made up by 10 parameters capable to assess differences between preterm and full-term newborns with about 87% of accuracy. Best results are obtained with the Random Forest method (receiver operating characteristic area, 0.94). These 10 cry features might convey important additional information to assist the clinical specialist in the diagnosis and follow-up of possible delays or disorders in the neurologic development due to premature birth in this extremely vulnerable population of patients. The proposed approach is a first step toward an automatic infant cry recognition system for fast and proper identification of risk in preterm babies. Copyright Â© 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Learning and recognition of on-premise signs from weakly labeled street view images.

PubMed

Tsai, Tsung-Hung; Cheng, Wen-Huang; You, Chuang-Wen; Hu, Min-Chun; Tsui, Arvin Wen; Chi, Heng-Yu

2014-03-01

Camera-enabled mobile devices are commonly used as interaction platforms for linking the user's virtual and physical worlds in numerous research and commercial applications, such as serving an augmented reality interface for mobile information retrieval. The various application scenarios give rise to a key technique of daily life visual object recognition. On-premise signs (OPSs), a popular form of commercial advertising, are widely used in our living life. The OPSs often exhibit great visual diversity (e.g., appearing in arbitrary size), accompanied with complex environmental conditions (e.g., foreground and background clutter). Observing that such real-world characteristics are lacking in most of the existing image data sets, in this paper, we first proposed an OPS data set, namely OPS-62, in which totally 4649 OPS images of 62 different businesses are collected from Google's Street View. Further, for addressing the problem of real-world OPS learning and recognition, we developed a probabilistic framework based on the distributional clustering, in which we proposed to exploit the distributional information of each visual feature (the distribution of its associated OPS labels) as a reliable selection criterion for building discriminative OPS models. Experiments on the OPS-62 data set demonstrated the outperformance of our approach over the state-of-the-art probabilistic latent semantic analysis models for more accurate recognitions and less false alarms, with a significant 151.28% relative improvement in the average recognition rate. Meanwhile, our approach is simple, linear, and can be executed in a parallel fashion, making it practical and scalable for large-scale multimedia applications.
Morphological and wavelet features towards sonographic thyroid nodules evaluation.

PubMed

Tsantis, Stavros; Dimitropoulos, Nikos; Cavouras, Dionisis; Nikiforidis, George

2009-03-01

This paper presents a computer-based classification scheme that utilized various morphological and novel wavelet-based features towards malignancy risk evaluation of thyroid nodules in ultrasonography. The study comprised 85 ultrasound images-patients that were cytological confirmed (54 low-risk and 31 high-risk). A set of 20 features (12 based on nodules boundary shape and 8 based on wavelet local maxima located within each nodule) has been generated. Two powerful pattern recognition algorithms (support vector machines and probabilistic neural networks) have been designed and developed in order to quantify the power of differentiation of the introduced features. A comparative study has also been held, in order to estimate the impact speckle had onto the classification procedure. The diagnostic sensitivity and specificity of both classifiers was made by means of receiver operating characteristics (ROC) analysis. In the speckle-free feature set, the area under the ROC curve was 0.96 for the support vector machines classifier whereas for the probabilistic neural networks was 0.91. In the feature set with speckle, the corresponding areas under the ROC curves were 0.88 and 0.86 respectively for the two classifiers. The proposed features can increase the classification accuracy and decrease the rate of missing and misdiagnosis in thyroid cancer control.
Chinese character recognition based on Gabor feature extraction and CNN

NASA Astrophysics Data System (ADS)

Xiong, Yudian; Lu, Tongwei; Jiang, Yongyuan

2018-03-01

As an important application in the field of text line recognition and office automation, Chinese character recognition has become an important subject of pattern recognition. However, due to the large number of Chinese characters and the complexity of its structure, there is a great difficulty in the Chinese character recognition. In order to solve this problem, this paper proposes a method of printed Chinese character recognition based on Gabor feature extraction and Convolution Neural Network(CNN). The main steps are preprocessing, feature extraction, training classification. First, the gray-scale Chinese character image is binarized and normalized to reduce the redundancy of the image data. Second, each image is convoluted with Gabor filter with different orientations, and the feature map of the eight orientations of Chinese characters is extracted. Third, the feature map through Gabor filters and the original image are convoluted with learning kernels, and the results of the convolution is the input of pooling layer. Finally, the feature vector is used to classify and recognition. In addition, the generalization capacity of the network is improved by Dropout technology. The experimental results show that this method can effectively extract the characteristics of Chinese characters and recognize Chinese characters.
Target recognition of ladar range images using slice image: comparison of four improved algorithms

NASA Astrophysics Data System (ADS)

Xia, Wenze; Han, Shaokun; Cao, Jingya; Wang, Liang; Zhai, Yu; Cheng, Yang

2017-07-01

Compared with traditional 3-D shape data, ladar range images possess properties of strong noise, shape degeneracy, and sparsity, which make feature extraction and representation difficult. The slice image is an effective feature descriptor to resolve this problem. We propose four improved algorithms on target recognition of ladar range images using slice image. In order to improve resolution invariance of the slice image, mean value detection instead of maximum value detection is applied in these four improved algorithms. In order to improve rotation invariance of the slice image, three new improved feature descriptors-which are feature slice image, slice-Zernike moments, and slice-Fourier moments-are applied to the last three improved algorithms, respectively. Backpropagation neural networks are used as feature classifiers in the last two improved algorithms. The performance of these four improved recognition systems is analyzed comprehensively in the aspects of the three invariances, recognition rate, and execution time. The final experiment results show that the improvements for these four algorithms reach the desired effect, the three invariances of feature descriptors are not directly related to the final recognition performance of recognition systems, and these four improved recognition systems have different performances under different conditions.
Event Recognition for Contactless Activity Monitoring Using Phase-Modulated Continuous Wave Radar.

PubMed

Forouzanfar, Mohamad; Mabrouk, Mohamed; Rajan, Sreeraman; Bolic, Miodrag; Dajani, Hilmi R; Groza, Voicu Z

2017-02-01

The use of remote sensing technologies such as radar is gaining popularity as a technique for contactless detection of physiological signals and analysis of human motion. This paper presents a methodology for classifying different events in a collection of phase modulated continuous wave radar returns. The primary application of interest is to monitor inmates where the presence of human vital signs amidst different, interferences needs to be identified. A comprehensive set of features is derived through time and frequency domain analyses of the radar returns. The Bhattacharyya distance is used to preselect the features with highest class separability as the possible candidate features for use in the classification process. The uncorrelated linear discriminant analysis is performed to decorrelate, denoise, and reduce the dimension of the candidate feature set. Linear and quadratic Bayesian classifiers are designed to distinguish breathing, different human motions, and nonhuman motions. The performance of these classifiers is evaluated on a pilot dataset of radar returns that contained different events including breathing, stopped breathing, simple human motions, and movement of fan and water. Our proposed pattern classification system achieved accuracies of up to 93% in stationary subject detection, 90% in stop-breathing detection, and 86% in interference detection. Our proposed radar pattern recognition system was able to accurately distinguish the predefined events amidst interferences. Besides inmate monitoring and suicide attempt detection, this paper can be extended to other radar applications such as home-based monitoring of elderly people, apnea detection, and home occupancy detection.
An Improved Iris Recognition Algorithm Based on Hybrid Feature and ELM

NASA Astrophysics Data System (ADS)

Wang, Juan

2018-03-01

The iris image is easily polluted by noise and uneven light. This paper proposed an improved extreme learning machine (ELM) based iris recognition algorithm with hybrid feature. 2D-Gabor filters and GLCM is employed to generate a multi-granularity hybrid feature vector. 2D-Gabor filter and GLCM feature work for capturing low-intermediate frequency and high frequency texture information, respectively. Finally, we utilize extreme learning machine for iris recognition. Experimental results reveal our proposed ELM based multi-granularity iris recognition algorithm (ELM-MGIR) has higher accuracy of 99.86%, and lower EER of 0.12% under the premise of real-time performance. The proposed ELM-MGIR algorithm outperforms other mainstream iris recognition algorithms.
Enhancing of chemical compound and drug name recognition using representative tag scheme and fine-grained tokenization.

PubMed

Dai, Hong-Jie; Lai, Po-Ting; Chang, Yung-Chun; Tsai, Richard Tzong-Han

2015-01-01

The functions of chemical compounds and drugs that affect biological processes and their particular effect on the onset and treatment of diseases have attracted increasing interest with the advancement of research in the life sciences. To extract knowledge from the extensive literatures on such compounds and drugs, the organizers of BioCreative IV administered the CHEMical Compound and Drug Named Entity Recognition (CHEMDNER) task to establish a standard dataset for evaluating state-of-the-art chemical entity recognition methods. This study introduces the approach of our CHEMDNER system. Instead of emphasizing the development of novel feature sets for machine learning, this study investigates the effect of various tag schemes on the recognition of the names of chemicals and drugs by using conditional random fields. Experiments were conducted using combinations of different tokenization strategies and tag schemes to investigate the effects of tag set selection and tokenization method on the CHEMDNER task. This study presents the performance of CHEMDNER of three more representative tag schemes-IOBE, IOBES, and IOB12E-when applied to a widely utilized IOB tag set and combined with the coarse-/fine-grained tokenization methods. The experimental results thus reveal that the fine-grained tokenization strategy performance best in terms of precision, recall and F-scores when the IOBES tag set was utilized. The IOBES model with fine-grained tokenization yielded the best-F-scores in the six chemical entity categories other than the "Multiple" entity category. Nonetheless, no significant improvement was observed when a more representative tag schemes was used with the coarse or fine-grained tokenization rules. The best F-scores that were achieved using the developed system on the test dataset of the CHEMDNER task were 0.833 and 0.815 for the chemical documents indexing and the chemical entity mention recognition tasks, respectively. The results herein highlight the importance of tag set selection and the use of different tokenization strategies. Fine-grained tokenization combined with the tag set IOBES most effectively recognizes chemical and drug names. To the best of the authors' knowledge, this investigation is the first comprehensive investigation use of various tag set schemes combined with different tokenization strategies for the recognition of chemical entities.
Correlated Topic Vector for Scene Classification.

PubMed

Wei, Pengxu; Qin, Fei; Wan, Fang; Zhu, Yi; Jiao, Jianbin; Ye, Qixiang

2017-07-01

Scene images usually involve semantic correlations, particularly when considering large-scale image data sets. This paper proposes a novel generative image representation, correlated topic vector, to model such semantic correlations. Oriented from the correlated topic model, correlated topic vector intends to naturally utilize the correlations among topics, which are seldom considered in the conventional feature encoding, e.g., Fisher vector, but do exist in scene images. It is expected that the involvement of correlations can increase the discriminative capability of the learned generative model and consequently improve the recognition accuracy. Incorporated with the Fisher kernel method, correlated topic vector inherits the advantages of Fisher vector. The contributions to the topics of visual words have been further employed by incorporating the Fisher kernel framework to indicate the differences among scenes. Combined with the deep convolutional neural network (CNN) features and Gibbs sampling solution, correlated topic vector shows great potential when processing large-scale and complex scene image data sets. Experiments on two scene image data sets demonstrate that correlated topic vector improves significantly the deep CNN features, and outperforms existing Fisher kernel-based features.
Vehicle logo recognition using multi-level fusion model

NASA Astrophysics Data System (ADS)

Ming, Wei; Xiao, Jianli

2018-04-01

Vehicle logo recognition plays an important role in manufacturer identification and vehicle recognition. This paper proposes a new vehicle logo recognition algorithm. It has a hierarchical framework, which consists of two fusion levels. At the first level, a feature fusion model is employed to map the original features to a higher dimension feature space. In this space, the vehicle logos become more recognizable. At the second level, a weighted voting strategy is proposed to promote the accuracy and the robustness of the recognition results. To evaluate the performance of the proposed algorithm, extensive experiments are performed, which demonstrate that the proposed algorithm can achieve high recognition accuracy and work robustly.
An analysis of the influence of deep neural network (DNN) topology in bottleneck feature based language recognition.

PubMed

Lozano-Diez, Alicia; Zazo, Ruben; Toledano, Doroteo T; Gonzalez-Rodriguez, Joaquin

2017-01-01

Language recognition systems based on bottleneck features have recently become the state-of-the-art in this research field, showing its success in the last Language Recognition Evaluation (LRE 2015) organized by NIST (U.S. National Institute of Standards and Technology). This type of system is based on a deep neural network (DNN) trained to discriminate between phonetic units, i.e. trained for the task of automatic speech recognition (ASR). This DNN aims to compress information in one of its layers, known as bottleneck (BN) layer, which is used to obtain a new frame representation of the audio signal. This representation has been proven to be useful for the task of language identification (LID). Thus, bottleneck features are used as input to the language recognition system, instead of a classical parameterization of the signal based on cepstral feature vectors such as MFCCs (Mel Frequency Cepstral Coefficients). Despite the success of this approach in language recognition, there is a lack of studies analyzing in a systematic way how the topology of the DNN influences the performance of bottleneck feature-based language recognition systems. In this work, we try to fill-in this gap, analyzing language recognition results with different topologies for the DNN used to extract the bottleneck features, comparing them and against a reference system based on a more classical cepstral representation of the input signal with a total variability model. This way, we obtain useful knowledge about how the DNN configuration influences bottleneck feature-based language recognition systems performance.
Identifying significant environmental features using feature recognition.

DOT National Transportation Integrated Search

2015-10-01

The Department of Environmental Analysis at the Kentucky Transportation Cabinet has expressed an interest in feature-recognition capability because it may help analysts identify environmentally sensitive features in the landscape, : including those r...
Artificially intelligent recognition of Arabic speaker using voice print-based local features

NASA Astrophysics Data System (ADS)

Mahmood, Awais; Alsulaiman, Mansour; Muhammad, Ghulam; Akram, Sheeraz

2016-11-01

Local features for any pattern recognition system are based on the information extracted locally. In this paper, a local feature extraction technique was developed. This feature was extracted in the time-frequency plain by taking the moving average on the diagonal directions of the time-frequency plane. This feature captured the time-frequency events producing a unique pattern for each speaker that can be viewed as a voice print of the speaker. Hence, we referred to this technique as voice print-based local feature. The proposed feature was compared to other features including mel-frequency cepstral coefficient (MFCC) for speaker recognition using two different databases. One of the databases used in the comparison is a subset of an LDC database that consisted of two short sentences uttered by 182 speakers. The proposed feature attained 98.35% recognition rate compared to 96.7% for MFCC using the LDC subset.
Combining heterogenous features for 3D hand-held object recognition

NASA Astrophysics Data System (ADS)

Lv, Xiong; Wang, Shuang; Li, Xiangyang; Jiang, Shuqiang

2014-10-01

Object recognition has wide applications in the area of human-machine interaction and multimedia retrieval. However, due to the problem of visual polysemous and concept polymorphism, it is still a great challenge to obtain reliable recognition result for the 2D images. Recently, with the emergence and easy availability of RGB-D equipment such as Kinect, this challenge could be relieved because the depth channel could bring more information. A very special and important case of object recognition is hand-held object recognition, as hand is a straight and natural way for both human-human interaction and human-machine interaction. In this paper, we study the problem of 3D object recognition by combining heterogenous features with different modalities and extraction techniques. For hand-craft feature, although it reserves the low-level information such as shape and color, it has shown weakness in representing hiconvolutionalgh-level semantic information compared with the automatic learned feature, especially deep feature. Deep feature has shown its great advantages in large scale dataset recognition but is not always robust to rotation or scale variance compared with hand-craft feature. In this paper, we propose a method to combine hand-craft point cloud features and deep learned features in RGB and depth channle. First, hand-held object segmentation is implemented by using depth cues and human skeleton information. Second, we combine the extracted hetegerogenous 3D features in different stages using linear concatenation and multiple kernel learning (MKL). Then a training model is used to recognize 3D handheld objects. Experimental results validate the effectiveness and gerneralization ability of the proposed method.
Intelligibility Evaluation of Pathological Speech through Multigranularity Feature Extraction and Optimization.

PubMed

Fang, Chunying; Li, Haifeng; Ma, Lin; Zhang, Mancai

2017-01-01

Pathological speech usually refers to speech distortion resulting from illness or other biological insults. The assessment of pathological speech plays an important role in assisting the experts, while automatic evaluation of speech intelligibility is difficult because it is usually nonstationary and mutational. In this paper, we carry out an independent innovation of feature extraction and reduction, and we describe a multigranularity combined feature scheme which is optimized by the hierarchical visual method. A novel method of generating feature set based on S -transform and chaotic analysis is proposed. There are BAFS (430, basic acoustics feature), local spectral characteristics MSCC (84, Mel S -transform cepstrum coefficients), and chaotic features (12). Finally, radar chart and F -score are proposed to optimize the features by the hierarchical visual fusion. The feature set could be optimized from 526 to 96 dimensions based on NKI-CCRT corpus and 104 dimensions based on SVD corpus. The experimental results denote that new features by support vector machine (SVM) have the best performance, with a recognition rate of 84.4% on NKI-CCRT corpus and 78.7% on SVD corpus. The proposed method is thus approved to be effective and reliable for pathological speech intelligibility evaluation.

Image Recognition and Feature Detection in Solar Physics

NASA Astrophysics Data System (ADS)

Martens, Petrus C.

2012-05-01

The Solar Dynamics Observatory (SDO) data repository will dwarf the archives of all previous solar physics missions put together. NASA recognized early on that the traditional methods of analyzing the data -- solar scientists and grad students in particular analyzing the images by hand -- would simply not work and tasked our Feature Finding Team (FFT) with developing automated feature recognition modules for solar events and phenomena likely to be observed by SDO. Having these metadata available on-line will enable solar scientist to conduct statistical studies involving large sets of events that would be impossible now with traditional means. We have followed a two-track approach in our project: we have been developing some existing task-specific solar feature finding modules to be "pipe-line" ready for the stream of SDO data, plus we are designing a few new modules. Secondly, we took it upon us to develop an entirely new "trainable" module that would be capable of identifying different types of solar phenomena starting from a limited number of user-provided examples. Both approaches are now reaching fruition, and I will show examples and movies with results from several of our feature finding modules. In the second part of my presentation I will focus on our “trainable” module, which is the most innovative in character. First, there is the strong similarity between solar and medical X-ray images with regard to their texture, which has allowed us to apply some advances made in medical image recognition. Second, we have found that there is a strong similarity between the way our trainable module works and the way our brain recognizes images. The brain can quickly recognize similar images from key characteristics, just as our code does. We conclude from that that our approach represents the beginning of a more human-like procedure for computer image recognition.
Automated feature detection and identification in digital point-ordered signals

DOEpatents

Oppenlander, Jane E.; Loomis, Kent C.; Brudnoy, David M.; Levy, Arthur J.

1998-01-01

A computer-based automated method to detect and identify features in digital point-ordered signals. The method is used for processing of non-destructive test signals, such as eddy current signals obtained from calibration standards. The signals are first automatically processed to remove noise and to determine a baseline. Next, features are detected in the signals using mathematical morphology filters. Finally, verification of the features is made using an expert system of pattern recognition methods and geometric criteria. The method has the advantage that standard features can be, located without prior knowledge of the number or sequence of the features. Further advantages are that standard features can be differentiated from irrelevant signal features such as noise, and detected features are automatically verified by parameters extracted from the signals. The method proceeds fully automatically without initial operator set-up and without subjective operator feature judgement.
Distorted Character Recognition Via An Associative Neural Network

NASA Astrophysics Data System (ADS)

Messner, Richard A.; Szu, Harold H.

1987-03-01

The purpose of this paper is two-fold. First, it is intended to provide some preliminary results of a character recognition scheme which has foundations in on-going neural network architecture modeling, and secondly, to apply some of the neural network results in a real application area where thirty years of effort has had little effect on providing the machine an ability to recognize distorted objects within the same object class. It is the author's belief that the time is ripe to start applying in ernest the results of over twenty years of effort in neural modeling to some of the more difficult problems which seem so hard to solve by conventional means. The character recognition scheme proposed utilizes a preprocessing stage which performs a 2-dimensional Walsh transform of an input cartesian image field, then sequency filters this spectrum into three feature bands. Various features are then extracted and organized into three sets of feature vectors. These vector patterns that are stored and recalled associatively. Two possible associative neural memory models are proposed for further investigation. The first being an outer-product linear matrix associative memory with a threshold function controlling the strength of the output pattern (similar to Kohonen's crosscorrelation approach [1]). The second approach is based upon a modified version of Grossberg's neural architecture [2] which provides better self-organizing properties due to its adaptive nature. Preliminary results of the sequency filtering and feature extraction preprocessing stage and discussion about the use of the proposed neural architectures is included.
Human activity recognition based on feature selection in smart home using back-propagation algorithm.

PubMed

Fang, Hongqing; He, Lei; Si, Hao; Liu, Peng; Xie, Xiaolei

2014-09-01

In this paper, Back-propagation(BP) algorithm has been used to train the feed forward neural network for human activity recognition in smart home environments, and inter-class distance method for feature selection of observed motion sensor events is discussed and tested. And then, the human activity recognition performances of neural network using BP algorithm have been evaluated and compared with other probabilistic algorithms: Naïve Bayes(NB) classifier and Hidden Markov Model(HMM). The results show that different feature datasets yield different activity recognition accuracy. The selection of unsuitable feature datasets increases the computational complexity and degrades the activity recognition accuracy. Furthermore, neural network using BP algorithm has relatively better human activity recognition performances than NB classifier and HMM. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.
Four-Channel Biosignal Analysis and Feature Extraction for Automatic Emotion Recognition

NASA Astrophysics Data System (ADS)

Kim, Jonghwa; André, Elisabeth

This paper investigates the potential of physiological signals as a reliable channel for automatic recognition of user's emotial state. For the emotion recognition, little attention has been paid so far to physiological signals compared to audio-visual emotion channels such as facial expression or speech. All essential stages of automatic recognition system using biosignals are discussed, from recording physiological dataset up to feature-based multiclass classification. Four-channel biosensors are used to measure electromyogram, electrocardiogram, skin conductivity and respiration changes. A wide range of physiological features from various analysis domains, including time/frequency, entropy, geometric analysis, subband spectra, multiscale entropy, etc., is proposed in order to search the best emotion-relevant features and to correlate them with emotional states. The best features extracted are specified in detail and their effectiveness is proven by emotion recognition results.
Score-Level Fusion of Phase-Based and Feature-Based Fingerprint Matching Algorithms

NASA Astrophysics Data System (ADS)

Ito, Koichi; Morita, Ayumi; Aoki, Takafumi; Nakajima, Hiroshi; Kobayashi, Koji; Higuchi, Tatsuo

This paper proposes an efficient fingerprint recognition algorithm combining phase-based image matching and feature-based matching. In our previous work, we have already proposed an efficient fingerprint recognition algorithm using Phase-Only Correlation (POC), and developed commercial fingerprint verification units for access control applications. The use of Fourier phase information of fingerprint images makes it possible to achieve robust recognition for weakly impressed, low-quality fingerprint images. This paper presents an idea of improving the performance of POC-based fingerprint matching by combining it with feature-based matching, where feature-based matching is introduced in order to improve recognition efficiency for images with nonlinear distortion. Experimental evaluation using two different types of fingerprint image databases demonstrates efficient recognition performance of the combination of the POC-based algorithm and the feature-based algorithm.
An enhanced feature set for pattern recognition based contrast enhancement of contact-less captured latent fingerprints in digitized crime scene forensics

NASA Astrophysics Data System (ADS)

Hildebrandt, Mario; Kiltz, Stefan; Dittmann, Jana; Vielhauer, Claus

2014-02-01

In crime scene forensics latent fingerprints are found on various substrates. Nowadays primarily physical or chemical preprocessing techniques are applied for enhancing the visibility of the fingerprint trace. In order to avoid altering the trace it has been shown that contact-less sensors offer a non-destructive acquisition approach. Here, the exploitation of fingerprint or substrate properties and the utilization of signal processing techniques are an essential requirement to enhance the fingerprint visibility. However, especially the optimal sensory is often substrate-dependent. An enhanced generic pattern recognition based contrast enhancement approach for scans of a chromatic white light sensor is introduced in Hildebrandt et al.1 using statistical, structural and Benford's law2 features for blocks of 50 micron. This approach achieves very good results for latent fingerprints on cooperative, non-textured, smooth substrates. However, on textured and structured substrates the error rates are very high and the approach thus unsuitable for forensic use cases. We propose the extension of the feature set with semantic features derived from known Gabor filter based exemplar fingerprint enhancement techniques by suggesting an Epsilon-neighborhood of each block in order to achieve an improved accuracy (called fingerprint ridge orientation semantics). Furthermore, we use rotation invariant Hu moments as an extension of the structural features and two additional preprocessing methods (separate X- and Y Sobel operators). This results in a 408-dimensional feature space. In our experiments we investigate and report the recognition accuracy for eight substrates, each with ten latent fingerprints: white furniture surface, veneered plywood, brushed stainless steel, aluminum foil, "Golden-Oak" veneer, non-metallic matte car body finish, metallic car body finish and blued metal. In comparison to Hildebrandt et al.,1 our evaluation shows a significant reduction of the error rates by 15.8 percent points on brushed stainless steel using the same classifier. This also allows for a successful biometric matching of 3 of the 8 latent fingerprint samples with the corresponding exemplar fingerprint on this particular substrate. For contrast enhancement analysis of classification results we suggest to use known Visual Quality Indexes (VQI)3 as a contrast enhancement quality indicator and discuss our first preliminary results using the exemplary chosen VQI Edge Similarity Score (ESS),4 showing a tendency that higher image differences between a substrate containing a fingerprint and a substrate with a blank surface correlate with a higher recognition accuracy between a latent fingerprint and an exemplar fingerprint. Those first preliminary results support further research into VQIs as contrast enhancement quality indicator for a given feature space.
Fine grained recognition of masonry walls for built heritage assessment

NASA Astrophysics Data System (ADS)

Oses, N.; Dornaika, F.; Moujahid, A.

2015-01-01

This paper presents the ground work carried out to achieve automatic fine grained recognition of stone masonry. This is a necessary first step in the development of the analysis tool. The built heritage that will be assessed consists of stone masonry constructions and many of the features analysed can be characterized according to the geometry and arrangement of the stones. Much of the assessment is carried out through visual inspection. Thus, we apply image processing on digital images of the elements under inspection. The main contribution of the paper is the performance evaluation of the automatic categorization of masonry walls from a set of extracted straight line segments. The element chosen to perform this evaluation is the stone arrangement of masonry walls. The validity of the proposed framework is assessed on real images of masonry walls using machine learning paradigms. These include classifiers as well as automatic feature selection.
Automated Detection of Diabetic Retinopathy using Deep Learning.

PubMed

Lam, Carson; Yi, Darvin; Guo, Margaret; Lindsey, Tony

2018-01-01

Diabetic retinopathy is a leading cause of blindness among working-age adults. Early detection of this condition is critical for good prognosis. In this paper, we demonstrate the use of convolutional neural networks (CNNs) on color fundus images for the recognition task of diabetic retinopathy staging. Our network models achieved test metric performance comparable to baseline literature results, with validation sensitivity of 95%. We additionally explored multinomial classification models, and demonstrate that errors primarily occur in the misclassification of mild disease as normal due to the CNNs inability to detect subtle disease features. We discovered that preprocessing with contrast limited adaptive histogram equalization and ensuring dataset fidelity by expert verification of class labels improves recognition of subtle features. Transfer learning on pretrained GoogLeNet and AlexNet models from ImageNet improved peak test set accuracies to 74.5%, 68.8%, and 57.2% on 2-ary, 3-ary, and 4-ary classification models, respectively.
Target recognition based on convolutional neural network

NASA Astrophysics Data System (ADS)

Wang, Liqiang; Wang, Xin; Xi, Fubiao; Dong, Jian

2017-11-01

One of the important part of object target recognition is the feature extraction, which can be classified into feature extraction and automatic feature extraction. The traditional neural network is one of the automatic feature extraction methods, while it causes high possibility of over-fitting due to the global connection. The deep learning algorithm used in this paper is a hierarchical automatic feature extraction method, trained with the layer-by-layer convolutional neural network (CNN), which can extract the features from lower layers to higher layers. The features are more discriminative and it is beneficial to the object target recognition.
Human red blood cell recognition enhancement with three-dimensional morphological features obtained by digital holographic imaging

NASA Astrophysics Data System (ADS)

Jaferzadeh, Keyvan; Moon, Inkyu

2016-12-01

The classification of erythrocytes plays an important role in the field of hematological diagnosis, specifically blood disorders. Since the biconcave shape of red blood cell (RBC) is altered during the different stages of hematological disorders, we believe that the three-dimensional (3-D) morphological features of erythrocyte provide better classification results than conventional two-dimensional (2-D) features. Therefore, we introduce a set of 3-D features related to the morphological and chemical properties of RBC profile and try to evaluate the discrimination power of these features against 2-D features with a neural network classifier. The 3-D features include erythrocyte surface area, volume, average cell thickness, sphericity index, sphericity coefficient and functionality factor, MCH and MCHSD, and two newly introduced features extracted from the ring section of RBC at the single-cell level. In contrast, the 2-D features are RBC projected surface area, perimeter, radius, elongation, and projected surface area to perimeter ratio. All features are obtained from images visualized by off-axis digital holographic microscopy with a numerical reconstruction algorithm, and four categories of biconcave (doughnut shape), flat-disc, stomatocyte, and echinospherocyte RBCs are interested. Our experimental results demonstrate that the 3-D features can be more useful in RBC classification than the 2-D features. Finally, we choose the best feature set of the 2-D and 3-D features by sequential forward feature selection technique, which yields better discrimination results. We believe that the final feature set evaluated with a neural network classification strategy can improve the RBC classification accuracy.
Face recognition using slow feature analysis and contourlet transform

NASA Astrophysics Data System (ADS)

Wang, Yuehao; Peng, Lingling; Zhe, Fuchuan

2018-04-01

In this paper we propose a novel face recognition approach based on slow feature analysis (SFA) in contourlet transform domain. This method firstly use contourlet transform to decompose the face image into low frequency and high frequency part, and then takes technological advantages of slow feature analysis for facial feature extraction. We named the new method combining the slow feature analysis and contourlet transform as CT-SFA. The experimental results on international standard face database demonstrate that the new face recognition method is effective and competitive.
Saliency image of feature building for image quality assessment

NASA Astrophysics Data System (ADS)

Ju, Xinuo; Sun, Jiyin; Wang, Peng

2011-11-01

The purpose and method of image quality assessment are quite different for automatic target recognition (ATR) and traditional application. Local invariant feature detectors, mainly including corner detectors, blob detectors and region detectors etc., are widely applied for ATR. A saliency model of feature was proposed to evaluate feasibility of ATR in this paper. The first step consisted of computing the first-order derivatives on horizontal orientation and vertical orientation, and computing DoG maps in different scales respectively. Next, saliency images of feature were built based auto-correlation matrix in different scale. Then, saliency images of feature of different scales amalgamated. Experiment were performed on a large test set, including infrared images and optical images, and the result showed that the salient regions computed by this model were consistent with real feature regions computed by mostly local invariant feature extraction algorithms.
Application of quantum-behaved particle swarm optimization to motor imagery EEG classification.

PubMed

Hsu, Wei-Yen

2013-12-01

In this study, we propose a recognition system for single-trial analysis of motor imagery (MI) electroencephalogram (EEG) data. Applying event-related brain potential (ERP) data acquired from the sensorimotor cortices, the system chiefly consists of automatic artifact elimination, feature extraction, feature selection and classification. In addition to the use of independent component analysis, a similarity measure is proposed to further remove the electrooculographic (EOG) artifacts automatically. Several potential features, such as wavelet-fractal features, are then extracted for subsequent classification. Next, quantum-behaved particle swarm optimization (QPSO) is used to select features from the feature combination. Finally, selected sub-features are classified by support vector machine (SVM). Compared with without artifact elimination, feature selection using a genetic algorithm (GA) and feature classification with Fisher's linear discriminant (FLD) on MI data from two data sets for eight subjects, the results indicate that the proposed method is promising in brain-computer interface (BCI) applications.
Pattern recognition tool based on complex network-based approach

NASA Astrophysics Data System (ADS)

Casanova, Dalcimar; Backes, André Ricardo; Martinez Bruno, Odemir

2013-02-01

This work proposed a generalization of the method proposed by the authors: 'A complex network-based approach for boundary shape analysis'. Instead of modelling a contour into a graph and use complex networks rules to characterize it, here, we generalize the technique. This way, the work proposes a mathematical tool for characterization signals, curves and set of points. To evaluate the pattern description power of the proposal, an experiment of plat identification based on leaf veins image are conducted. Leaf vein is a taxon characteristic used to plant identification proposes, and one of its characteristics is that these structures are complex, and difficult to be represented as a signal or curves and this way to be analyzed in a classical pattern recognition approach. Here, we model the veins as a set of points and model as graphs. As features, we use the degree and joint degree measurements in a dynamic evolution. The results demonstrates that the technique has a good power of discrimination and can be used for plant identification, as well as other complex pattern recognition tasks.
Classifier Subset Selection for the Stacked Generalization Method Applied to Emotion Recognition in Speech

PubMed Central

Álvarez, Aitor; Sierra, Basilio; Arruti, Andoni; López-Gil, Juan-Miguel; Garay-Vitoria, Nestor

2015-01-01

In this paper, a new supervised classification paradigm, called classifier subset selection for stacked generalization (CSS stacking), is presented to deal with speech emotion recognition. The new approach consists of an improvement of a bi-level multi-classifier system known as stacking generalization by means of an integration of an estimation of distribution algorithm (EDA) in the first layer to select the optimal subset from the standard base classifiers. The good performance of the proposed new paradigm was demonstrated over different configurations and datasets. First, several CSS stacking classifiers were constructed on the RekEmozio dataset, using some specific standard base classifiers and a total of 123 spectral, quality and prosodic features computed using in-house feature extraction algorithms. These initial CSS stacking classifiers were compared to other multi-classifier systems and the employed standard classifiers built on the same set of speech features. Then, new CSS stacking classifiers were built on RekEmozio using a different set of both acoustic parameters (extended version of the Geneva Minimalistic Acoustic Parameter Set (eGeMAPS)) and standard classifiers and employing the best meta-classifier of the initial experiments. The performance of these two CSS stacking classifiers was evaluated and compared. Finally, the new paradigm was tested on the well-known Berlin Emotional Speech database. We compared the performance of single, standard stacking and CSS stacking systems using the same parametrization of the second phase. All of the classifications were performed at the categorical level, including the six primary emotions plus the neutral one. PMID:26712757
NATIONAL PREPAREDNESS: Technologies to Secure Federal Buildings

DTIC Science & Technology

2002-04-25

Medium, some resistance based on sensitivity of eye Facial recognition Facial features are captured and compared Dependent on lighting, positioning...two primary types of facial recognition technology used to create templates: 1. Local feature analysis—Dozens of images from regions of the face are...an adjacent feature. Attachment I—Access Control Technologies: Biometrics Facial Recognition How the technology works
Featuring Old/New Recognition: The Two Faces of the Pseudoword Effect

ERIC Educational Resources Information Center

Joordens, Steve; Ozubko, Jason D.; Niewiadomski, Marty W.

2008-01-01

In his analysis of the pseudoword effect, [Greene, R.L. (2004). Recognition memory for pseudowords. "Journal of Memory and Language," 50, 259-267.] suggests nonwords can feel more familiar that words in a recognition context if the orthographic features of the nonword match well with the features of the items presented at study. One possible…
Computer vision

NASA Technical Reports Server (NTRS)

Gennery, D.; Cunningham, R.; Saund, E.; High, J.; Ruoff, C.

1981-01-01

The field of computer vision is surveyed and assessed, key research issues are identified, and possibilities for a future vision system are discussed. The problems of descriptions of two and three dimensional worlds are discussed. The representation of such features as texture, edges, curves, and corners are detailed. Recognition methods are described in which cross correlation coefficients are maximized or numerical values for a set of features are measured. Object tracking is discussed in terms of the robust matching algorithms that must be devised. Stereo vision, camera control and calibration, and the hardware and systems architecture are discussed.
Optimal Geometrical Set for Automated Marker Placement to Virtualized Real-Time Facial Emotions

PubMed Central

Maruthapillai, Vasanthan; Murugappan, Murugappan

2016-01-01

In recent years, real-time face recognition has been a major topic of interest in developing intelligent human-machine interaction systems. Over the past several decades, researchers have proposed different algorithms for facial expression recognition, but there has been little focus on detection in real-time scenarios. The present work proposes a new algorithmic method of automated marker placement used to classify six facial expressions: happiness, sadness, anger, fear, disgust, and surprise. Emotional facial expressions were captured using a webcam, while the proposed algorithm placed a set of eight virtual markers on each subject’s face. Facial feature extraction methods, including marker distance (distance between each marker to the center of the face) and change in marker distance (change in distance between the original and new marker positions), were used to extract three statistical features (mean, variance, and root mean square) from the real-time video sequence. The initial position of each marker was subjected to the optical flow algorithm for marker tracking with each emotional facial expression. Finally, the extracted statistical features were mapped into corresponding emotional facial expressions using two simple non-linear classifiers, K-nearest neighbor and probabilistic neural network. The results indicate that the proposed automated marker placement algorithm effectively placed eight virtual markers on each subject’s face and gave a maximum mean emotion classification rate of 96.94% using the probabilistic neural network. PMID:26859884

Optimal Geometrical Set for Automated Marker Placement to Virtualized Real-Time Facial Emotions.

PubMed

Maruthapillai, Vasanthan; Murugappan, Murugappan

2016-01-01

In recent years, real-time face recognition has been a major topic of interest in developing intelligent human-machine interaction systems. Over the past several decades, researchers have proposed different algorithms for facial expression recognition, but there has been little focus on detection in real-time scenarios. The present work proposes a new algorithmic method of automated marker placement used to classify six facial expressions: happiness, sadness, anger, fear, disgust, and surprise. Emotional facial expressions were captured using a webcam, while the proposed algorithm placed a set of eight virtual markers on each subject's face. Facial feature extraction methods, including marker distance (distance between each marker to the center of the face) and change in marker distance (change in distance between the original and new marker positions), were used to extract three statistical features (mean, variance, and root mean square) from the real-time video sequence. The initial position of each marker was subjected to the optical flow algorithm for marker tracking with each emotional facial expression. Finally, the extracted statistical features were mapped into corresponding emotional facial expressions using two simple non-linear classifiers, K-nearest neighbor and probabilistic neural network. The results indicate that the proposed automated marker placement algorithm effectively placed eight virtual markers on each subject's face and gave a maximum mean emotion classification rate of 96.94% using the probabilistic neural network.
Use of the recognition heuristic depends on the domain's recognition validity, not on the recognition validity of selected sets of objects.

PubMed

Pohl, Rüdiger F; Michalkiewicz, Martha; Erdfelder, Edgar; Hilbig, Benjamin E

2017-07-01

According to the recognition-heuristic theory, decision makers solve paired comparisons in which one object is recognized and the other not by recognition alone, inferring that recognized objects have higher criterion values than unrecognized ones. However, success-and thus usefulness-of this heuristic depends on the validity of recognition as a cue, and adaptive decision making, in turn, requires that decision makers are sensitive to it. To this end, decision makers could base their evaluation of the recognition validity either on the selected set of objects (the set's recognition validity), or on the underlying domain from which the objects were drawn (the domain's recognition validity). In two experiments, we manipulated the recognition validity both in the selected set of objects and between domains from which the sets were drawn. The results clearly show that use of the recognition heuristic depends on the domain's recognition validity, not on the set's recognition validity. In other words, participants treat all sets as roughly representative of the underlying domain and adjust their decision strategy adaptively (only) with respect to the more general environment rather than the specific items they are faced with.
Permutation coding technique for image recognition systems.

PubMed

Kussul, Ernst M; Baidyk, Tatiana N; Wunsch, Donald C; Makeyev, Oleksandr; Martín, Anabel

2006-11-01

A feature extractor and neural classifier for image recognition systems are proposed. The proposed feature extractor is based on the concept of random local descriptors (RLDs). It is followed by the encoder that is based on the permutation coding technique that allows to take into account not only detected features but also the position of each feature on the image and to make the recognition process invariant to small displacements. The combination of RLDs and permutation coding permits us to obtain a sufficiently general description of the image to be recognized. The code generated by the encoder is used as an input data for the neural classifier. Different types of images were used to test the proposed image recognition system. It was tested in the handwritten digit recognition problem, the face recognition problem, and the microobject shape recognition problem. The results of testing are very promising. The error rate for the Modified National Institute of Standards and Technology (MNIST) database is 0.44% and for the Olivetti Research Laboratory (ORL) database it is 0.1%.
Kruskal-Wallis-based computationally efficient feature selection for face recognition.

PubMed

Ali Khan, Sajid; Hussain, Ayyaz; Basit, Abdul; Akram, Sheeraz

2014-01-01

Face recognition in today's technological world, and face recognition applications attain much more importance. Most of the existing work used frontal face images to classify face image. However these techniques fail when applied on real world face images. The proposed technique effectively extracts the prominent facial features. Most of the features are redundant and do not contribute to representing face. In order to eliminate those redundant features, computationally efficient algorithm is used to select the more discriminative face features. Extracted features are then passed to classification step. In the classification step, different classifiers are ensemble to enhance the recognition accuracy rate as single classifier is unable to achieve the high accuracy. Experiments are performed on standard face database images and results are compared with existing techniques.
Score level fusion scheme based on adaptive local Gabor features for face-iris-fingerprint multimodal biometric

NASA Astrophysics Data System (ADS)

He, Fei; Liu, Yuanning; Zhu, Xiaodong; Huang, Chun; Han, Ye; Chen, Ying

2014-05-01

A multimodal biometric system has been considered a promising technique to overcome the defects of unimodal biometric systems. We have introduced a fusion scheme to gain a better understanding and fusion method for a face-iris-fingerprint multimodal biometric system. In our case, we use particle swarm optimization to train a set of adaptive Gabor filters in order to achieve the proper Gabor basic functions for each modality. For a closer analysis of texture information, two different local Gabor features for each modality are produced by the corresponding Gabor coefficients. Next, all matching scores of the two Gabor features for each modality are projected to a single-scalar score via a trained, supported, vector regression model for a final decision. A large-scale dataset is formed to validate the proposed scheme using the Facial Recognition Technology database-fafb and CASIA-V3-Interval together with FVC2004-DB2a datasets. The experimental results demonstrate that as well as achieving further powerful local Gabor features of multimodalities and obtaining better recognition performance by their fusion strategy, our architecture also outperforms some state-of-the-art individual methods and other fusion approaches for face-iris-fingerprint multimodal biometric systems.
Identity Recognition Algorithm Using Improved Gabor Feature Selection of Gait Energy Image

NASA Astrophysics Data System (ADS)

Chao, LIANG; Ling-yao, JIA; Dong-cheng, SHI

2017-01-01

This paper describes an effective gait recognition approach based on Gabor features of gait energy image. In this paper, the kernel Fisher analysis combined with kernel matrix is proposed to select dominant features. The nearest neighbor classifier based on whitened cosine distance is used to discriminate different gait patterns. The approach proposed is tested on the CASIA and USF gait databases. The results show that our approach outperforms other state of gait recognition approaches in terms of recognition accuracy and robustness.
A fast 3-D object recognition algorithm for the vision system of a special-purpose dexterous manipulator

NASA Technical Reports Server (NTRS)

Hung, Stephen H. Y.

1989-01-01

A fast 3-D object recognition algorithm that can be used as a quick-look subsystem to the vision system for the Special-Purpose Dexterous Manipulator (SPDM) is described. Global features that can be easily computed from range data are used to characterize the images of a viewer-centered model of an object. This algorithm will speed up the processing by eliminating the low level processing whenever possible. It may identify the object, reject a set of bad data in the early stage, or create a better environment for a more powerful algorithm to carry the work further.
SIR-A imagery in geologic studies of the Sierra Madre Oriental, northeastern Mexico. Part 1 (Regional stratigraphy): The use of morphostratigraphic units in remote sensing mapping

NASA Technical Reports Server (NTRS)

Longoria, J. F.; Jimenez, O. H.

1985-01-01

SIR-A imaging was used in geological studies of sedimentary terrains in the Sierra Madre Oriental, northeastern Mexico. Geological features such as regional strike and dip, bedding, folding and faulting were readily detected on the image. The recognition of morphostructural units in the imagery, coupled with field verification, enabled geological mapping of the region at the scale of 1:250 000. Structural profiling lead to the elaboration of a morphostructural map allowing the recognition of an echelon folds and field trends which were used to postulate the ectonic setting of the region.
Speaker emotion recognition: from classical classifiers to deep neural networks

NASA Astrophysics Data System (ADS)

Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri

2018-04-01

Speaker emotion recognition is considered among the most challenging tasks in recent years. In fact, automatic systems for security, medicine or education can be improved when considering the speech affective state. In this paper, a twofold approach for speech emotion classification is proposed. At the first side, a relevant set of features is adopted, and then at the second one, numerous supervised training techniques, involving classic methods as well as deep learning, are experimented. Experimental results indicate that deep architecture can improve classification performance on two affective databases, the Berlin Dataset of Emotional Speech and the SAVEE Dataset Surrey Audio-Visual Expressed Emotion.
Method and System for Object Recognition Search

NASA Technical Reports Server (NTRS)

Duong, Tuan A. (Inventor); Duong, Vu A. (Inventor); Stubberud, Allen R. (Inventor)

2012-01-01

A method for object recognition using shape and color features of the object to be recognized. An adaptive architecture is used to recognize and adapt the shape and color features for moving objects to enable object recognition.
Developing a benchmark for emotional analysis of music

PubMed Central

Yang, Yi-Hsuan; Soleymani, Mohammad

2017-01-01

Music emotion recognition (MER) field rapidly expanded in the last decade. Many new methods and new audio features are developed to improve the performance of MER algorithms. However, it is very difficult to compare the performance of the new methods because of the data representation diversity and scarcity of publicly available data. In this paper, we address these problems by creating a data set and a benchmark for MER. The data set that we release, a MediaEval Database for Emotional Analysis in Music (DEAM), is the largest available data set of dynamic annotations (valence and arousal annotations for 1,802 songs and song excerpts licensed under Creative Commons with 2Hz time resolution). Using DEAM, we organized the ‘Emotion in Music’ task at MediaEval Multimedia Evaluation Campaign from 2013 to 2015. The benchmark attracted, in total, 21 active teams to participate in the challenge. We analyze the results of the benchmark: the winning algorithms and feature-sets. We also describe the design of the benchmark, the evaluation procedures and the data cleaning and transformations that we suggest. The results from the benchmark suggest that the recurrent neural network based approaches combined with large feature-sets work best for dynamic MER. PMID:28282400
Developing a benchmark for emotional analysis of music.

PubMed

Aljanaki, Anna; Yang, Yi-Hsuan; Soleymani, Mohammad

2017-01-01

Music emotion recognition (MER) field rapidly expanded in the last decade. Many new methods and new audio features are developed to improve the performance of MER algorithms. However, it is very difficult to compare the performance of the new methods because of the data representation diversity and scarcity of publicly available data. In this paper, we address these problems by creating a data set and a benchmark for MER. The data set that we release, a MediaEval Database for Emotional Analysis in Music (DEAM), is the largest available data set of dynamic annotations (valence and arousal annotations for 1,802 songs and song excerpts licensed under Creative Commons with 2Hz time resolution). Using DEAM, we organized the 'Emotion in Music' task at MediaEval Multimedia Evaluation Campaign from 2013 to 2015. The benchmark attracted, in total, 21 active teams to participate in the challenge. We analyze the results of the benchmark: the winning algorithms and feature-sets. We also describe the design of the benchmark, the evaluation procedures and the data cleaning and transformations that we suggest. The results from the benchmark suggest that the recurrent neural network based approaches combined with large feature-sets work best for dynamic MER.
Learning Human Actions by Combining Global Dynamics and Local Appearance.

PubMed

Luo, Guan; Yang, Shuang; Tian, Guodong; Yuan, Chunfeng; Hu, Weiming; Maybank, Stephen J

2014-12-01

In this paper, we address the problem of human action recognition through combining global temporal dynamics and local visual spatio-temporal appearance features. For this purpose, in the global temporal dimension, we propose to model the motion dynamics with robust linear dynamical systems (LDSs) and use the model parameters as motion descriptors. Since LDSs live in a non-Euclidean space and the descriptors are in non-vector form, we propose a shift invariant subspace angles based distance to measure the similarity between LDSs. In the local visual dimension, we construct curved spatio-temporal cuboids along the trajectories of densely sampled feature points and describe them using histograms of oriented gradients (HOG). The distance between motion sequences is computed with the Chi-Squared histogram distance in the bag-of-words framework. Finally we perform classification using the maximum margin distance learning method by combining the global dynamic distances and the local visual distances. We evaluate our approach for action recognition on five short clips data sets, namely Weizmann, KTH, UCF sports, Hollywood2 and UCF50, as well as three long continuous data sets, namely VIRAT, ADL and CRIM13. We show competitive results as compared with current state-of-the-art methods.
Processing statistics: an examination of focused and distributed attention using event related potentials.

PubMed

Baijal, Shruti; Nakatani, Chie; van Leeuwen, Cees; Srinivasan, Narayanan

2013-06-07

Human observers show remarkable efficiency in statistical estimation; they are able, for instance, to estimate the mean size of visual objects, even if their number exceeds the capacity limits of focused attention. This ability has been understood as the result of a distinct mode of attention, i.e. distributed attention. Compared to the focused attention mode, working memory representations under distributed attention are proposed to be more compressed, leading to reduced working memory loads. An alternate proposal is that distributed attention uses less structured, feature-level representations. These would fill up working memory (WM) more, even when target set size is low. Using event-related potentials, we compared WM loading in a typical distributed attention task (mean size estimation) to that in a corresponding focused attention task (object recognition), using a measure called contralateral delay activity (CDA). Participants performed both tasks on 2, 4, or 8 different-sized target disks. In the recognition task, CDA amplitude increased with set size; notably, however, in the mean estimation task the CDA amplitude was high regardless of set size. In particular for set-size 2, the amplitude was higher in the mean estimation task than in the recognition task. The result showed that the task involves full WM loading even with a low target set size. This suggests that in the distributed attention mode, representations are not compressed, but rather less structured than under focused attention conditions. Copyright © 2012 Elsevier Ltd. All rights reserved.
Free-Form Region Description with Second-Order Pooling.

PubMed

Carreira, João; Caseiro, Rui; Batista, Jorge; Sminchisescu, Cristian

2015-06-01

Semantic segmentation and object detection are nowadays dominated by methods operating on regions obtained as a result of a bottom-up grouping process (segmentation) but use feature extractors developed for recognition on fixed-form (e.g. rectangular) patches, with full images as a special case. This is most likely suboptimal. In this paper we focus on feature extraction and description over free-form regions and study the relationship with their fixed-form counterparts. Our main contributions are novel pooling techniques that capture the second-order statistics of local descriptors inside such free-form regions. We introduce second-order generalizations of average and max-pooling that together with appropriate non-linearities, derived from the mathematical structure of their embedding space, lead to state-of-the-art recognition performance in semantic segmentation experiments without any type of local feature coding. In contrast, we show that codebook-based local feature coding is more important when feature extraction is constrained to operate over regions that include both foreground and large portions of the background, as typical in image classification settings, whereas for high-accuracy localization setups, second-order pooling over free-form regions produces results superior to those of the winning systems in the contemporary semantic segmentation challenges, with models that are much faster in both training and testing.
Robust recognition of loud and Lombard speech in the fighter cockpit environment

NASA Astrophysics Data System (ADS)

Stanton, Bill J., Jr.

1988-08-01

There are a number of challenges associated with incorporating speech recognition technology into the fighter cockpit. One of the major problems is the wide range of variability in the pilot's voice. That can result from changing levels of stress and workload. Increasing the training set to include abnormal speech is not an attractive option because of the innumerable conditions that would have to be represented and the inordinate amount of time to collect such a training set. A more promising approach is to study subsets of abnormal speech that have been produced under controlled cockpit conditions with the purpose of characterizing reliable shifts that occur relative to normal speech. Such was the initiative of this research. Analyses were conducted for 18 features on 17671 phoneme tokens across eight speakers for normal, loud, and Lombard speech. It was discovered that there was a consistent migration of energy in the sonorants. This discovery of reliable energy shifts led to the development of a method to reduce or eliminate these shifts in the Euclidean distances between LPC log magnitude spectra. This combination significantly improved recognition performance of loud and Lombard speech. Discrepancies in recognition error rates between normal and abnormal speech were reduced by approximately 50 percent for all eight speakers combined.
Influence of time and length size feature selections for human activity sequences recognition.

PubMed

Fang, Hongqing; Chen, Long; Srinivasan, Raghavendiran

2014-01-01

In this paper, Viterbi algorithm based on a hidden Markov model is applied to recognize activity sequences from observed sensors events. Alternative features selections of time feature values of sensors events and activity length size feature values are tested, respectively, and then the results of activity sequences recognition performances of Viterbi algorithm are evaluated. The results show that the selection of larger time feature values of sensor events and/or smaller activity length size feature values will generate relatively better results on the activity sequences recognition performances. © 2013 ISA Published by ISA All rights reserved.
Automated Detection of Stereotypical Motor Movements in Autism Spectrum Disorder Using Recurrence Quantification Analysis

PubMed Central

Großekathöfer, Ulf; Manyakov, Nikolay V.; Mihajlović, Vojkan; Pandina, Gahan; Skalkin, Andrew; Ness, Seth; Bangerter, Abigail; Goodwin, Matthew S.

2017-01-01

A number of recent studies using accelerometer features as input to machine learning classifiers show promising results for automatically detecting stereotypical motor movements (SMM) in individuals with Autism Spectrum Disorder (ASD). However, replicating these results across different types of accelerometers and their position on the body still remains a challenge. We introduce a new set of features in this domain based on recurrence plot and quantification analyses that are orientation invariant and able to capture non-linear dynamics of SMM. Applying these features to an existing published data set containing acceleration data, we achieve up to 9% average increase in accuracy compared to current state-of-the-art published results. Furthermore, we provide evidence that a single torso sensor can automatically detect multiple types of SMM in ASD, and that our approach allows recognition of SMM with high accuracy in individuals when using a person-independent classifier. PMID:28261082
Automated Detection of Stereotypical Motor Movements in Autism Spectrum Disorder Using Recurrence Quantification Analysis.

PubMed

Großekathöfer, Ulf; Manyakov, Nikolay V; Mihajlović, Vojkan; Pandina, Gahan; Skalkin, Andrew; Ness, Seth; Bangerter, Abigail; Goodwin, Matthew S

2017-01-01

A number of recent studies using accelerometer features as input to machine learning classifiers show promising results for automatically detecting stereotypical motor movements (SMM) in individuals with Autism Spectrum Disorder (ASD). However, replicating these results across different types of accelerometers and their position on the body still remains a challenge. We introduce a new set of features in this domain based on recurrence plot and quantification analyses that are orientation invariant and able to capture non-linear dynamics of SMM. Applying these features to an existing published data set containing acceleration data, we achieve up to 9% average increase in accuracy compared to current state-of-the-art published results. Furthermore, we provide evidence that a single torso sensor can automatically detect multiple types of SMM in ASD, and that our approach allows recognition of SMM with high accuracy in individuals when using a person-independent classifier.
Automated diagnosis of fetal alcohol syndrome using 3D facial image analysis

PubMed Central

Fang, Shiaofen; McLaughlin, Jason; Fang, Jiandong; Huang, Jeffrey; Autti-Rämö, Ilona; Fagerlund, Åse; Jacobson, Sandra W.; Robinson, Luther K.; Hoyme, H. Eugene; Mattson, Sarah N.; Riley, Edward; Zhou, Feng; Ward, Richard; Moore, Elizabeth S.; Foroud, Tatiana

2012-01-01

Objectives Use three-dimensional (3D) facial laser scanned images from children with fetal alcohol syndrome (FAS) and controls to develop an automated diagnosis technique that can reliably and accurately identify individuals prenatally exposed to alcohol. Methods A detailed dysmorphology evaluation, history of prenatal alcohol exposure, and 3D facial laser scans were obtained from 149 individuals (86 FAS; 63 Control) recruited from two study sites (Cape Town, South Africa and Helsinki, Finland). Computer graphics, machine learning, and pattern recognition techniques were used to automatically identify a set of facial features that best discriminated individuals with FAS from controls in each sample. Results An automated feature detection and analysis technique was developed and applied to the two study populations. A unique set of facial regions and features were identified for each population that accurately discriminated FAS and control faces without any human intervention. Conclusion Our results demonstrate that computer algorithms can be used to automatically detect facial features that can discriminate FAS and control faces. PMID:18713153

Structure and weights optimisation of a modified Elman network emotion classifier using hybrid computational intelligence algorithms: a comparative study

NASA Astrophysics Data System (ADS)

Sheikhan, Mansour; Abbasnezhad Arabi, Mahdi; Gharavian, Davood

2015-10-01

Artificial neural networks are efficient models in pattern recognition applications, but their performance is dependent on employing suitable structure and connection weights. This study used a hybrid method for obtaining the optimal weight set and architecture of a recurrent neural emotion classifier based on gravitational search algorithm (GSA) and its binary version (BGSA), respectively. By considering the features of speech signal that were related to prosody, voice quality, and spectrum, a rich feature set was constructed. To select more efficient features, a fast feature selection method was employed. The performance of the proposed hybrid GSA-BGSA method was compared with similar hybrid methods based on particle swarm optimisation (PSO) algorithm and its binary version, PSO and discrete firefly algorithm, and hybrid of error back-propagation and genetic algorithm that were used for optimisation. Experimental tests on Berlin emotional database demonstrated the superior performance of the proposed method using a lighter network structure.
Using an Improved SIFT Algorithm and Fuzzy Closed-Loop Control Strategy for Object Recognition in Cluttered Scenes

PubMed Central

Nie, Haitao; Long, Kehui; Ma, Jun; Yue, Dan; Liu, Jinguo

2015-01-01

Partial occlusions, large pose variations, and extreme ambient illumination conditions generally cause the performance degradation of object recognition systems. Therefore, this paper presents a novel approach for fast and robust object recognition in cluttered scenes based on an improved scale invariant feature transform (SIFT) algorithm and a fuzzy closed-loop control method. First, a fast SIFT algorithm is proposed by classifying SIFT features into several clusters based on several attributes computed from the sub-orientation histogram (SOH), in the feature matching phase only features that share nearly the same corresponding attributes are compared. Second, a feature matching step is performed following a prioritized order based on the scale factor, which is calculated between the object image and the target object image, guaranteeing robust feature matching. Finally, a fuzzy closed-loop control strategy is applied to increase the accuracy of the object recognition and is essential for autonomous object manipulation process. Compared to the original SIFT algorithm for object recognition, the result of the proposed method shows that the number of SIFT features extracted from an object has a significant increase, and the computing speed of the object recognition processes increases by more than 40%. The experimental results confirmed that the proposed method performs effectively and accurately in cluttered scenes. PMID:25714094
Gender Recognition from Human-Body Images Using Visible-Light and Thermal Camera Videos Based on a Convolutional Neural Network for Image Feature Extraction

PubMed Central

Nguyen, Dat Tien; Kim, Ki Wan; Hong, Hyung Gil; Koo, Ja Hyung; Kim, Min Cheol; Park, Kang Ryoung

2017-01-01

Extracting powerful image features plays an important role in computer vision systems. Many methods have previously been proposed to extract image features for various computer vision applications, such as the scale-invariant feature transform (SIFT), speed-up robust feature (SURF), local binary patterns (LBP), histogram of oriented gradients (HOG), and weighted HOG. Recently, the convolutional neural network (CNN) method for image feature extraction and classification in computer vision has been used in various applications. In this research, we propose a new gender recognition method for recognizing males and females in observation scenes of surveillance systems based on feature extraction from visible-light and thermal camera videos through CNN. Experimental results confirm the superiority of our proposed method over state-of-the-art recognition methods for the gender recognition problem using human body images. PMID:28335510
Gender Recognition from Human-Body Images Using Visible-Light and Thermal Camera Videos Based on a Convolutional Neural Network for Image Feature Extraction.

PubMed

Nguyen, Dat Tien; Kim, Ki Wan; Hong, Hyung Gil; Koo, Ja Hyung; Kim, Min Cheol; Park, Kang Ryoung

2017-03-20

Extracting powerful image features plays an important role in computer vision systems. Many methods have previously been proposed to extract image features for various computer vision applications, such as the scale-invariant feature transform (SIFT), speed-up robust feature (SURF), local binary patterns (LBP), histogram of oriented gradients (HOG), and weighted HOG. Recently, the convolutional neural network (CNN) method for image feature extraction and classification in computer vision has been used in various applications. In this research, we propose a new gender recognition method for recognizing males and females in observation scenes of surveillance systems based on feature extraction from visible-light and thermal camera videos through CNN. Experimental results confirm the superiority of our proposed method over state-of-the-art recognition methods for the gender recognition problem using human body images.
Adoption of Speech Recognition Technology in Community Healthcare Nursing.

PubMed

Al-Masslawi, Dawood; Block, Lori; Ronquillo, Charlene

2016-01-01

Adoption of new health information technology is shown to be challenging. However, the degree to which new technology will be adopted can be predicted by measures of usefulness and ease of use. In this work these key determining factors are focused on for design of a wound documentation tool. In the context of wound care at home, consistent with evidence in the literature from similar settings, use of Speech Recognition Technology (SRT) for patient documentation has shown promise. To achieve a user-centred design, the results from a conducted ethnographic fieldwork are used to inform SRT features; furthermore, exploratory prototyping is used to collect feedback about the wound documentation tool from home care nurses. During this study, measures developed for healthcare applications of the Technology Acceptance Model will be used, to identify SRT features that improve usefulness (e.g. increased accuracy, saving time) or ease of use (e.g. lowering mental/physical effort, easy to remember tasks). The identified features will be used to create a low fidelity prototype that will be evaluated in future experiments.
Named Entity Recognition in Chinese Clinical Text Using Deep Neural Network.

PubMed

Wu, Yonghui; Jiang, Min; Lei, Jianbo; Xu, Hua

2015-01-01

Rapid growth in electronic health records (EHRs) use has led to an unprecedented expansion of available clinical data in electronic formats. However, much of the important healthcare information is locked in the narrative documents. Therefore Natural Language Processing (NLP) technologies, e.g., Named Entity Recognition that identifies boundaries and types of entities, has been extensively studied to unlock important clinical information in free text. In this study, we investigated a novel deep learning method to recognize clinical entities in Chinese clinical documents using the minimal feature engineering approach. We developed a deep neural network (DNN) to generate word embeddings from a large unlabeled corpus through unsupervised learning and another DNN for the NER task. The experiment results showed that the DNN with word embeddings trained from the large unlabeled corpus outperformed the state-of-the-art CRF's model in the minimal feature engineering setting, achieving the highest F1-score of 0.9280. Further analysis showed that word embeddings derived through unsupervised learning from large unlabeled corpus remarkably improved the DNN with randomized embedding, denoting the usefulness of unsupervised feature learning.
Subjective face recognition difficulties, aberrant sensibility, sleeping disturbances and aberrant eating habits in families with Asperger syndrome

PubMed Central

Nieminen-von Wendt, Taina; Paavonen, Juulia E; Ylisaukko-Oja, Tero; Sarenius, Susan; Källman, Tiia; Järvelä, Irma; von Wendt, Lennart

2005-01-01

Background The present study was undertaken in order to determine whether a set of clinical features, which are not included in the DSM-IV or ICD-10 for Asperger Syndrome (AS), are associated with AS in particular or whether they are merely a familial trait that is not related to the diagnosis. Methods Ten large families, a total of 138 persons, of whom 58 individuals fulfilled the diagnostic criteria for AS and another 56 did not to fulfill these criteria, were studied using a structured interview focusing on the possible presence of face recognition difficulties, aberrant sensibility and eating habits and sleeping disturbances. Results The prevalence for face recognition difficulties was 46.6% in individuals with AS compared with 10.7% in the control group. The corresponding figures for subjectively reported presence of aberrant sensibilities were 91.4% and 46.6%, for sleeping disturbances 48.3% and 23.2% and for aberrant eating habits 60.3% and 14.3%, respectively. Conclusion An aberrant processing of sensory information appears to be a common feature in AS. The impact of these and other clinical features that are not incorporated in the ICD-10 and DSM-IV on our understanding of AS may hitherto have been underestimated. These associated clinical traits may well be reflected by the behavioural characteristics of these individuals. PMID:15826308
Emotion Recognition from Single-Trial EEG Based on Kernel Fisher's Emotion Pattern and Imbalanced Quasiconformal Kernel Support Vector Machine

PubMed Central

Liu, Yi-Hung; Wu, Chien-Te; Cheng, Wei-Teng; Hsiao, Yu-Tsung; Chen, Po-Ming; Teng, Jyh-Tong

2014-01-01

Electroencephalogram-based emotion recognition (EEG-ER) has received increasing attention in the fields of health care, affective computing, and brain-computer interface (BCI). However, satisfactory ER performance within a bi-dimensional and non-discrete emotional space using single-trial EEG data remains a challenging task. To address this issue, we propose a three-layer scheme for single-trial EEG-ER. In the first layer, a set of spectral powers of different EEG frequency bands are extracted from multi-channel single-trial EEG signals. In the second layer, the kernel Fisher's discriminant analysis method is applied to further extract features with better discrimination ability from the EEG spectral powers. The feature vector produced by layer 2 is called a kernel Fisher's emotion pattern (KFEP), and is sent into layer 3 for further classification where the proposed imbalanced quasiconformal kernel support vector machine (IQK-SVM) serves as the emotion classifier. The outputs of the three layer EEG-ER system include labels of emotional valence and arousal. Furthermore, to collect effective training and testing datasets for the current EEG-ER system, we also use an emotion-induction paradigm in which a set of pictures selected from the International Affective Picture System (IAPS) are employed as emotion induction stimuli. The performance of the proposed three-layer solution is compared with that of other EEG spectral power-based features and emotion classifiers. Results on 10 healthy participants indicate that the proposed KFEP feature performs better than other spectral power features, and IQK-SVM outperforms traditional SVM in terms of the EEG-ER accuracy. Our findings also show that the proposed EEG-ER scheme achieves the highest classification accuracies of valence (82.68%) and arousal (84.79%) among all testing methods. PMID:25061837
Emotion recognition from single-trial EEG based on kernel Fisher's emotion pattern and imbalanced quasiconformal kernel support vector machine.

PubMed

Liu, Yi-Hung; Wu, Chien-Te; Cheng, Wei-Teng; Hsiao, Yu-Tsung; Chen, Po-Ming; Teng, Jyh-Tong

2014-07-24

Electroencephalogram-based emotion recognition (EEG-ER) has received increasing attention in the fields of health care, affective computing, and brain-computer interface (BCI). However, satisfactory ER performance within a bi-dimensional and non-discrete emotional space using single-trial EEG data remains a challenging task. To address this issue, we propose a three-layer scheme for single-trial EEG-ER. In the first layer, a set of spectral powers of different EEG frequency bands are extracted from multi-channel single-trial EEG signals. In the second layer, the kernel Fisher's discriminant analysis method is applied to further extract features with better discrimination ability from the EEG spectral powers. The feature vector produced by layer 2 is called a kernel Fisher's emotion pattern (KFEP), and is sent into layer 3 for further classification where the proposed imbalanced quasiconformal kernel support vector machine (IQK-SVM) serves as the emotion classifier. The outputs of the three layer EEG-ER system include labels of emotional valence and arousal. Furthermore, to collect effective training and testing datasets for the current EEG-ER system, we also use an emotion-induction paradigm in which a set of pictures selected from the International Affective Picture System (IAPS) are employed as emotion induction stimuli. The performance of the proposed three-layer solution is compared with that of other EEG spectral power-based features and emotion classifiers. Results on 10 healthy participants indicate that the proposed KFEP feature performs better than other spectral power features, and IQK-SVM outperforms traditional SVM in terms of the EEG-ER accuracy. Our findings also show that the proposed EEG-ER scheme achieves the highest classification accuracies of valence (82.68%) and arousal (84.79%) among all testing methods.
Recognition of rotated images using the multi-valued neuron and rotation-invariant 2D Fourier descriptors

NASA Astrophysics Data System (ADS)

Aizenberg, Evgeni; Bigio, Irving J.; Rodriguez-Diaz, Eladio

2012-03-01

The Fourier descriptors paradigm is a well-established approach for affine-invariant characterization of shape contours. In the work presented here, we extend this method to images, and obtain a 2D Fourier representation that is invariant to image rotation. The proposed technique retains phase uniqueness, and therefore structural image information is not lost. Rotation-invariant phase coefficients were used to train a single multi-valued neuron (MVN) to recognize satellite and human face images rotated by a wide range of angles. Experiments yielded 100% and 96.43% classification rate for each data set, respectively. Recognition performance was additionally evaluated under effects of lossy JPEG compression and additive Gaussian noise. Preliminary results show that the derived rotation-invariant features combined with the MVN provide a promising scheme for efficient recognition of rotated images.
Image ratio features for facial expression recognition application.

PubMed

Song, Mingli; Tao, Dacheng; Liu, Zicheng; Li, Xuelong; Zhou, Mengchu

2010-06-01

Video-based facial expression recognition is a challenging problem in computer vision and human-computer interaction. To target this problem, texture features have been extracted and widely used, because they can capture image intensity changes raised by skin deformation. However, existing texture features encounter problems with albedo and lighting variations. To solve both problems, we propose a new texture feature called image ratio features. Compared with previously proposed texture features, e.g., high gradient component features, image ratio features are more robust to albedo and lighting variations. In addition, to further improve facial expression recognition accuracy based on image ratio features, we combine image ratio features with facial animation parameters (FAPs), which describe the geometric motions of facial feature points. The performance evaluation is based on the Carnegie Mellon University Cohn-Kanade database, our own database, and the Japanese Female Facial Expression database. Experimental results show that the proposed image ratio feature is more robust to albedo and lighting variations, and the combination of image ratio features and FAPs outperforms each feature alone. In addition, we study asymmetric facial expressions based on our own facial expression database and demonstrate the superior performance of our combined expression recognition system.
Multi-font printed Mongolian document recognition system

NASA Astrophysics Data System (ADS)

Peng, Liangrui; Liu, Changsong; Ding, Xiaoqing; Wang, Hua; Jin, Jianming

2009-01-01

Mongolian is one of the major ethnic languages in China. Large amount of Mongolian printed documents need to be digitized in digital library and various applications. Traditional Mongolian script has unique writing style and multi-font-type variations, which bring challenges to Mongolian OCR research. As traditional Mongolian script has some characteristics, for example, one character may be part of another character, we define the character set for recognition according to the segmented components, and the components are combined into characters by rule-based post-processing module. For character recognition, a method based on visual directional feature and multi-level classifiers is presented. For character segmentation, a scheme is used to find the segmentation point by analyzing the properties of projection and connected components. As Mongolian has different font-types which are categorized into two major groups, the parameter of segmentation is adjusted for each group. A font-type classification method for the two font-type group is introduced. For recognition of Mongolian text mixed with Chinese and English, language identification and relevant character recognition kernels are integrated. Experiments show that the presented methods are effective. The text recognition rate is 96.9% on the test samples from practical documents with multi-font-types and mixed scripts.
Shape and Color Features for Object Recognition Search

NASA Technical Reports Server (NTRS)

Duong, Tuan A.; Duong, Vu A.; Stubberud, Allen R.

2012-01-01

A bio-inspired shape feature of an object of interest emulates the integration of the saccadic eye movement and horizontal layer in vertebrate retina for object recognition search where a single object can be used one at a time. The optimal computational model for shape-extraction-based principal component analysis (PCA) was also developed to reduce processing time and enable the real-time adaptive system capability. A color feature of the object is employed as color segmentation to empower the shape feature recognition to solve the object recognition in the heterogeneous environment where a single technique - shape or color - may expose its difficulties. To enable the effective system, an adaptive architecture and autonomous mechanism were developed to recognize and adapt the shape and color feature of the moving object. The bio-inspired object recognition based on bio-inspired shape and color can be effective to recognize a person of interest in the heterogeneous environment where the single technique exposed its difficulties to perform effective recognition. Moreover, this work also demonstrates the mechanism and architecture of the autonomous adaptive system to enable the realistic system for the practical use in the future.
Biometric iris image acquisition system with wavefront coding technology

NASA Astrophysics Data System (ADS)

Hsieh, Sheng-Hsun; Yang, Hsi-Wen; Huang, Shao-Hung; Li, Yung-Hui; Tien, Chung-Hao

2013-09-01

Biometric signatures for identity recognition have been practiced for centuries. Basically, the personal attributes used for a biometric identification system can be classified into two areas: one is based on physiological attributes, such as DNA, facial features, retinal vasculature, fingerprint, hand geometry, iris texture and so on; the other scenario is dependent on the individual behavioral attributes, such as signature, keystroke, voice and gait style. Among these features, iris recognition is one of the most attractive approaches due to its nature of randomness, texture stability over a life time, high entropy density and non-invasive acquisition. While the performance of iris recognition on high quality image is well investigated, not too many studies addressed that how iris recognition performs subject to non-ideal image data, especially when the data is acquired in challenging conditions, such as long working distance, dynamical movement of subjects, uncontrolled illumination conditions and so on. There are three main contributions in this paper. Firstly, the optical system parameters, such as magnification and field of view, was optimally designed through the first-order optics. Secondly, the irradiance constraints was derived by optical conservation theorem. Through the relationship between the subject and the detector, we could estimate the limitation of working distance when the camera lens and CCD sensor were known. The working distance is set to 3m in our system with pupil diameter 86mm and CCD irradiance 0.3mW/cm2. Finally, We employed a hybrid scheme combining eye tracking with pan and tilt system, wavefront coding technology, filter optimization and post signal recognition to implement a robust iris recognition system in dynamic operation. The blurred image was restored to ensure recognition accuracy over 3m working distance with 400mm focal length and aperture F/6.3 optics. The simulation result as well as experiment validates the proposed code apertured imaging system, where the imaging volume was 2.57 times extended over the traditional optics, while keeping sufficient recognition accuracy.
Robust Indoor Human Activity Recognition Using Wireless Signals.

PubMed

Wang, Yi; Jiang, Xinli; Cao, Rongyu; Wang, Xiyang

2015-07-15

Wireless signals-based activity detection and recognition technology may be complementary to the existing vision-based methods, especially under the circumstance of occlusions, viewpoint change, complex background, lighting condition change, and so on. This paper explores the properties of the channel state information (CSI) of Wi-Fi signals, and presents a robust indoor daily human activity recognition framework with only one pair of transmission points (TP) and access points (AP). First of all, some indoor human actions are selected as primitive actions forming a training set. Then, an online filtering method is designed to make actions' CSI curves smooth and allow them to contain enough pattern information. Each primitive action pattern can be segmented from the outliers of its multi-input multi-output (MIMO) signals by a proposed segmentation method. Lastly, in online activities recognition, by selecting proper features and Support Vector Machine (SVM) based multi-classification, activities constituted by primitive actions can be recognized insensitive to the locations, orientations, and speeds.
Eye movement analysis for activity recognition using electrooculography.

PubMed

Bulling, Andreas; Ward, Jamie A; Gellersen, Hans; Tröster, Gerhard

2011-04-01

In this work, we investigate eye movement analysis as a new sensing modality for activity recognition. Eye movement data were recorded using an electrooculography (EOG) system. We first describe and evaluate algorithms for detecting three eye movement characteristics from EOG signals-saccades, fixations, and blinks-and propose a method for assessing repetitive patterns of eye movements. We then devise 90 different features based on these characteristics and select a subset of them using minimum redundancy maximum relevance (mRMR) feature selection. We validate the method using an eight participant study in an office environment using an example set of five activity classes: copying a text, reading a printed paper, taking handwritten notes, watching a video, and browsing the Web. We also include periods with no specific activity (the NULL class). Using a support vector machine (SVM) classifier and person-independent (leave-one-person-out) training, we obtain an average precision of 76.1 percent and recall of 70.5 percent over all classes and participants. The work demonstrates the promise of eye-based activity recognition (EAR) and opens up discussion on the wider applicability of EAR to other activities that are difficult, or even impossible, to detect using common sensing modalities.
Using Ontologies for the Online Recognition of Activities of Daily Living†

PubMed Central

2018-01-01

The recognition of activities of daily living is an important research area of interest in recent years. The process of activity recognition aims to recognize the actions of one or more people in a smart environment, in which a set of sensors has been deployed. Usually, all the events produced during each activity are taken into account to develop the classification models. However, the instant in which an activity started is unknown in a real environment. Therefore, only the most recent events are usually used. In this paper, we use statistics to determine the most appropriate length of that interval for each type of activity. In addition, we use ontologies to automatically generate features that serve as the input for the supervised learning algorithms that produce the classification model. The features are formed by combining the entities in the ontology, such as concepts and properties. The results obtained show a significant increase in the accuracy of the classification models generated with respect to the classical approach, in which only the state of the sensors is taken into account. Moreover, the results obtained in a simulation of a real environment under an event-based segmentation also show an improvement in most activities. PMID:29662011
Dissociations among functional subsystems governing melody recognition after right-hemisphere damage.

PubMed

Steinke, W R; Cuddy, L L; Jakobson, L S

2001-07-01

This study describes an amateur musician, KB, who became amusic following a right-hemisphere stroke. A series of assessments conducted post-stroke revealed that KB functioned in the normal range for most verbal skills. However, compared with controls matched in age and music training, KB showed severe loss of pitch and rhythmic processing abilities. His ability to recognise and identify familiar instrumental melodies was also lost. Despite these deficits, KB performed remarkably well when asked to recognise and identify familiar song melodies presented without accompanying lyrics. This dissociation between the ability to recognise/identify song vs. instrumental melodies was replicated across different sets of musical materials, including newly learned melodies. Analyses of the acoustical and musical features of song and instrumental melodies discounted an explanation of the dissociation based on these features alone. Rather, the results suggest a functional dissociation resulting from a focal brain lesion. We propose that, in the case of song melodies, there remains sufficient activation in KB's melody analysis system to coactivate an intact representation of both associative information and the lyrics in the speech lexicon, making recognition and identification possible. In the case of instrumental melodies, no such associative processes exist; thus recognition and identification do not occur.
Clustering of Farsi sub-word images for whole-book recognition

NASA Astrophysics Data System (ADS)

Soheili, Mohammad Reza; Kabir, Ehsanollah; Stricker, Didier

2015-01-01

Redundancy of word and sub-word occurrences in large documents can be effectively utilized in an OCR system to improve recognition results. Most OCR systems employ language modeling techniques as a post-processing step; however these techniques do not use important pictorial information that exist in the text image. In case of large-scale recognition of degraded documents, this information is even more valuable. In our previous work, we proposed a subword image clustering method for the applications dealing with large printed documents. In our clustering method, the ideal case is when all equivalent sub-word images lie in one cluster. To overcome the issues of low print quality, the clustering method uses an image matching algorithm for measuring the distance between two sub-word images. The measured distance with a set of simple shape features were used to cluster all sub-word images. In this paper, we analyze the effects of adding more shape features on processing time, purity of clustering, and the final recognition rate. Previously published experiments have shown the efficiency of our method on a book. Here we present extended experimental results and evaluate our method on another book with totally different font face. Also we show that the number of the new created clusters in a page can be used as a criteria for assessing the quality of print and evaluating preprocessing phases.
Adversarial Feature Selection Against Evasion Attacks.

PubMed

Zhang, Fei; Chan, Patrick P K; Biggio, Battista; Yeung, Daniel S; Roli, Fabio

2016-03-01

Pattern recognition and machine learning techniques have been increasingly adopted in adversarial settings such as spam, intrusion, and malware detection, although their security against well-crafted attacks that aim to evade detection by manipulating data at test time has not yet been thoroughly assessed. While previous work has been mainly focused on devising adversary-aware classification algorithms to counter evasion attempts, only few authors have considered the impact of using reduced feature sets on classifier security against the same attacks. An interesting, preliminary result is that classifier security to evasion may be even worsened by the application of feature selection. In this paper, we provide a more detailed investigation of this aspect, shedding some light on the security properties of feature selection against evasion attacks. Inspired by previous work on adversary-aware classifiers, we propose a novel adversary-aware feature selection model that can improve classifier security against evasion attacks, by incorporating specific assumptions on the adversary's data manipulation strategy. We focus on an efficient, wrapper-based implementation of our approach, and experimentally validate its soundness on different application examples, including spam and malware detection.

[State Recognition of Solid Fermentation Process Based on Near Infrared Spectroscopy with Adaboost and Spectral Regression Discriminant Analysis].

PubMed

Yu, Shuang; Liu, Guo-hai; Xia, Rong-sheng; Jiang, Hui

2016-01-01

In order to achieve the rapid monitoring of process state of solid state fermentation (SSF), this study attempted to qualitative identification of process state of SSF of feed protein by use of Fourier transform near infrared (FT-NIR) spectroscopy analysis technique. Even more specifically, the FT-NIR spectroscopy combined with Adaboost-SRDA-NN integrated learning algorithm as an ideal analysis tool was used to accurately and rapidly monitor chemical and physical changes in SSF of feed protein without the need for chemical analysis. Firstly, the raw spectra of all the 140 fermentation samples obtained were collected by use of Fourier transform near infrared spectrometer (Antaris II), and the raw spectra obtained were preprocessed by use of standard normal variate transformation (SNV) spectral preprocessing algorithm. Thereafter, the characteristic information of the preprocessed spectra was extracted by use of spectral regression discriminant analysis (SRDA). Finally, nearest neighbors (NN) algorithm as a basic classifier was selected and building state recognition model to identify different fermentation samples in the validation set. Experimental results showed as follows: the SRDA-NN model revealed its superior performance by compared with other two different NN models, which were developed by use of the feature information form principal component analysis (PCA) and linear discriminant analysis (LDA), and the correct recognition rate of SRDA-NN model achieved 94.28% in the validation set. In this work, in order to further improve the recognition accuracy of the final model, Adaboost-SRDA-NN ensemble learning algorithm was proposed by integrated the Adaboost and SRDA-NN methods, and the presented algorithm was used to construct the online monitoring model of process state of SSF of feed protein. Experimental results showed as follows: the prediction performance of SRDA-NN model has been further enhanced by use of Adaboost lifting algorithm, and the correct recognition rate of the Adaboost-SRDA-NN model achieved 100% in the validation set. The overall results demonstrate that SRDA algorithm can effectively achieve the spectral feature information extraction to the spectral dimension reduction in model calibration process of qualitative analysis of NIR spectroscopy. In addition, the Adaboost lifting algorithm can improve the classification accuracy of the final model. The results obtained in this work can provide research foundation for developing online monitoring instruments for the monitoring of SSF process.
Auditory emotion recognition impairments in Schizophrenia: Relationship to acoustic features and cognition

PubMed Central

Gold, Rinat; Butler, Pamela; Revheim, Nadine; Leitman, David; Hansen, John A.; Gur, Ruben; Kantrowitz, Joshua T.; Laukka, Petri; Juslin, Patrik N.; Silipo, Gail S.; Javitt, Daniel C.

2013-01-01

Objective Schizophrenia is associated with deficits in ability to perceive emotion based upon tone of voice. The basis for this deficit, however, remains unclear and assessment batteries remain limited. We evaluated performance in schizophrenia on a novel voice emotion recognition battery with well characterized physical features, relative to impairments in more general emotional and cognitive function. Methods We studied in a primary sample of 92 patients relative to 73 controls. Stimuli were characterized according to both intended emotion and physical features (e.g., pitch, intensity) that contributed to the emotional percept. Parallel measures of visual emotion recognition, pitch perception, general cognition, and overall outcome were obtained. More limited measures were obtained in an independent replication sample of 36 patients, 31 age-matched controls, and 188 general comparison subjects. Results Patients showed significant, large effect size deficits in voice emotion recognition (F=25.4, p<.00001, d=1.1), and were preferentially impaired in recognition of emotion based upon pitch-, but not intensity-features (group X feature interaction: F=7.79, p=.006). Emotion recognition deficits were significantly correlated with pitch perception impairments both across (r=56, p<.0001) and within (r=.47, p<.0001) group. Path analysis showed both sensory-specific and general cognitive contributions to auditory emotion recognition deficits in schizophrenia. Similar patterns of results were observed in the replication sample. Conclusions The present study demonstrates impairments in auditory emotion recognition in schizophrenia relative to acoustic features of underlying stimuli. Furthermore, it provides tools and highlights the need for greater attention to physical features of stimuli used for study of social cognition in neuropsychiatric disorders. PMID:22362394
Infrared face recognition based on LBP histogram and KW feature selection

NASA Astrophysics Data System (ADS)

Xie, Zhihua

2014-07-01

The conventional LBP-based feature as represented by the local binary pattern (LBP) histogram still has room for performance improvements. This paper focuses on the dimension reduction of LBP micro-patterns and proposes an improved infrared face recognition method based on LBP histogram representation. To extract the local robust features in infrared face images, LBP is chosen to get the composition of micro-patterns of sub-blocks. Based on statistical test theory, Kruskal-Wallis (KW) feature selection method is proposed to get the LBP patterns which are suitable for infrared face recognition. The experimental results show combination of LBP and KW features selection improves the performance of infrared face recognition, the proposed method outperforms the traditional methods based on LBP histogram, discrete cosine transform(DCT) or principal component analysis(PCA).
[Research on spectra recognition method for cabbages and weeds based on PCA and SIMCA].

PubMed

Zu, Qin; Deng, Wei; Wang, Xiu; Zhao, Chun-Jiang

2013-10-01

In order to improve the accuracy and efficiency of weed identification, the difference of spectral reflectance was employed to distinguish between crops and weeds. Firstly, the different combinations of Savitzky-Golay (SG) convolutional derivation and multiplicative scattering correction (MSC) method were applied to preprocess the raw spectral data. Then the clustering analysis of various types of plants was completed by using principal component analysis (PCA) method, and the feature wavelengths which were sensitive for classifying various types of plants were extracted according to the corresponding loading plots of the optimal principal components in PCA results. Finally, setting the feature wavelengths as the input variables, the soft independent modeling of class analogy (SIMCA) classification method was used to identify the various types of plants. The experimental results of classifying cabbages and weeds showed that on the basis of the optimal pretreatment by a synthetic application of MSC and SG convolutional derivation with SG's parameters set as 1rd order derivation, 3th degree polynomial and 51 smoothing points, 23 feature wavelengths were extracted in accordance with the top three principal components in PCA results. When SIMCA method was used for classification while the previously selected 23 feature wavelengths were set as the input variables, the classification rates of the modeling set and the prediction set were respectively up to 98.6% and 100%.
Classifier dependent feature preprocessing methods

NASA Astrophysics Data System (ADS)

Rodriguez, Benjamin M., II; Peterson, Gilbert L.

2008-04-01

In mobile applications, computational complexity is an issue that limits sophisticated algorithms from being implemented on these devices. This paper provides an initial solution to applying pattern recognition systems on mobile devices by combining existing preprocessing algorithms for recognition. In pattern recognition systems, it is essential to properly apply feature preprocessing tools prior to training classification models in an attempt to reduce computational complexity and improve the overall classification accuracy. The feature preprocessing tools extended for the mobile environment are feature ranking, feature extraction, data preparation and outlier removal. Most desktop systems today are capable of processing a majority of the available classification algorithms without concern of processing while the same is not true on mobile platforms. As an application of pattern recognition for mobile devices, the recognition system targets the problem of steganalysis, determining if an image contains hidden information. The measure of performance shows that feature preprocessing increases the overall steganalysis classification accuracy by an average of 22%. The methods in this paper are tested on a workstation and a Nokia 6620 (Symbian operating system) camera phone with similar results.
Fast traffic sign recognition with a rotation invariant binary pattern based feature.

PubMed

Yin, Shouyi; Ouyang, Peng; Liu, Leibo; Guo, Yike; Wei, Shaojun

2015-01-19

Robust and fast traffic sign recognition is very important but difficult for safe driving assistance systems. This study addresses fast and robust traffic sign recognition to enhance driving safety. The proposed method includes three stages. First, a typical Hough transformation is adopted to implement coarse-grained location of the candidate regions of traffic signs. Second, a RIBP (Rotation Invariant Binary Pattern) based feature in the affine and Gaussian space is proposed to reduce the time of traffic sign detection and achieve robust traffic sign detection in terms of scale, rotation, and illumination. Third, the techniques of ANN (Artificial Neutral Network) based feature dimension reduction and classification are designed to reduce the traffic sign recognition time. Compared with the current work, the experimental results in the public datasets show that this work achieves robustness in traffic sign recognition with comparable recognition accuracy and faster processing speed, including training speed and recognition speed.
Handwritten digits recognition based on immune network

NASA Astrophysics Data System (ADS)

Li, Yangyang; Wu, Yunhui; Jiao, Lc; Wu, Jianshe

2011-11-01

With the development of society, handwritten digits recognition technique has been widely applied to production and daily life. It is a very difficult task to solve these problems in the field of pattern recognition. In this paper, a new method is presented for handwritten digit recognition. The digit samples firstly are processed and features extraction. Based on these features, a novel immune network classification algorithm is designed and implemented to the handwritten digits recognition. The proposed algorithm is developed by Jerne's immune network model for feature selection and KNN method for classification. Its characteristic is the novel network with parallel commutating and learning. The performance of the proposed method is experimented to the handwritten number datasets MNIST and compared with some other recognition algorithms-KNN, ANN and SVM algorithm. The result shows that the novel classification algorithm based on immune network gives promising performance and stable behavior for handwritten digits recognition.
Fast Traffic Sign Recognition with a Rotation Invariant Binary Pattern Based Feature

PubMed Central

Yin, Shouyi; Ouyang, Peng; Liu, Leibo; Guo, Yike; Wei, Shaojun

2015-01-01

Robust and fast traffic sign recognition is very important but difficult for safe driving assistance systems. This study addresses fast and robust traffic sign recognition to enhance driving safety. The proposed method includes three stages. First, a typical Hough transformation is adopted to implement coarse-grained location of the candidate regions of traffic signs. Second, a RIBP (Rotation Invariant Binary Pattern) based feature in the affine and Gaussian space is proposed to reduce the time of traffic sign detection and achieve robust traffic sign detection in terms of scale, rotation, and illumination. Third, the techniques of ANN (Artificial Neutral Network) based feature dimension reduction and classification are designed to reduce the traffic sign recognition time. Compared with the current work, the experimental results in the public datasets show that this work achieves robustness in traffic sign recognition with comparable recognition accuracy and faster processing speed, including training speed and recognition speed. PMID:25608217
A triaxial accelerometer-based physical-activity recognition via augmented-signal features and a hierarchical recognizer.

PubMed

Khan, Adil Mehmood; Lee, Young-Koo; Lee, Sungyoung Y; Kim, Tae-Seong

2010-09-01

Physical-activity recognition via wearable sensors can provide valuable information regarding an individual's degree of functional ability and lifestyle. In this paper, we present an accelerometer sensor-based approach for human-activity recognition. Our proposed recognition method uses a hierarchical scheme. At the lower level, the state to which an activity belongs, i.e., static, transition, or dynamic, is recognized by means of statistical signal features and artificial-neural nets (ANNs). The upper level recognition uses the autoregressive (AR) modeling of the acceleration signals, thus, incorporating the derived AR-coefficients along with the signal-magnitude area and tilt angle to form an augmented-feature vector. The resulting feature vector is further processed by the linear-discriminant analysis and ANNs to recognize a particular human activity. Our proposed activity-recognition method recognizes three states and 15 activities with an average accuracy of 97.9% using only a single triaxial accelerometer attached to the subject's chest.
Dynamic facial expression recognition based on geometric and texture features

NASA Astrophysics Data System (ADS)

Li, Ming; Wang, Zengfu

2018-04-01

Recently, dynamic facial expression recognition in videos has attracted growing attention. In this paper, we propose a novel dynamic facial expression recognition method by using geometric and texture features. In our system, the facial landmark movements and texture variations upon pairwise images are used to perform the dynamic facial expression recognition tasks. For one facial expression sequence, pairwise images are created between the first frame and each of its subsequent frames. Integration of both geometric and texture features further enhances the representation of the facial expressions. Finally, Support Vector Machine is used for facial expression recognition. Experiments conducted on the extended Cohn-Kanade database show that our proposed method can achieve a competitive performance with other methods.
Artificial neural networks for acoustic target recognition

NASA Astrophysics Data System (ADS)

Robertson, James A.; Mossing, John C.; Weber, Bruce A.

1995-04-01

Acoustic sensors can be used to detect, track and identify non-line-of-sight targets passively. Attempts to alter acoustic emissions often result in an undesirable performance degradation. This research project investigates the use of neural networks for differentiating between features extracted from the acoustic signatures of sources. Acoustic data were filtered and digitized using a commercially available analog-digital convertor. The digital data was transformed to the frequency domain for additional processing using the FFT. Narrowband peak detection algorithms were incorporated to select peaks above a user defined SNR. These peaks were then used to generate a set of robust features which relate specifically to target components in varying background conditions. The features were then used as input into a backpropagation neural network. A K-means unsupervised clustering algorithm was used to determine the natural clustering of the observations. Comparisons between a feature set consisting of the normalized amplitudes of the first 250 frequency bins of the power spectrum and a set of 11 harmonically related features were made. Initial results indicate that even though some different target types had a tendency to group in the same clusters, the neural network was able to differentiate the targets. Successful identification of acoustic sources under varying operational conditions with high confidence levels was achieved.
Design of an Adaptive Human-Machine System Based on Dynamical Pattern Recognition of Cognitive Task-Load.

PubMed

Zhang, Jianhua; Yin, Zhong; Wang, Rubin

2017-01-01

This paper developed a cognitive task-load (CTL) classification algorithm and allocation strategy to sustain the optimal operator CTL levels over time in safety-critical human-machine integrated systems. An adaptive human-machine system is designed based on a non-linear dynamic CTL classifier, which maps a set of electroencephalogram (EEG) and electrocardiogram (ECG) related features to a few CTL classes. The least-squares support vector machine (LSSVM) is used as dynamic pattern classifier. A series of electrophysiological and performance data acquisition experiments were performed on seven volunteer participants under a simulated process control task environment. The participant-specific dynamic LSSVM model is constructed to classify the instantaneous CTL into five classes at each time instant. The initial feature set, comprising 56 EEG and ECG related features, is reduced to a set of 12 salient features (including 11 EEG-related features) by using the locality preserving projection (LPP) technique. An overall correct classification rate of about 80% is achieved for the 5-class CTL classification problem. Then the predicted CTL is used to adaptively allocate the number of process control tasks between operator and computer-based controller. Simulation results showed that the overall performance of the human-machine system can be improved by using the adaptive automation strategy proposed.
Losing face: impaired discrimination of featural and configural information in the mouth region of an inverted face.

PubMed

Tanaka, James W; Kaiser, Martha D; Hagen, Simen; Pierce, Lara J

2014-05-01

Given that all faces share the same set of features-two eyes, a nose, and a mouth-that are arranged in similar configuration, recognition of a specific face must depend on our ability to discern subtle differences in its featural and configural properties. An enduring question in the face-processing literature is whether featural or configural information plays a larger role in the recognition process. To address this question, the face dimensions task was designed, in which the featural and configural properties in the upper (eye) and lower (mouth) regions of a face were parametrically and independently manipulated. In a same-different task, two faces were sequentially presented and tested in their upright or in their inverted orientation. Inversion disrupted the perception of featural size (Exp. 1), featural shape (Exp. 2), and configural changes in the mouth region, but it had relatively little effect on the discrimination of featural size and shape and configural differences in the eye region. Inversion had little effect on the perception of information in the top and bottom halves of houses (Exp. 3), suggesting that the lower-half impairment was specific to faces. Spatial cueing to the mouth region eliminated the inversion effect (Exp. 4), suggesting that participants have a bias to attend to the eye region of an inverted face. The collective findings from these experiments suggest that inversion does not differentially impair featural or configural face perceptions, but rather impairs the perception of information in the mouth region of the face.
Is the emotion recognition deficit associated with frontotemporal dementia caused by selective inattention to diagnostic facial features?

PubMed

Oliver, Lindsay D; Virani, Karim; Finger, Elizabeth C; Mitchell, Derek G V

2014-07-01

Frontotemporal dementia (FTD) is a debilitating neurodegenerative disorder characterized by severely impaired social and emotional behaviour, including emotion recognition deficits. Though fear recognition impairments seen in particular neurological and developmental disorders can be ameliorated by reallocating attention to critical facial features, the possibility that similar benefits can be conferred to patients with FTD has yet to be explored. In the current study, we examined the impact of presenting distinct regions of the face (whole face, eyes-only, and eyes-removed) on the ability to recognize expressions of anger, fear, disgust, and happiness in 24 patients with FTD and 24 healthy controls. A recognition deficit was demonstrated across emotions by patients with FTD relative to controls. Crucially, removal of diagnostic facial features resulted in an appropriate decline in performance for both groups; furthermore, patients with FTD demonstrated a lack of disproportionate improvement in emotion recognition accuracy as a result of isolating critical facial features relative to controls. Thus, unlike some neurological and developmental disorders featuring amygdala dysfunction, the emotion recognition deficit observed in FTD is not likely driven by selective inattention to critical facial features. Patients with FTD also mislabelled negative facial expressions as happy more often than controls, providing further evidence for abnormalities in the representation of positive affect in FTD. This work suggests that the emotional expression recognition deficit associated with FTD is unlikely to be rectified by adjusting selective attention to diagnostic features, as has proven useful in other select disorders. Copyright © 2014 Elsevier Ltd. All rights reserved.
HOTS: A Hierarchy of Event-Based Time-Surfaces for Pattern Recognition.

PubMed

Lagorce, Xavier; Orchard, Garrick; Galluppi, Francesco; Shi, Bertram E; Benosman, Ryad B

2017-07-01

This paper describes novel event-based spatio-temporal features called time-surfaces and how they can be used to create a hierarchical event-based pattern recognition architecture. Unlike existing hierarchical architectures for pattern recognition, the presented model relies on a time oriented approach to extract spatio-temporal features from the asynchronously acquired dynamics of a visual scene. These dynamics are acquired using biologically inspired frameless asynchronous event-driven vision sensors. Similarly to cortical structures, subsequent layers in our hierarchy extract increasingly abstract features using increasingly large spatio-temporal windows. The central concept is to use the rich temporal information provided by events to create contexts in the form of time-surfaces which represent the recent temporal activity within a local spatial neighborhood. We demonstrate that this concept can robustly be used at all stages of an event-based hierarchical model. First layer feature units operate on groups of pixels, while subsequent layer feature units operate on the output of lower level feature units. We report results on a previously published 36 class character recognition task and a four class canonical dynamic card pip task, achieving near 100 percent accuracy on each. We introduce a new seven class moving face recognition task, achieving 79 percent accuracy.This paper describes novel event-based spatio-temporal features called time-surfaces and how they can be used to create a hierarchical event-based pattern recognition architecture. Unlike existing hierarchical architectures for pattern recognition, the presented model relies on a time oriented approach to extract spatio-temporal features from the asynchronously acquired dynamics of a visual scene. These dynamics are acquired using biologically inspired frameless asynchronous event-driven vision sensors. Similarly to cortical structures, subsequent layers in our hierarchy extract increasingly abstract features using increasingly large spatio-temporal windows. The central concept is to use the rich temporal information provided by events to create contexts in the form of time-surfaces which represent the recent temporal activity within a local spatial neighborhood. We demonstrate that this concept can robustly be used at all stages of an event-based hierarchical model. First layer feature units operate on groups of pixels, while subsequent layer feature units operate on the output of lower level feature units. We report results on a previously published 36 class character recognition task and a four class canonical dynamic card pip task, achieving near 100 percent accuracy on each. We introduce a new seven class moving face recognition task, achieving 79 percent accuracy.
Protein fold recognition using geometric kernel data fusion.

PubMed

Zakeri, Pooya; Jeuris, Ben; Vandebril, Raf; Moreau, Yves

2014-07-01

Various approaches based on features extracted from protein sequences and often machine learning methods have been used in the prediction of protein folds. Finding an efficient technique for integrating these different protein features has received increasing attention. In particular, kernel methods are an interesting class of techniques for integrating heterogeneous data. Various methods have been proposed to fuse multiple kernels. Most techniques for multiple kernel learning focus on learning a convex linear combination of base kernels. In addition to the limitation of linear combinations, working with such approaches could cause a loss of potentially useful information. We design several techniques to combine kernel matrices by taking more involved, geometry inspired means of these matrices instead of convex linear combinations. We consider various sequence-based protein features including information extracted directly from position-specific scoring matrices and local sequence alignment. We evaluate our methods for classification on the SCOP PDB-40D benchmark dataset for protein fold recognition. The best overall accuracy on the protein fold recognition test set obtained by our methods is ∼ 86.7%. This is an improvement over the results of the best existing approach. Moreover, our computational model has been developed by incorporating the functional domain composition of proteins through a hybridization model. It is observed that by using our proposed hybridization model, the protein fold recognition accuracy is further improved to 89.30%. Furthermore, we investigate the performance of our approach on the protein remote homology detection problem by fusing multiple string kernels. The MATLAB code used for our proposed geometric kernel fusion frameworks are publicly available at http://people.cs.kuleuven.be/∼raf.vandebril/homepage/software/geomean.php?menu=5/. © The Author 2014. Published by Oxford University Press.
A framework for semisupervised feature generation and its applications in biomedical literature mining.

PubMed

Li, Yanpeng; Hu, Xiaohua; Lin, Hongfei; Yang, Zhihao

2011-01-01

Feature representation is essential to machine learning and text mining. In this paper, we present a feature coupling generalization (FCG) framework for generating new features from unlabeled data. It selects two special types of features, i.e., example-distinguishing features (EDFs) and class-distinguishing features (CDFs) from original feature set, and then generalizes EDFs into higher-level features based on their coupling degrees with CDFs in unlabeled data. The advantage is: EDFs with extreme sparsity in labeled data can be enriched by their co-occurrences with CDFs in unlabeled data so that the performance of these low-frequency features can be greatly boosted and new information from unlabeled can be incorporated. We apply this approach to three tasks in biomedical literature mining: gene named entity recognition (NER), protein-protein interaction extraction (PPIE), and text classification (TC) for gene ontology (GO) annotation. New features are generated from over 20 GB unlabeled PubMed abstracts. The experimental results on BioCreative 2, AIMED corpus, and TREC 2005 Genomics Track show that 1) FCG can utilize well the sparse features ignored by supervised learning. 2) It improves the performance of supervised baselines by 7.8 percent, 5.0 percent, and 5.8 percent, respectively, in the tree tasks. 3) Our methods achieve 89.1, 64.5 F-score, and 60.1 normalized utility on the three benchmark data sets.
Automated Recognition of 3D Features in GPIR Images

NASA Technical Reports Server (NTRS)

Park, Han; Stough, Timothy; Fijany, Amir

2007-01-01

A method of automated recognition of three-dimensional (3D) features in images generated by ground-penetrating imaging radar (GPIR) is undergoing development. GPIR 3D images can be analyzed to detect and identify such subsurface features as pipes and other utility conduits. Until now, much of the analysis of GPIR images has been performed manually by expert operators who must visually identify and track each feature. The present method is intended to satisfy a need for more efficient and accurate analysis by means of algorithms that can automatically identify and track subsurface features, with minimal supervision by human operators. In this method, data from multiple sources (for example, data on different features extracted by different algorithms) are fused together for identifying subsurface objects. The algorithms of this method can be classified in several different ways. In one classification, the algorithms fall into three classes: (1) image-processing algorithms, (2) feature- extraction algorithms, and (3) a multiaxis data-fusion/pattern-recognition algorithm that includes a combination of machine-learning, pattern-recognition, and object-linking algorithms. The image-processing class includes preprocessing algorithms for reducing noise and enhancing target features for pattern recognition. The feature-extraction algorithms operate on preprocessed data to extract such specific features in images as two-dimensional (2D) slices of a pipe. Then the multiaxis data-fusion/ pattern-recognition algorithm identifies, classifies, and reconstructs 3D objects from the extracted features. In this process, multiple 2D features extracted by use of different algorithms and representing views along different directions are used to identify and reconstruct 3D objects. In object linking, which is an essential part of this process, features identified in successive 2D slices and located within a threshold radius of identical features in adjacent slices are linked in a directed-graph data structure. Relative to past approaches, this multiaxis approach offers the advantages of more reliable detections, better discrimination of objects, and provision of redundant information, which can be helpful in filling gaps in feature recognition by one of the component algorithms. The image-processing class also includes postprocessing algorithms that enhance identified features to prepare them for further scrutiny by human analysts (see figure). Enhancement of images as a postprocessing step is a significant departure from traditional practice, in which enhancement of images is a preprocessing step.
Case study of 3D fingerprints applications

PubMed Central

Liu, Feng; Liang, Jinrong; Shen, Linlin; Yang, Meng; Zhang, David; Lai, Zhihui

2017-01-01

Human fingers are 3D objects. More information will be provided if three dimensional (3D) fingerprints are available compared with two dimensional (2D) fingerprints. Thus, this paper firstly collected 3D finger point cloud data by Structured-light Illumination method. Additional features from 3D fingerprint images are then studied and extracted. The applications of these features are finally discussed. A series of experiments are conducted to demonstrate the helpfulness of 3D information to fingerprint recognition. Results show that a quick alignment can be easily implemented under the guidance of 3D finger shape feature even though this feature does not work for fingerprint recognition directly. The newly defined distinctive 3D shape ridge feature can be used for personal authentication with Equal Error Rate (EER) of ~8.3%. Also, it is helpful to remove false core point. Furthermore, a promising of EER ~1.3% is realized by combining this feature with 2D features for fingerprint recognition which indicates the prospect of 3D fingerprint recognition. PMID:28399141
Case study of 3D fingerprints applications.

PubMed

Liu, Feng; Liang, Jinrong; Shen, Linlin; Yang, Meng; Zhang, David; Lai, Zhihui

2017-01-01

Human fingers are 3D objects. More information will be provided if three dimensional (3D) fingerprints are available compared with two dimensional (2D) fingerprints. Thus, this paper firstly collected 3D finger point cloud data by Structured-light Illumination method. Additional features from 3D fingerprint images are then studied and extracted. The applications of these features are finally discussed. A series of experiments are conducted to demonstrate the helpfulness of 3D information to fingerprint recognition. Results show that a quick alignment can be easily implemented under the guidance of 3D finger shape feature even though this feature does not work for fingerprint recognition directly. The newly defined distinctive 3D shape ridge feature can be used for personal authentication with Equal Error Rate (EER) of ~8.3%. Also, it is helpful to remove false core point. Furthermore, a promising of EER ~1.3% is realized by combining this feature with 2D features for fingerprint recognition which indicates the prospect of 3D fingerprint recognition.

Integrated Low-Rank-Based Discriminative Feature Learning for Recognition.

PubMed

Zhou, Pan; Lin, Zhouchen; Zhang, Chao

2016-05-01

Feature learning plays a central role in pattern recognition. In recent years, many representation-based feature learning methods have been proposed and have achieved great success in many applications. However, these methods perform feature learning and subsequent classification in two separate steps, which may not be optimal for recognition tasks. In this paper, we present a supervised low-rank-based approach for learning discriminative features. By integrating latent low-rank representation (LatLRR) with a ridge regression-based classifier, our approach combines feature learning with classification, so that the regulated classification error is minimized. In this way, the extracted features are more discriminative for the recognition tasks. Our approach benefits from a recent discovery on the closed-form solutions to noiseless LatLRR. When there is noise, a robust Principal Component Analysis (PCA)-based denoising step can be added as preprocessing. When the scale of a problem is large, we utilize a fast randomized algorithm to speed up the computation of robust PCA. Extensive experimental results demonstrate the effectiveness and robustness of our method.
Joint Feature Extraction and Classifier Design for ECG-Based Biometric Recognition.

PubMed

Gutta, Sandeep; Cheng, Qi

2016-03-01

Traditional biometric recognition systems often utilize physiological traits such as fingerprint, face, iris, etc. Recent years have seen a growing interest in electrocardiogram (ECG)-based biometric recognition techniques, especially in the field of clinical medicine. In existing ECG-based biometric recognition methods, feature extraction and classifier design are usually performed separately. In this paper, a multitask learning approach is proposed, in which feature extraction and classifier design are carried out simultaneously. Weights are assigned to the features within the kernel of each task. We decompose the matrix consisting of all the feature weights into sparse and low-rank components. The sparse component determines the features that are relevant to identify each individual, and the low-rank component determines the common feature subspace that is relevant to identify all the subjects. A fast optimization algorithm is developed, which requires only the first-order information. The performance of the proposed approach is demonstrated through experiments using the MIT-BIH Normal Sinus Rhythm database.
Facial expression recognition based on improved local ternary pattern and stacked auto-encoder

NASA Astrophysics Data System (ADS)

Wu, Yao; Qiu, Weigen

2017-08-01

In order to enhance the robustness of facial expression recognition, we propose a method of facial expression recognition based on improved Local Ternary Pattern (LTP) combined with Stacked Auto-Encoder (SAE). This method uses the improved LTP extraction feature, and then uses the improved depth belief network as the detector and classifier to extract the LTP feature. The combination of LTP and improved deep belief network is realized in facial expression recognition. The recognition rate on CK+ databases has improved significantly.
YADCLAN: yet another digitally-controlled linear artificial neuron.

PubMed

Frenger, Paul

2003-01-01

This paper updates the author's 1999 RMBS presentation on digitally controlled linear artificial neuron design. Each neuron is based on a standard operational amplifier having excitatory and inhibitory inputs, variable gain, an amplified linear analog output and an adjustable threshold comparator for digital output. This design employs a 1-wire serial network of digitally controlled potentiometers and resistors whose resistance values are set and read back under microprocessor supervision. This system embodies several unique and useful features, including: enhanced neuronal stability, dynamic reconfigurability and network extensibility. This artificial neuronal is being employed for feature extraction and pattern recognition in an advanced robotic application.
Receptive fields selection for binary feature description.

PubMed

Fan, Bin; Kong, Qingqun; Trzcinski, Tomasz; Wang, Zhiheng; Pan, Chunhong; Fua, Pascal

2014-06-01

Feature description for local image patch is widely used in computer vision. While the conventional way to design local descriptor is based on expert experience and knowledge, learning-based methods for designing local descriptor become more and more popular because of their good performance and data-driven property. This paper proposes a novel data-driven method for designing binary feature descriptor, which we call receptive fields descriptor (RFD). Technically, RFD is constructed by thresholding responses of a set of receptive fields, which are selected from a large number of candidates according to their distinctiveness and correlations in a greedy way. Using two different kinds of receptive fields (namely rectangular pooling area and Gaussian pooling area) for selection, we obtain two binary descriptors RFDR and RFDG .accordingly. Image matching experiments on the well-known patch data set and Oxford data set demonstrate that RFD significantly outperforms the state-of-the-art binary descriptors, and is comparable with the best float-valued descriptors at a fraction of processing time. Finally, experiments on object recognition tasks confirm that both RFDR and RFDG successfully bridge the performance gap between binary descriptors and their floating-point competitors.
Robust Radar Emitter Recognition Based on the Three-Dimensional Distribution Feature and Transfer Learning

PubMed Central

Yang, Zhutian; Qiu, Wei; Sun, Hongjian; Nallanathan, Arumugam

2016-01-01

Due to the increasing complexity of electromagnetic signals, there exists a significant challenge for radar emitter signal recognition. To address this challenge, multi-component radar emitter recognition under a complicated noise environment is studied in this paper. A novel radar emitter recognition approach based on the three-dimensional distribution feature and transfer learning is proposed. The cubic feature for the time-frequency-energy distribution is proposed to describe the intra-pulse modulation information of radar emitters. Furthermore, the feature is reconstructed by using transfer learning in order to obtain the robust feature against signal noise rate (SNR) variation. Last, but not the least, the relevance vector machine is used to classify radar emitter signals. Simulations demonstrate that the approach proposed in this paper has better performances in accuracy and robustness than existing approaches. PMID:26927111
Robust Radar Emitter Recognition Based on the Three-Dimensional Distribution Feature and Transfer Learning.

PubMed

Yang, Zhutian; Qiu, Wei; Sun, Hongjian; Nallanathan, Arumugam

2016-02-25

Due to the increasing complexity of electromagnetic signals, there exists a significant challenge for radar emitter signal recognition. To address this challenge, multi-component radar emitter recognition under a complicated noise environment is studied in this paper. A novel radar emitter recognition approach based on the three-dimensional distribution feature and transfer learning is proposed. The cubic feature for the time-frequency-energy distribution is proposed to describe the intra-pulse modulation information of radar emitters. Furthermore, the feature is reconstructed by using transfer learning in order to obtain the robust feature against signal noise rate (SNR) variation. Last, but not the least, the relevance vector machine is used to classify radar emitter signals. Simulations demonstrate that the approach proposed in this paper has better performances in accuracy and robustness than existing approaches.
Arabic OCR: toward a complete system

NASA Astrophysics Data System (ADS)

El-Bialy, Ahmed M.; Kandil, Ahmed H.; Hashish, Mohamed; Yamany, Sameh M.

1999-12-01

Latin and Chinese OCR systems have been studied extensively in the literature. Yet little work was performed for Arabic character recognition. This is due to the technical challenges found in the Arabic text. Due to its cursive nature, a powerful and stable text segmentation is needed. Also; features capturing the characteristics of the rich Arabic character representation are needed to build the Arabic OCR. In this paper a novel segmentation technique which is font and size independent is introduced. This technique can segment the cursive written text line even if the line suffers from small skewness. The technique is not sensitive to the location of the centerline of the text line and can segment different font sizes and type (for different character sets) occurring on the same line. Features extraction is considered one of the most important phases of the text reading system. Ideally, the features extracted from a character image should capture the essential characteristics of this character that are independent of the font type and size. In such ideal case, the classifier stores a single prototype per character. However, it is practically challenging to find such ideal set of features. In this paper, a set of features that reflect the topological aspects of Arabia characters is proposed. These proposed features integrated with a topological matching technique introduce an Arabic text reading system that is semi Omni.
Transfer Learning for Improved Audio-Based Human Activity Recognition.

PubMed

Ntalampiras, Stavros; Potamitis, Ilyas

2018-06-25

Human activities are accompanied by characteristic sound events, the processing of which might provide valuable information for automated human activity recognition. This paper presents a novel approach addressing the case where one or more human activities are associated with limited audio data, resulting in a potentially highly imbalanced dataset. Data augmentation is based on transfer learning; more specifically, the proposed method: (a) identifies the classes which are statistically close to the ones associated with limited data; (b) learns a multiple input, multiple output transformation; and (c) transforms the data of the closest classes so that it can be used for modeling the ones associated with limited data. Furthermore, the proposed framework includes a feature set extracted out of signal representations of diverse domains, i.e., temporal, spectral, and wavelet. Extensive experiments demonstrate the relevance of the proposed data augmentation approach under a variety of generative recognition schemes.
Autoregressive statistical pattern recognition algorithms for damage detection in civil structures

NASA Astrophysics Data System (ADS)

Yao, Ruigen; Pakzad, Shamim N.

2012-08-01

Statistical pattern recognition has recently emerged as a promising set of complementary methods to system identification for automatic structural damage assessment. Its essence is to use well-known concepts in statistics for boundary definition of different pattern classes, such as those for damaged and undamaged structures. In this paper, several statistical pattern recognition algorithms using autoregressive models, including statistical control charts and hypothesis testing, are reviewed as potentially competitive damage detection techniques. To enhance the performance of statistical methods, new feature extraction techniques using model spectra and residual autocorrelation, together with resampling-based threshold construction methods, are proposed. Subsequently, simulated acceleration data from a multi degree-of-freedom system is generated to test and compare the efficiency of the existing and proposed algorithms. Data from laboratory experiments conducted on a truss and a large-scale bridge slab model are then used to further validate the damage detection methods and demonstrate the superior performance of proposed algorithms.
Optimal wavelength band clustering for multispectral iris recognition.

PubMed

Gong, Yazhuo; Zhang, David; Shi, Pengfei; Yan, Jingqi

2012-07-01

This work explores the possibility of clustering spectral wavelengths based on the maximum dissimilarity of iris textures. The eventual goal is to determine how many bands of spectral wavelengths will be enough for iris multispectral fusion and to find these bands that will provide higher performance of iris multispectral recognition. A multispectral acquisition system was first designed for imaging the iris at narrow spectral bands in the range of 420 to 940 nm. Next, a set of 60 human iris images that correspond to the right and left eyes of 30 different subjects were acquired for an analysis. Finally, we determined that 3 clusters were enough to represent the 10 feature bands of spectral wavelengths using the agglomerative clustering based on two-dimensional principal component analysis. The experimental results suggest (1) the number, center, and composition of clusters of spectral wavelengths and (2) the higher performance of iris multispectral recognition based on a three wavelengths-bands fusion.
Deep learning with word embeddings improves biomedical named entity recognition.

PubMed

Habibi, Maryam; Weber, Leon; Neves, Mariana; Wiegandt, David Luis; Leser, Ulf

2017-07-15

Text mining has become an important tool for biomedical research. The most fundamental text-mining task is the recognition of biomedical named entities (NER), such as genes, chemicals and diseases. Current NER methods rely on pre-defined features which try to capture the specific surface properties of entity types, properties of the typical local context, background knowledge, and linguistic information. State-of-the-art tools are entity-specific, as dictionaries and empirically optimal feature sets differ between entity types, which makes their development costly. Furthermore, features are often optimized for a specific gold standard corpus, which makes extrapolation of quality measures difficult. We show that a completely generic method based on deep learning and statistical word embeddings [called long short-term memory network-conditional random field (LSTM-CRF)] outperforms state-of-the-art entity-specific NER tools, and often by a large margin. To this end, we compared the performance of LSTM-CRF on 33 data sets covering five different entity classes with that of best-of-class NER tools and an entity-agnostic CRF implementation. On average, F1-score of LSTM-CRF is 5% above that of the baselines, mostly due to a sharp increase in recall. The source code for LSTM-CRF is available at https://github.com/glample/tagger and the links to the corpora are available at https://corposaurus.github.io/corpora/ . habibima@informatik.hu-berlin.de. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Deep learning with word embeddings improves biomedical named entity recognition

PubMed Central

Habibi, Maryam; Weber, Leon; Neves, Mariana; Wiegandt, David Luis; Leser, Ulf

2017-01-01

Abstract Motivation: Text mining has become an important tool for biomedical research. The most fundamental text-mining task is the recognition of biomedical named entities (NER), such as genes, chemicals and diseases. Current NER methods rely on pre-defined features which try to capture the specific surface properties of entity types, properties of the typical local context, background knowledge, and linguistic information. State-of-the-art tools are entity-specific, as dictionaries and empirically optimal feature sets differ between entity types, which makes their development costly. Furthermore, features are often optimized for a specific gold standard corpus, which makes extrapolation of quality measures difficult. Results: We show that a completely generic method based on deep learning and statistical word embeddings [called long short-term memory network-conditional random field (LSTM-CRF)] outperforms state-of-the-art entity-specific NER tools, and often by a large margin. To this end, we compared the performance of LSTM-CRF on 33 data sets covering five different entity classes with that of best-of-class NER tools and an entity-agnostic CRF implementation. On average, F1-score of LSTM-CRF is 5% above that of the baselines, mostly due to a sharp increase in recall. Availability and implementation: The source code for LSTM-CRF is available at https://github.com/glample/tagger and the links to the corpora are available at https://corposaurus.github.io/corpora/. Contact: habibima@informatik.hu-berlin.de PMID:28881963
Research on gesture recognition of augmented reality maintenance guiding system based on improved SVM

NASA Astrophysics Data System (ADS)

Zhao, Shouwei; Zhang, Yong; Zhou, Bin; Ma, Dongxi

2014-09-01

Interaction is one of the key techniques of augmented reality (AR) maintenance guiding system. Because of the complexity of the maintenance guiding system's image background and the high dimensionality of gesture characteristics, the whole process of gesture recognition can be divided into three stages which are gesture segmentation, gesture characteristic feature modeling and trick recognition. In segmentation stage, for solving the misrecognition of skin-like region, a segmentation algorithm combing background mode and skin color to preclude some skin-like regions is adopted. In gesture characteristic feature modeling of image attributes stage, plenty of characteristic features are analyzed and acquired, such as structure characteristics, Hu invariant moments features and Fourier descriptor. In trick recognition stage, a classifier based on Support Vector Machine (SVM) is introduced into the augmented reality maintenance guiding process. SVM is a novel learning method based on statistical learning theory, processing academic foundation and excellent learning ability, having a lot of issues in machine learning area and special advantages in dealing with small samples, non-linear pattern recognition at high dimension. The gesture recognition of augmented reality maintenance guiding system is realized by SVM after the granulation of all the characteristic features. The experimental results of the simulation of number gesture recognition and its application in augmented reality maintenance guiding system show that the real-time performance and robustness of gesture recognition of AR maintenance guiding system can be greatly enhanced by improved SVM.
From scores to face templates: a model-based approach.

PubMed

Mohanty, Pranab; Sarkar, Sudeep; Kasturi, Rangachar

2007-12-01

Regeneration of templates from match scores has security and privacy implications related to any biometric authentication system. We propose a novel paradigm to reconstruct face templates from match scores using a linear approach. It proceeds by first modeling the behavior of the given face recognition algorithm by an affine transformation. The goal of the modeling is to approximate the distances computed by a face recognition algorithm between two faces by distances between points, representing these faces, in an affine space. Given this space, templates from an independent image set (break-in) are matched only once with the enrolled template of the targeted subject and match scores are recorded. These scores are then used to embed the targeted subject in the approximating affine (non-orthogonal) space. Given the coordinates of the targeted subject in the affine space, the original template of the targeted subject is reconstructed using the inverse of the affine transformation. We demonstrate our ideas using three, fundamentally different, face recognition algorithms: Principal Component Analysis (PCA) with Mahalanobis cosine distance measure, Bayesian intra-extrapersonal classifier (BIC), and a feature-based commercial algorithm. To demonstrate the independence of the break-in set with the gallery set, we select face templates from two different databases: Face Recognition Grand Challenge (FRGC) and Facial Recognition Technology (FERET) Database (FERET). With an operational point set at 1 percent False Acceptance Rate (FAR) and 99 percent True Acceptance Rate (TAR) for 1,196 enrollments (FERET gallery), we show that at most 600 attempts (score computations) are required to achieve a 73 percent chance of breaking in as a randomly chosen target subject for the commercial face recognition system. With similar operational set up, we achieve a 72 percent and 100 percent chance of breaking in for the Bayesian and PCA based face recognition systems, respectively. With three different levels of score quantization, we achieve 69 percent, 68 percent and 49 percent probability of break-in, indicating the robustness of our proposed scheme to score quantization. We also show that the proposed reconstruction scheme has 47 percent more probability of breaking in as a randomly chosen target subject for the commercial system as compared to a hill climbing approach with the same number of attempts. Given that the proposed template reconstruction method uses distinct face templates to reconstruct faces, this work exposes a more severe form of vulnerability than a hill climbing kind of attack where incrementally different versions of the same face are used. Also, the ability of the proposed approach to reconstruct actual face templates of the users increases privacy concerns in biometric systems.
Gestural cue analysis in automated semantic miscommunication annotation

PubMed Central

Inoue, Masashi; Ogihara, Mitsunori; Hanada, Ryoko; Furuyama, Nobuhiro

2011-01-01

The automated annotation of conversational video by semantic miscommunication labels is a challenging topic. Although miscommunications are often obvious to the speakers as well as the observers, it is difficult for machines to detect them from the low-level features. We investigate the utility of gestural cues in this paper among various non-verbal features. Compared with gesture recognition tasks in human-computer interaction, this process is difficult due to the lack of understanding on which cues contribute to miscommunications and the implicitness of gestures. Nine simple gestural features are taken from gesture data, and both simple and complex classifiers are constructed using machine learning. The experimental results suggest that there is no single gestural feature that can predict or explain the occurrence of semantic miscommunication in our setting. PMID:23585724
Postprocessing for character recognition using pattern features and linguistic information

NASA Astrophysics Data System (ADS)

Yoshikawa, Takatoshi; Okamoto, Masayosi; Horii, Hiroshi

1993-04-01

We propose a new method of post-processing for character recognition using pattern features and linguistic information. This method corrects errors in the recognition of handwritten Japanese sentences containing Kanji characters. This post-process method is characterized by having two types of character recognition. Improving the accuracy of the character recognition rate of Japanese characters is made difficult by the large number of characters, and the existence of characters with similar patterns. Therefore, it is not practical for a character recognition system to recognize all characters in detail. First, this post-processing method generates a candidate character table by recognizing the simplest features of characters. Then, it selects words corresponding to the character from the candidate character table by referring to a word and grammar dictionary before selecting suitable words. If the correct character is included in the candidate character table, this process can correct an error, however, if the character is not included, it cannot correct an error. Therefore, if this method can presume a character does not exist in a candidate character table by using linguistic information (word and grammar dictionary). It then can verify a presumed character by character recognition using complex features. When this method is applied to an online character recognition system, the accuracy of character recognition improves 93.5% to 94.7%. This proved to be the case when it was used for the editorials of a Japanese newspaper (Asahi Shinbun).
Configural and Featural Face Processing Influences on Emotion Recognition in Schizophrenia and Bipolar Disorder.

PubMed

Van Rheenen, Tamsyn E; Joshua, Nicole; Castle, David J; Rossell, Susan L

2017-03-01

Emotion recognition impairments have been demonstrated in schizophrenia (Sz), but are less consistent and lesser in magnitude in bipolar disorder (BD). This may be related to the extent to which different face processing strategies are engaged during emotion recognition in each of these disorders. We recently showed that Sz patients had impairments in the use of both featural and configural face processing strategies, whereas BD patients were impaired only in the use of the latter. Here we examine the influence that these impairments have on facial emotion recognition in these cohorts. Twenty-eight individuals with Sz, 28 individuals with BD, and 28 healthy controls completed a facial emotion labeling task with two conditions designed to separate the use of featural and configural face processing strategies; part-based and whole-face emotion recognition. Sz patients performed worse than controls on both conditions, and worse than BD patients on the whole-face condition. BD patients performed worse than controls on the whole-face condition only. Configural processing deficits appear to influence the recognition of facial emotions in BD, whereas both configural and featural processing abnormalities impair emotion recognition in Sz. This may explain discrepancies in the profiles of emotion recognition between the disorders. (JINS, 2017, 23, 287-291).
Background feature descriptor for offline handwritten numeral recognition

NASA Astrophysics Data System (ADS)

Ming, Delie; Wang, Hao; Tian, Tian; Jie, Feiran; Lei, Bo

2011-11-01

This paper puts forward an offline handwritten numeral recognition method based on background structural descriptor (sixteen-value numerical background expression). Through encoding the background pixels in the image according to a certain rule, 16 different eigenvalues were generated, which reflected the background condition of every digit, then reflected the structural features of the digits. Through pattern language description of images by these features, automatic segmentation of overlapping digits and numeral recognition can be realized. This method is characterized by great deformation resistant ability, high recognition speed and easy realization. Finally, the experimental results and conclusions are presented. The experimental results of recognizing datasets from various practical application fields reflect that with this method, a good recognition effect can be achieved.
Facial Emotions Recognition using Gabor Transform and Facial Animation Parameters with Neural Networks

NASA Astrophysics Data System (ADS)

Harit, Aditya; Joshi, J. C., Col; Gupta, K. K.

2018-03-01

The paper proposed an automatic facial emotion recognition algorithm which comprises of two main components: feature extraction and expression recognition. The algorithm uses a Gabor filter bank on fiducial points to find the facial expression features. The resulting magnitudes of Gabor transforms, along with 14 chosen FAPs (Facial Animation Parameters), compose the feature space. There are two stages: the training phase and the recognition phase. Firstly, for the present 6 different emotions, the system classifies all training expressions in 6 different classes (one for each emotion) in the training stage. In the recognition phase, it recognizes the emotion by applying the Gabor bank to a face image, then finds the fiducial points, and then feeds it to the trained neural architecture.

Blood perfusion construction for infrared face recognition based on bio-heat transfer.

PubMed

Xie, Zhihua; Liu, Guodong

2014-01-01

To improve the performance of infrared face recognition for time-lapse data, a new construction of blood perfusion is proposed based on bio-heat transfer. Firstly, by quantifying the blood perfusion based on Pennes equation, the thermal information is converted into blood perfusion rate, which is stable facial biological feature of face image. Then, the separability discriminant criterion in Discrete Cosine Transform (DCT) domain is applied to extract the discriminative features of blood perfusion information. Experimental results demonstrate that the features of blood perfusion are more concentrative and discriminative for recognition than those of thermal information. The infrared face recognition based on the proposed blood perfusion is robust and can achieve better recognition performance compared with other state-of-the-art approaches.
Coding of visual object features and feature conjunctions in the human brain.

PubMed

Martinovic, Jasna; Gruber, Thomas; Müller, Matthias M

2008-01-01

Object recognition is achieved through neural mechanisms reliant on the activity of distributed coordinated neural assemblies. In the initial steps of this process, an object's features are thought to be coded very rapidly in distinct neural assemblies. These features play different functional roles in the recognition process--while colour facilitates recognition, additional contours and edges delay it. Here, we selectively varied the amount and role of object features in an entry-level categorization paradigm and related them to the electrical activity of the human brain. We found that early synchronizations (approx. 100 ms) increased quantitatively when more image features had to be coded, without reflecting their qualitative contribution to the recognition process. Later activity (approx. 200-400 ms) was modulated by the representational role of object features. These findings demonstrate that although early synchronizations may be sufficient for relatively crude discrimination of objects in visual scenes, they cannot support entry-level categorization. This was subserved by later processes of object model selection, which utilized the representational value of object features such as colour or edges to select the appropriate model and achieve identification.
Autonomous target recognition using remotely sensed surface vibration measurements

NASA Astrophysics Data System (ADS)

Geurts, James; Ruck, Dennis W.; Rogers, Steven K.; Oxley, Mark E.; Barr, Dallas N.

1993-09-01

The remotely measured surface vibration signatures of tactical military ground vehicles are investigated for use in target classification and identification friend or foe (IFF) systems. The use of remote surface vibration sensing by a laser radar reduces the effects of partial occlusion, concealment, and camouflage experienced by automatic target recognition systems using traditional imagery in a tactical battlefield environment. Linear Predictive Coding (LPC) efficiently represents the vibration signatures and nearest neighbor classifiers exploit the LPC feature set using a variety of distortion metrics. Nearest neighbor classifiers achieve an 88 percent classification rate in an eight class problem, representing a classification performance increase of thirty percent from previous efforts. A novel confidence figure of merit is implemented to attain a 100 percent classification rate with less than 60 percent rejection. The high classification rates are achieved on a target set which would pose significant problems to traditional image-based recognition systems. The targets are presented to the sensor in a variety of aspects and engine speeds at a range of 1 kilometer. The classification rates achieved demonstrate the benefits of using remote vibration measurement in a ground IFF system. The signature modeling and classification system can also be used to identify rotary and fixed-wing targets.
A hierarchical classification method for finger knuckle print recognition

NASA Astrophysics Data System (ADS)

Kong, Tao; Yang, Gongping; Yang, Lu

2014-12-01

Finger knuckle print has recently been seen as an effective biometric technique. In this paper, we propose a hierarchical classification method for finger knuckle print recognition, which is rooted in traditional score-level fusion methods. In the proposed method, we firstly take Gabor feature as the basic feature for finger knuckle print recognition and then a new decision rule is defined based on the predefined threshold. Finally, the minor feature speeded-up robust feature is conducted for these users, who cannot be recognized by the basic feature. Extensive experiments are performed to evaluate the proposed method, and experimental results show that it can achieve a promising performance.
Morphological self-organizing feature map neural network with applications to automatic target recognition

NASA Astrophysics Data System (ADS)

Zhang, Shijun; Jing, Zhongliang; Li, Jianxun

2005-01-01

The rotation invariant feature of the target is obtained using the multi-direction feature extraction property of the steerable filter. Combining the morphological operation top-hat transform with the self-organizing feature map neural network, the adaptive topological region is selected. Using the erosion operation, the topological region shrinkage is achieved. The steerable filter based morphological self-organizing feature map neural network is applied to automatic target recognition of binary standard patterns and real-world infrared sequence images. Compared with Hamming network and morphological shared-weight networks respectively, the higher recognition correct rate, robust adaptability, quick training, and better generalization of the proposed method are achieved.
Recognition algorithm for assisting ovarian cancer diagnosis from coregistered ultrasound and photoacoustic images: ex vivo study

NASA Astrophysics Data System (ADS)

Alqasemi, Umar; Kumavor, Patrick; Aguirre, Andres; Zhu, Quing

2012-12-01

Unique features and the underlining hypotheses of how these features may relate to the tumor physiology in coregistered ultrasound and photoacoustic images of ex vivo ovarian tissue are introduced. The images were first compressed with wavelet transform. The mean Radon transform of photoacoustic images was then computed and fitted with a Gaussian function to find the centroid of a suspicious area for shift-invariant recognition process. Twenty-four features were extracted from a training set by several methods, including Fourier transform, image statistics, and different composite filters. The features were chosen from more than 400 training images obtained from 33 ex vivo ovaries of 24 patients, and used to train three classifiers, including generalized linear model, neural network, and support vector machine (SVM). The SVM achieved the best training performance and was able to exclusively separate cancerous from non-cancerous cases with 100% sensitivity and specificity. At the end, the classifiers were used to test 95 new images obtained from 37 ovaries of 20 additional patients. The SVM classifier achieved 76.92% sensitivity and 95.12% specificity. Furthermore, if we assume that recognizing one image as a cancer is sufficient to consider an ovary as malignant, the SVM classifier achieves 100% sensitivity and 87.88% specificity.
Word-level recognition of multifont Arabic text using a feature vector matching approach

NASA Astrophysics Data System (ADS)

Erlandson, Erik J.; Trenkle, John M.; Vogt, Robert C., III

1996-03-01

Many text recognition systems recognize text imagery at the character level and assemble words from the recognized characters. An alternative approach is to recognize text imagery at the word level, without analyzing individual characters. This approach avoids the problem of individual character segmentation, and can overcome local errors in character recognition. A word-level recognition system for machine-printed Arabic text has been implemented. Arabic is a script language, and is therefore difficult to segment at the character level. Character segmentation has been avoided by recognizing text imagery of complete words. The Arabic recognition system computes a vector of image-morphological features on a query word image. This vector is matched against a precomputed database of vectors from a lexicon of Arabic words. Vectors from the database with the highest match score are returned as hypotheses for the unknown image. Several feature vectors may be stored for each word in the database. Database feature vectors generated using multiple fonts and noise models allow the system to be tuned to its input stream. Used in conjunction with database pruning techniques, this Arabic recognition system has obtained promising word recognition rates on low-quality multifont text imagery.
Pattern recognition and image processing for environmental monitoring

NASA Astrophysics Data System (ADS)

Siddiqui, Khalid J.; Eastwood, DeLyle

1999-12-01

Pattern recognition (PR) and signal/image processing methods are among the most powerful tools currently available for noninvasively examining spectroscopic and other chemical data for environmental monitoring. Using spectral data, these systems have found a variety of applications employing analytical techniques for chemometrics such as gas chromatography, fluorescence spectroscopy, etc. An advantage of PR approaches is that they make no a prior assumption regarding the structure of the patterns. However, a majority of these systems rely on human judgment for parameter selection and classification. A PR problem is considered as a composite of four subproblems: pattern acquisition, feature extraction, feature selection, and pattern classification. One of the basic issues in PR approaches is to determine and measure the features useful for successful classification. Selection of features that contain the most discriminatory information is important because the cost of pattern classification is directly related to the number of features used in the decision rules. The state of the spectral techniques as applied to environmental monitoring is reviewed. A spectral pattern classification system combining the above components and automatic decision-theoretic approaches for classification is developed. It is shown how such a system can be used for analysis of large data sets, warehousing, and interpretation. In a preliminary test, the classifier was used to classify synchronous UV-vis fluorescence spectra of relatively similar petroleum oils with reasonable success.
Improved facial affect recognition in schizophrenia following an emotion intervention, but not training attention-to-facial-features or treatment-as-usual.

PubMed

Tsotsi, Stella; Kosmidis, Mary H; Bozikas, Vasilis P

2017-08-01

In schizophrenia, impaired facial affect recognition (FAR) has been associated with patients' overall social functioning. Interventions targeting attention or FAR per se have invariably yielded improved FAR performance in these patients. Here, we compared the effects of two interventions, one targeting FAR and one targeting attention-to-facial-features, with treatment-as-usual on patients' FAR performance. Thirty-nine outpatients with schizophrenia were randomly assigned to one of three groups: FAR intervention (training to recognize emotional information, conveyed by changes in facial features), attention-to-facial-features intervention (training to detect changes in facial features), and treatment-as-usual. Also, 24 healthy controls, matched for age and education, were assigned to one of the two interventions. Two FAR measurements, baseline and post-intervention, were conducted using an original experimental procedure with alternative sets of stimuli. We found improved FAR performance following the intervention targeting FAR in comparison to the other patient groups, which in fact was comparable to the pre-intervention performance of healthy controls in the corresponding intervention group. This improvement was more pronounced in recognizing fear. Our findings suggest that compared to interventions targeting attention, and treatment-as-usual, training programs targeting FAR can be more effective in improving FAR in patients with schizophrenia, particularly assisting them in perceiving threat-related information more accurately. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.
Improving protein fold recognition by extracting fold-specific features from predicted residue-residue contacts.

PubMed

Zhu, Jianwei; Zhang, Haicang; Li, Shuai Cheng; Wang, Chao; Kong, Lupeng; Sun, Shiwei; Zheng, Wei-Mou; Bu, Dongbo

2017-12-01

Accurate recognition of protein fold types is a key step for template-based prediction of protein structures. The existing approaches to fold recognition mainly exploit the features derived from alignments of query protein against templates. These approaches have been shown to be successful for fold recognition at family level, but usually failed at superfamily/fold levels. To overcome this limitation, one of the key points is to explore more structurally informative features of proteins. Although residue-residue contacts carry abundant structural information, how to thoroughly exploit these information for fold recognition still remains a challenge. In this study, we present an approach (called DeepFR) to improve fold recognition at superfamily/fold levels. The basic idea of our approach is to extract fold-specific features from predicted residue-residue contacts of proteins using deep convolutional neural network (DCNN) technique. Based on these fold-specific features, we calculated similarity between query protein and templates, and then assigned query protein with fold type of the most similar template. DCNN has showed excellent performance in image feature extraction and image recognition; the rational underlying the application of DCNN for fold recognition is that contact likelihood maps are essentially analogy to images, as they both display compositional hierarchy. Experimental results on the LINDAHL dataset suggest that even using the extracted fold-specific features alone, our approach achieved success rate comparable to the state-of-the-art approaches. When further combining these features with traditional alignment-related features, the success rate of our approach increased to 92.3%, 82.5% and 78.8% at family, superfamily and fold levels, respectively, which is about 18% higher than the state-of-the-art approach at fold level, 6% higher at superfamily level and 1% higher at family level. An independent assessment on SCOP_TEST dataset showed consistent performance improvement, indicating robustness of our approach. Furthermore, bi-clustering results of the extracted features are compatible with fold hierarchy of proteins, implying that these features are fold-specific. Together, these results suggest that the features extracted from predicted contacts are orthogonal to alignment-related features, and the combination of them could greatly facilitate fold recognition at superfamily/fold levels and template-based prediction of protein structures. Source code of DeepFR is freely available through https://github.com/zhujianwei31415/deepfr, and a web server is available through http://protein.ict.ac.cn/deepfr. zheng@itp.ac.cn or dbu@ict.ac.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Research on Palmprint Identification Method Based on Quantum Algorithms

PubMed Central

Zhang, Zhanzhan

2014-01-01

Quantum image recognition is a technology by using quantum algorithm to process the image information. It can obtain better effect than classical algorithm. In this paper, four different quantum algorithms are used in the three stages of palmprint recognition. First, quantum adaptive median filtering algorithm is presented in palmprint filtering processing. Quantum filtering algorithm can get a better filtering result than classical algorithm through the comparison. Next, quantum Fourier transform (QFT) is used to extract pattern features by only one operation due to quantum parallelism. The proposed algorithm exhibits an exponential speed-up compared with discrete Fourier transform in the feature extraction. Finally, quantum set operations and Grover algorithm are used in palmprint matching. According to the experimental results, quantum algorithm only needs to apply square of N operations to find out the target palmprint, but the traditional method needs N times of calculation. At the same time, the matching accuracy of quantum algorithm is almost 100%. PMID:25105165
Physical Activity Recognition with Mobile Phones: Challenges, Methods, and Applications

NASA Astrophysics Data System (ADS)

Yang, Jun; Lu, Hong; Liu, Zhigang; Boda, Péter Pál

In this book chapter, we present a novel system that recognizes and records the physical activity of a person using a mobile phone. The sensor data is collected by built-in accelerometer sensor that measures the motion intensity of the device. The system recognizes five everyday activities in real-time, i.e., stationary, walking, running, bicycling, and in vehicle. We first introduce the sensor's data format, sensor calibration, signal projection, feature extraction, and selection methods. Then we have a detailed discussion and comparison of different choices of feature sets and classifiers. The design and implementation of one prototype system is presented along with resource and performance benchmark on Nokia N95 platform. Results show high recognition accuracies for distinguishing the five activities. The last part of the chapter introduces one demo application built on top of our system, physical activity diary, and a selection of potential applications in mobile wellness, mobile social sharing and contextual user interface domains.
Atoms of recognition in human and computer vision.

PubMed

Ullman, Shimon; Assif, Liav; Fetaya, Ethan; Harari, Daniel

2016-03-08

Discovering the visual features and representations used by the brain to recognize objects is a central problem in the study of vision. Recently, neural network models of visual object recognition, including biological and deep network models, have shown remarkable progress and have begun to rival human performance in some challenging tasks. These models are trained on image examples and learn to extract features and representations and to use them for categorization. It remains unclear, however, whether the representations and learning processes discovered by current models are similar to those used by the human visual system. Here we show, by introducing and using minimal recognizable images, that the human visual system uses features and processes that are not used by current models and that are critical for recognition. We found by psychophysical studies that at the level of minimal recognizable images a minute change in the image can have a drastic effect on recognition, thus identifying features that are critical for the task. Simulations then showed that current models cannot explain this sensitivity to precise feature configurations and, more generally, do not learn to recognize minimal images at a human level. The role of the features shown here is revealed uniquely at the minimal level, where the contribution of each feature is essential. A full understanding of the learning and use of such features will extend our understanding of visual recognition and its cortical mechanisms and will enhance the capacity of computational models to learn from visual experience and to deal with recognition and detailed image interpretation.
Attention-Based Recurrent Temporal Restricted Boltzmann Machine for Radar High Resolution Range Profile Sequence Recognition.

PubMed

Zhang, Yifan; Gao, Xunzhang; Peng, Xuan; Ye, Jiaqi; Li, Xiang

2018-05-16

The High Resolution Range Profile (HRRP) recognition has attracted great concern in the field of Radar Automatic Target Recognition (RATR). However, traditional HRRP recognition methods failed to model high dimensional sequential data efficiently and have a poor anti-noise ability. To deal with these problems, a novel stochastic neural network model named Attention-based Recurrent Temporal Restricted Boltzmann Machine (ARTRBM) is proposed in this paper. RTRBM is utilized to extract discriminative features and the attention mechanism is adopted to select major features. RTRBM is efficient to model high dimensional HRRP sequences because it can extract the information of temporal and spatial correlation between adjacent HRRPs. The attention mechanism is used in sequential data recognition tasks including machine translation and relation classification, which makes the model pay more attention to the major features of recognition. Therefore, the combination of RTRBM and the attention mechanism makes our model effective for extracting more internal related features and choose the important parts of the extracted features. Additionally, the model performs well with the noise corrupted HRRP data. Experimental results on the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset show that our proposed model outperforms other traditional methods, which indicates that ARTRBM extracts, selects, and utilizes the correlation information between adjacent HRRPs effectively and is suitable for high dimensional data or noise corrupted data.
Toward open set recognition.

PubMed

Scheirer, Walter J; de Rezende Rocha, Anderson; Sapkota, Archana; Boult, Terrance E

2013-07-01

To date, almost all experimental evaluations of machine learning-based recognition algorithms in computer vision have taken the form of "closed set" recognition, whereby all testing classes are known at training time. A more realistic scenario for vision applications is "open set" recognition, where incomplete knowledge of the world is present at training time, and unknown classes can be submitted to an algorithm during testing. This paper explores the nature of open set recognition and formalizes its definition as a constrained minimization problem. The open set recognition problem is not well addressed by existing algorithms because it requires strong generalization. As a step toward a solution, we introduce a novel "1-vs-set machine," which sculpts a decision space from the marginal distances of a 1-class or binary SVM with a linear kernel. This methodology applies to several different applications in computer vision where open set recognition is a challenging problem, including object recognition and face verification. We consider both in this work, with large scale cross-dataset experiments performed over the Caltech 256 and ImageNet sets, as well as face matching experiments performed over the Labeled Faces in the Wild set. The experiments highlight the effectiveness of machines adapted for open set evaluation compared to existing 1-class and binary SVMs for the same tasks.
The Use of Fuzzy Set Classification for Pattern Recognition of the Polygraph

DTIC Science & Technology

1993-12-01

actual feature extraction was done, It was decided to use the K-nearest neighbor ( KNN ) the data was preprocessed. The electrocardiogram classifier in...showing heart pulse, and a low frequency not known beforehand, and the KNN classifier does not component showing blood volume. The derivative of...the characteristics of the conventional KNN these six derived signals were detrended and filtered, classification method is that it assigns each
Efficient visual information for unfamiliar face matching despite viewpoint variations: It's not in the eyes!

PubMed

Royer, Jessica; Blais, Caroline; Barnabé-Lortie, Vincent; Carré, Mélissa; Leclerc, Josiane; Fiset, Daniel

2016-06-01

Faces are encountered in highly diverse angles in real-world settings. Despite this considerable diversity, most individuals are able to easily recognize familiar faces. The vast majority of studies in the field of face recognition have nonetheless focused almost exclusively on frontal views of faces. Indeed, a number of authors have investigated the diagnostic facial features for the recognition of frontal views of faces previously encoded in this same view. However, the nature of the information useful for identity matching when the encoded face and test face differ in viewing angle remains mostly unexplored. The present study addresses this issue using individual differences and bubbles, a method that pinpoints the facial features effectively used in a visual categorization task. Our results indicate that the use of features located in the center of the face, the lower left portion of the nose area and the center of the mouth, are significantly associated with individual efficiency to generalize a face's identity across different viewpoints. However, as faces become more familiar, the reliance on this area decreases, while the diagnosticity of the eye region increases. This suggests that a certain distinction can be made between the visual mechanisms subtending viewpoint invariance and face recognition in the case of unfamiliar face identification. Our results further support the idea that the eye area may only come into play when the face stimulus is particularly familiar to the observer. Copyright © 2016 Elsevier Ltd. All rights reserved.
Investigation of Time Series Representations and Similarity Measures for Structural Damage Pattern Recognition

PubMed Central

Swartz, R. Andrew

2013-01-01

This paper investigates the time series representation methods and similarity measures for sensor data feature extraction and structural damage pattern recognition. Both model-based time series representation and dimensionality reduction methods are studied to compare the effectiveness of feature extraction for damage pattern recognition. The evaluation of feature extraction methods is performed by examining the separation of feature vectors among different damage patterns and the pattern recognition success rate. In addition, the impact of similarity measures on the pattern recognition success rate and the metrics for damage localization are also investigated. The test data used in this study are from the System Identification to Monitor Civil Engineering Structures (SIMCES) Z24 Bridge damage detection tests, a rigorous instrumentation campaign that recorded the dynamic performance of a concrete box-girder bridge under progressively increasing damage scenarios. A number of progressive damage test case datasets and damage test data with different damage modalities are used. The simulation results show that both time series representation methods and similarity measures have significant impact on the pattern recognition success rate. PMID:24191136
Robust kernel representation with statistical local features for face recognition.

PubMed

Yang, Meng; Zhang, Lei; Shiu, Simon Chi-Keung; Zhang, David

2013-06-01

Factors such as misalignment, pose variation, and occlusion make robust face recognition a difficult problem. It is known that statistical features such as local binary pattern are effective for local feature extraction, whereas the recently proposed sparse or collaborative representation-based classification has shown interesting results in robust face recognition. In this paper, we propose a novel robust kernel representation model with statistical local features (SLF) for robust face recognition. Initially, multipartition max pooling is used to enhance the invariance of SLF to image registration error. Then, a kernel-based representation model is proposed to fully exploit the discrimination information embedded in the SLF, and robust regression is adopted to effectively handle the occlusion in face images. Extensive experiments are conducted on benchmark face databases, including extended Yale B, AR (A. Martinez and R. Benavente), multiple pose, illumination, and expression (multi-PIE), facial recognition technology (FERET), face recognition grand challenge (FRGC), and labeled faces in the wild (LFW), which have different variations of lighting, expression, pose, and occlusions, demonstrating the promising performance of the proposed method.
Low-resolution expression recognition based on central oblique average CS-LBP with adaptive threshold

NASA Astrophysics Data System (ADS)

Han, Sheng; Xi, Shi-qiong; Geng, Wei-dong

2017-11-01

In order to solve the problem of low recognition rate of traditional feature extraction operators under low-resolution images, a novel algorithm of expression recognition is proposed, named central oblique average center-symmetric local binary pattern (CS-LBP) with adaptive threshold (ATCS-LBP). Firstly, the features of face images can be extracted by the proposed operator after pretreatment. Secondly, the obtained feature image is divided into blocks. Thirdly, the histogram of each block is computed independently and all histograms can be connected serially to create a final feature vector. Finally, expression classification is achieved by using support vector machine (SVM) classifier. Experimental results on Japanese female facial expression (JAFFE) database show that the proposed algorithm can achieve a recognition rate of 81.9% when the resolution is as low as 16×16, which is much better than that of the traditional feature extraction operators.

A System for Mailpiece ZIP Code Assignment through Contextual Analysis. Phase 2

DTIC Science & Technology

1991-03-01

Segmentation Address Block Interpretation Automatic Feature Generation Word Recognition Feature Detection Word Verification Optical Character Recognition Directory...in the Phase III effort. 1.1 Motivation The United States Postal Service (USPS) deploys large numbers of optical character recognition (OCR) machines...4):208-218, November 1986. [2] Gronmeyer, L. K., Ruffin, B. W., Lybanon, M. A., Neely, P. L., and Pierce, S. E. An Overview of Optical Character Recognition (OCR
Assessment of Homomorphic Analysis for Human Activity Recognition from Acceleration Signals.

PubMed

Vanrell, Sebastian Rodrigo; Milone, Diego Humberto; Rufiner, Hugo Leonardo

2017-07-03

Unobtrusive activity monitoring can provide valuable information for medical and sports applications. In recent years, human activity recognition has moved to wearable sensors to deal with unconstrained scenarios. Accelerometers are the preferred sensors due to their simplicity and availability. Previous studies have examined several \\azul{classic} techniques for extracting features from acceleration signals, including time-domain, time-frequency, frequency-domain, and other heuristic features. Spectral and temporal features are the preferred ones and they are generally computed from acceleration components, leaving the acceleration magnitude potential unexplored. In this study, based on homomorphic analysis, a new type of feature extraction stage is proposed in order to exploit discriminative activity information present in acceleration signals. Homomorphic analysis can isolate the information about whole body dynamics and translate it into a compact representation, called cepstral coefficients. Experiments have explored several configurations of the proposed features, including size of representation, signals to be used, and fusion with other features. Cepstral features computed from acceleration magnitude obtained one of the highest recognition rates. In addition, a beneficial contribution was found when time-domain and moving pace information was included in the feature vector. Overall, the proposed system achieved a recognition rate of 91.21% on the publicly available SCUT-NAA dataset. To the best of our knowledge, this is the highest recognition rate on this dataset.
Facial expression recognition based on improved deep belief networks

NASA Astrophysics Data System (ADS)

Wu, Yao; Qiu, Weigen

2017-08-01

In order to improve the robustness of facial expression recognition, a method of face expression recognition based on Local Binary Pattern (LBP) combined with improved deep belief networks (DBNs) is proposed. This method uses LBP to extract the feature, and then uses the improved deep belief networks as the detector and classifier to extract the LBP feature. The combination of LBP and improved deep belief networks is realized in facial expression recognition. In the JAFFE (Japanese Female Facial Expression) database on the recognition rate has improved significantly.
Coding visual features extracted from video sequences.

PubMed

Baroffio, Luca; Cesana, Matteo; Redondi, Alessandro; Tagliasacchi, Marco; Tubaro, Stefano

2014-05-01

Visual features are successfully exploited in several applications (e.g., visual search, object recognition and tracking, etc.) due to their ability to efficiently represent image content. Several visual analysis tasks require features to be transmitted over a bandwidth-limited network, thus calling for coding techniques to reduce the required bit budget, while attaining a target level of efficiency. In this paper, we propose, for the first time, a coding architecture designed for local features (e.g., SIFT, SURF) extracted from video sequences. To achieve high coding efficiency, we exploit both spatial and temporal redundancy by means of intraframe and interframe coding modes. In addition, we propose a coding mode decision based on rate-distortion optimization. The proposed coding scheme can be conveniently adopted to implement the analyze-then-compress (ATC) paradigm in the context of visual sensor networks. That is, sets of visual features are extracted from video frames, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast to the traditional compress-then-analyze (CTA) paradigm, in which video sequences acquired at a node are compressed and then sent to a central unit for further processing. In this paper, we compare these coding paradigms using metrics that are routinely adopted to evaluate the suitability of visual features in the context of content-based retrieval, object recognition, and tracking. Experimental results demonstrate that, thanks to the significant coding gains achieved by the proposed coding scheme, ATC outperforms CTA with respect to all evaluation metrics.
Hierarchical ensemble of global and local classifiers for face recognition.

PubMed

Su, Yu; Shan, Shiguang; Chen, Xilin; Gao, Wen

2009-08-01

In the literature of psychophysics and neurophysiology, many studies have shown that both global and local features are crucial for face representation and recognition. This paper proposes a novel face recognition method which exploits both global and local discriminative features. In this method, global features are extracted from the whole face images by keeping the low-frequency coefficients of Fourier transform, which we believe encodes the holistic facial information, such as facial contour. For local feature extraction, Gabor wavelets are exploited considering their biological relevance. After that, Fisher's linear discriminant (FLD) is separately applied to the global Fourier features and each local patch of Gabor features. Thus, multiple FLD classifiers are obtained, each embodying different facial evidences for face recognition. Finally, all these classifiers are combined to form a hierarchical ensemble classifier. We evaluate the proposed method using two large-scale face databases: FERET and FRGC version 2.0. Experiments show that the results of our method are impressively better than the best known results with the same evaluation protocol.
A state-based approach to trend recognition and failure prediction for the Space Station Freedom

NASA Technical Reports Server (NTRS)

Nelson, Kyle S.; Hadden, George D.

1992-01-01

A state-based reasoning approach to trend recognition and failure prediction for the Altitude Determination, and Control System (ADCS) of the Space Station Freedom (SSF) is described. The problem domain is characterized by features (e.g., trends and impending failures) that develop over a variety of time spans, anywhere from several minutes to several years. Our state-based reasoning approach, coupled with intelligent data screening, allows features to be tracked as they develop in a time-dependent manner. That is, each state machine has the ability to encode a time frame for the feature it detects. As features are detected, they are recorded and can be used as input to other state machines, creating a hierarchical feature recognition scheme. Furthermore, each machine can operate independently of the others, allowing simultaneous tracking of features. State-based reasoning was implemented in the trend recognition and the prognostic modules of a prototype Space Station Freedom Maintenance and Diagnostic System (SSFMDS) developed at Honeywell's Systems and Research Center.
Improving Protein Fold Recognition by Deep Learning Networks.

PubMed

Jo, Taeho; Hou, Jie; Eickholt, Jesse; Cheng, Jianlin

2015-12-04

For accurate recognition of protein folds, a deep learning network method (DN-Fold) was developed to predict if a given query-template protein pair belongs to the same structural fold. The input used stemmed from the protein sequence and structural features extracted from the protein pair. We evaluated the performance of DN-Fold along with 18 different methods on Lindahl's benchmark dataset and on a large benchmark set extracted from SCOP 1.75 consisting of about one million protein pairs, at three different levels of fold recognition (i.e., protein family, superfamily, and fold) depending on the evolutionary distance between protein sequences. The correct recognition rate of ensembled DN-Fold for Top 1 predictions is 84.5%, 61.5%, and 33.6% and for Top 5 is 91.2%, 76.5%, and 60.7% at family, superfamily, and fold levels, respectively. We also evaluated the performance of single DN-Fold (DN-FoldS), which showed the comparable results at the level of family and superfamily, compared to ensemble DN-Fold. Finally, we extended the binary classification problem of fold recognition to real-value regression task, which also show a promising performance. DN-Fold is freely available through a web server at http://iris.rnet.missouri.edu/dnfold.
Spoof Detection for Finger-Vein Recognition System Using NIR Camera.

PubMed

Nguyen, Dat Tien; Yoon, Hyo Sik; Pham, Tuyen Danh; Park, Kang Ryoung

2017-10-01

Finger-vein recognition, a new and advanced biometrics recognition method, is attracting the attention of researchers because of its advantages such as high recognition performance and lesser likelihood of theft and inaccuracies occurring on account of skin condition defects. However, as reported by previous researchers, it is possible to attack a finger-vein recognition system by using presentation attack (fake) finger-vein images. As a result, spoof detection, named as presentation attack detection (PAD), is necessary in such recognition systems. Previous attempts to establish PAD methods primarily focused on designing feature extractors by hand (handcrafted feature extractor) based on the observations of the researchers about the difference between real (live) and presentation attack finger-vein images. Therefore, the detection performance was limited. Recently, the deep learning framework has been successfully applied in computer vision and delivered superior results compared to traditional handcrafted methods on various computer vision applications such as image-based face recognition, gender recognition and image classification. In this paper, we propose a PAD method for near-infrared (NIR) camera-based finger-vein recognition system using convolutional neural network (CNN) to enhance the detection ability of previous handcrafted methods. Using the CNN method, we can derive a more suitable feature extractor for PAD than the other handcrafted methods using a training procedure. We further process the extracted image features to enhance the presentation attack finger-vein image detection ability of the CNN method using principal component analysis method (PCA) for dimensionality reduction of feature space and support vector machine (SVM) for classification. Through extensive experimental results, we confirm that our proposed method is adequate for presentation attack finger-vein image detection and it can deliver superior detection results compared to CNN-based methods and other previous handcrafted methods.
Spoof Detection for Finger-Vein Recognition System Using NIR Camera

PubMed Central

Nguyen, Dat Tien; Yoon, Hyo Sik; Pham, Tuyen Danh; Park, Kang Ryoung

2017-01-01

Finger-vein recognition, a new and advanced biometrics recognition method, is attracting the attention of researchers because of its advantages such as high recognition performance and lesser likelihood of theft and inaccuracies occurring on account of skin condition defects. However, as reported by previous researchers, it is possible to attack a finger-vein recognition system by using presentation attack (fake) finger-vein images. As a result, spoof detection, named as presentation attack detection (PAD), is necessary in such recognition systems. Previous attempts to establish PAD methods primarily focused on designing feature extractors by hand (handcrafted feature extractor) based on the observations of the researchers about the difference between real (live) and presentation attack finger-vein images. Therefore, the detection performance was limited. Recently, the deep learning framework has been successfully applied in computer vision and delivered superior results compared to traditional handcrafted methods on various computer vision applications such as image-based face recognition, gender recognition and image classification. In this paper, we propose a PAD method for near-infrared (NIR) camera-based finger-vein recognition system using convolutional neural network (CNN) to enhance the detection ability of previous handcrafted methods. Using the CNN method, we can derive a more suitable feature extractor for PAD than the other handcrafted methods using a training procedure. We further process the extracted image features to enhance the presentation attack finger-vein image detection ability of the CNN method using principal component analysis method (PCA) for dimensionality reduction of feature space and support vector machine (SVM) for classification. Through extensive experimental results, we confirm that our proposed method is adequate for presentation attack finger-vein image detection and it can deliver superior detection results compared to CNN-based methods and other previous handcrafted methods. PMID:28974031
Sound Processing Features for Speaker-Dependent and Phrase-Independent Emotion Recognition in Berlin Database

NASA Astrophysics Data System (ADS)

Anagnostopoulos, Christos Nikolaos; Vovoli, Eftichia

An emotion recognition framework based on sound processing could improve services in human-computer interaction. Various quantitative speech features obtained from sound processing of acting speech were tested, as to whether they are sufficient or not to discriminate between seven emotions. Multilayered perceptrons were trained to classify gender and emotions on the basis of a 24-input vector, which provide information about the prosody of the speaker over the entire sentence using statistics of sound features. Several experiments were performed and the results were presented analytically. Emotion recognition was successful when speakers and utterances were “known” to the classifier. However, severe misclassifications occurred during the utterance-independent framework. At least, the proposed feature vector achieved promising results for utterance-independent recognition of high- and low-arousal emotions.
Fuzzy based finger vein recognition with rotation invariant feature matching

NASA Astrophysics Data System (ADS)

Ezhilmaran, D.; Joseph, Rose Bindu

2017-11-01

Finger vein recognition is a promising biometric with commercial applications which is explored widely in the recent years. In this paper, a finger vein recognition system is proposed using rotation invariant feature descriptors for matching after enhancing the finger vein images with an interval type-2 fuzzy method. SIFT features are extracted and matched using a matching score based on Euclidian distance. Rotation invariance of the proposed method is verified in the experiment and the results are compared with SURF matching and minutiae matching. It is seen that rotation invariance is verified and the poor quality issues are solved efficiently with the designed system of finger vein recognition during the analysis. The experiments underlines the robustness and reliability of the interval type-2 fuzzy enhancement and SIFT feature matching.
Texture feature extraction based on wavelet transform and gray-level co-occurrence matrices applied to osteosarcoma diagnosis.

PubMed

Hu, Shan; Xu, Chao; Guan, Weiqiao; Tang, Yong; Liu, Yana

2014-01-01

Osteosarcoma is the most common malignant bone tumor among children and adolescents. In this study, image texture analysis was made to extract texture features from bone CR images to evaluate the recognition rate of osteosarcoma. To obtain the optimal set of features, Sym4 and Db4 wavelet transforms and gray-level co-occurrence matrices were applied to the image, with statistical methods being used to maximize the feature selection. To evaluate the performance of these methods, a support vector machine algorithm was used. The experimental results demonstrated that the Sym4 wavelet had a higher classification accuracy (93.44%) than the Db4 wavelet with respect to osteosarcoma occurrence in the epiphysis, whereas the Db4 wavelet had a higher classification accuracy (96.25%) for osteosarcoma occurrence in the diaphysis. Results including accuracy, sensitivity, specificity and ROC curves obtained using the wavelets were all higher than those obtained using the features derived from the GLCM method. It is concluded that, a set of texture features can be extracted from the wavelets and used in computer-aided osteosarcoma diagnosis systems. In addition, this study also confirms that multi-resolution analysis is a useful tool for texture feature extraction during bone CR image processing.
The McIntosh Archive: A solar feature database spanning four solar cycles

NASA Astrophysics Data System (ADS)

Gibson, S. E.; Malanushenko, A. V.; Hewins, I.; McFadden, R.; Emery, B.; Webb, D. F.; Denig, W. F.

2016-12-01

The McIntosh Archive consists of a set of hand-drawn solar Carrington maps created by Patrick McIntosh from 1964 to 2009. McIntosh used mainly H-alpha, He-1 10830 and photospheric magnetic measurements from both ground-based and NASA satellite observations. With these he traced coronal holes, polarity inversion lines, filaments, sunspots and plage, yielding a unique 45-year record of the features associated with the large-scale solar magnetic field. We will present the results of recent efforts to preserve and digitize this archive. Most of the original hand-drawn maps have been scanned, a method for processing these scans into digital, searchable format has been developed and streamlined, and an archival repository at NOAA's National Centers for Environmental Information (NCEI) has been created. We will demonstrate how Solar Cycle 23 data may now be accessed and how it may be utilized for scientific applications. In addition, we will discuss how this database of human-recognized features, which overlaps with the onset of high-resolution, continuous modern solar data, may act as a training set for computer feature recognition algorithms.
Wearable Devices for Classification of Inadequate Posture at Work Using Neural Networks

PubMed Central

Barkallah, Eya; Freulard, Johan; Otis, Martin J. -D.; Ngomo, Suzy; Ayena, Johannes C.; Desrosiers, Christian

2017-01-01

Inadequate postures adopted by an operator at work are among the most important risk factors in Work-related Musculoskeletal Disorders (WMSDs). Although several studies have focused on inadequate posture, there is limited information on its identification in a work context. The aim of this study is to automatically differentiate between adequate and inadequate postures using two wearable devices (helmet and instrumented insole) with an inertial measurement unit (IMU) and force sensors. From the force sensors located inside the insole, the center of pressure (COP) is computed since it is considered an important parameter in the analysis of posture. In a first step, a set of 60 features is computed with a direct approach, and later reduced to eight via a hybrid feature selection. A neural network is then employed to classify the current posture of a worker, yielding a recognition rate of 90%. In a second step, an innovative graphic approach is proposed to extract three additional features for the classification. This approach represents the main contribution of this study. Combining both approaches improves the recognition rate to 95%. Our results suggest that neural network could be applied successfully for the classification of adequate and inadequate posture. PMID:28862665
Face recognition via sparse representation of SIFT feature on hexagonal-sampling image

NASA Astrophysics Data System (ADS)

Zhang, Daming; Zhang, Xueyong; Li, Lu; Liu, Huayong

2018-04-01

This paper investigates a face recognition approach based on Scale Invariant Feature Transform (SIFT) feature and sparse representation. The approach takes advantage of SIFT which is local feature other than holistic feature in classical Sparse Representation based Classification (SRC) algorithm and possesses strong robustness to expression, pose and illumination variations. Since hexagonal image has more inherit merits than square image to make recognition process more efficient, we extract SIFT keypoint in hexagonal-sampling image. Instead of matching SIFT feature, firstly the sparse representation of each SIFT keypoint is given according the constructed dictionary; secondly these sparse vectors are quantized according dictionary; finally each face image is represented by a histogram and these so-called Bag-of-Words vectors are classified by SVM. Due to use of local feature, the proposed method achieves better result even when the number of training sample is small. In the experiments, the proposed method gave higher face recognition rather than other methods in ORL and Yale B face databases; also, the effectiveness of the hexagonal-sampling in the proposed method is verified.
Emotion recognition from EEG using higher order crossings.

PubMed

Petrantonakis, Panagiotis C; Hadjileontiadis, Leontios J

2010-03-01

Electroencephalogram (EEG)-based emotion recognition is a relatively new field in the affective computing area with challenging issues regarding the induction of the emotional states and the extraction of the features in order to achieve optimum classification performance. In this paper, a novel emotion evocation and EEG-based feature extraction technique is presented. In particular, the mirror neuron system concept was adapted to efficiently foster emotion induction by the process of imitation. In addition, higher order crossings (HOC) analysis was employed for the feature extraction scheme and a robust classification method, namely HOC-emotion classifier (HOC-EC), was implemented testing four different classifiers [quadratic discriminant analysis (QDA), k-nearest neighbor, Mahalanobis distance, and support vector machines (SVMs)], in order to accomplish efficient emotion recognition. Through a series of facial expression image projection, EEG data have been collected by 16 healthy subjects using only 3 EEG channels, namely Fp1, Fp2, and a bipolar channel of F3 and F4 positions according to 10-20 system. Two scenarios were examined using EEG data from a single-channel and from combined-channels, respectively. Compared with other feature extraction methods, HOC-EC appears to outperform them, achieving a 62.3% (using QDA) and 83.33% (using SVM) classification accuracy for the single-channel and combined-channel cases, respectively, differentiating among the six basic emotions, i.e., happiness, surprise, anger, fear, disgust, and sadness. As the emotion class-set reduces its dimension, the HOC-EC converges toward maximum classification rate (100% for five or less emotions), justifying the efficiency of the proposed approach. This could facilitate the integration of HOC-EC in human machine interfaces, such as pervasive healthcare systems, enhancing their affective character and providing information about the user's emotional status (e.g., identifying user's emotion experiences, recurring affective states, time-dependent emotional trends).
Evaluation of Spectral and Prosodic Features of Speech Affected by Orthodontic Appliances Using the Gmm Classifier

NASA Astrophysics Data System (ADS)

Přibil, Jiří; Přibilová, Anna; Ďuračkoá, Daniela

2014-01-01

The paper describes our experiment with using the Gaussian mixture models (GMM) for classification of speech uttered by a person wearing orthodontic appliances. For the GMM classification, the input feature vectors comprise the basic and the complementary spectral properties as well as the supra-segmental parameters. Dependence of classification correctness on the number of the parameters in the input feature vector and on the computation complexity is also evaluated. In addition, an influence of the initial setting of the parameters for GMM training process was analyzed. Obtained recognition results are compared visually in the form of graphs as well as numerically in the form of tables and confusion matrices for tested sentences uttered using three configurations of orthodontic appliances.
Invariant approach to the character classification

NASA Astrophysics Data System (ADS)

Šariri, Kristina; Demoli, Nazif

2008-04-01

Image moments analysis is a very useful tool which allows image description invariant to translation and rotation, scale change and some types of image distortions. The aim of this work was development of simple method for fast and reliable classification of characters by using Hu's and affine moment invariants. Measure of Eucleidean distance was used as a discrimination feature with statistical parameters estimated. The method was tested in classification of Times New Roman font letters as well as sets of the handwritten characters. It is shown that using all Hu's and three affine invariants as discrimination set improves recognition rate by 30%.
Recognising discourse causality triggers in the biomedical domain.

PubMed

Mihăilă, Claudiu; Ananiadou, Sophia

2013-12-01

Current domain-specific information extraction systems represent an important resource for biomedical researchers, who need to process vast amounts of knowledge in a short time. Automatic discourse causality recognition can further reduce their workload by suggesting possible causal connections and aiding in the curation of pathway models. We describe here an approach to the automatic identification of discourse causality triggers in the biomedical domain using machine learning. We create several baselines and experiment with and compare various parameter settings for three algorithms, i.e. Conditional Random Fields (CRF), Support Vector Machines (SVM) and Random Forests (RF). We also evaluate the impact of lexical, syntactic, and semantic features on each of the algorithms, showing that semantics improves the performance in all cases. We test our comprehensive feature set on two corpora containing gold standard annotations of causal relations, and demonstrate the need for more gold standard data. The best performance of 79.35% F-score is achieved by CRFs when using all three feature types.
Study of wavelet packet energy entropy for emotion classification in speech and glottal signals

NASA Astrophysics Data System (ADS)

He, Ling; Lech, Margaret; Zhang, Jing; Ren, Xiaomei; Deng, Lihua

2013-07-01

The automatic speech emotion recognition has important applications in human-machine communication. Majority of current research in this area is focused on finding optimal feature parameters. In recent studies, several glottal features were examined as potential cues for emotion differentiation. In this study, a new type of feature parameter is proposed, which calculates energy entropy on values within selected Wavelet Packet frequency bands. The modeling and classification tasks are conducted using the classical GMM algorithm. The experiments use two data sets: the Speech Under Simulated Emotion (SUSE) data set annotated with three different emotions (angry, neutral and soft) and Berlin Emotional Speech (BES) database annotated with seven different emotions (angry, bored, disgust, fear, happy, sad and neutral). The average classification accuracy achieved for the SUSE data (74%-76%) is significantly higher than the accuracy achieved for the BES data (51%-54%). In both cases, the accuracy was significantly higher than the respective random guessing levels (33% for SUSE and 14.3% for BES).

Composing alarms: considering the musical aspects of auditory alarm design.

PubMed

Gillard, Jessica; Schutz, Michael

2016-12-01

Short melodies are commonly linked to referents in jingles, ringtones, movie themes, and even auditory displays (i.e., sounds used in human-computer interactions). While melody associations can be quite effective, auditory alarms in medical devices are generally poorly learned and highly confused. Here, we draw on approaches and stimuli from both music cognition (melody recognition) and human factors (alarm design) to analyze the patterns of confusions in a paired-associate alarm-learning task involving both a standardized melodic alarm set (Experiment 1) and a set of novel melodies (Experiment 2). Although contour played a role in confusions (consistent with previous research), we observed several cases where melodies with similar contours were rarely confused - melodies holding musically distinctive features. This exploratory work suggests that salient features formed by an alarm's melodic structure (such as repeated notes, distinct contours, and easily recognizable intervals) can increase the likelihood of correct alarm identification. We conclude that the use of musical principles and features may help future efforts to improve the design of auditory alarms.
Internal versus external features in triggering the brain waveforms for conjunction and feature faces in recognition.

PubMed

Nie, Aiqing; Jiang, Jingguo; Fu, Qiao

2014-08-20

Previous research has found that conjunction faces (whose internal features, e.g. eyes, nose, and mouth, and external features, e.g. hairstyle and ears, are from separate studied faces) and feature faces (partial features of these are studied) can produce higher false alarms than both old and new faces (i.e. those that are exactly the same as the studied faces and those that have not been previously presented) in recognition. The event-related potentials (ERPs) that relate to conjunction and feature faces at recognition, however, have not been described as yet; in addition, the contributions of different facial features toward ERPs have not been differentiated. To address these issues, the present study compared the ERPs elicited by old faces, conjunction faces (the internal and the external features were from two studied faces), old internal feature faces (whose internal features were studied), and old external feature faces (whose external features were studied) with those of new faces separately. The results showed that old faces not only elicited an early familiarity-related FN400, but a more anterior distributed late old/new effect that reflected recollection. Conjunction faces evoked similar late brain waveforms as old internal feature faces, but not to old external feature faces. These results suggest that, at recognition, old faces hold higher familiarity than compound faces in the profiles of ERPs and internal facial features are more crucial than external ones in triggering the brain waveforms that are characterized as reflecting the result of familiarity.
A novel binary shape context for 3D local surface description

NASA Astrophysics Data System (ADS)

Dong, Zhen; Yang, Bisheng; Liu, Yuan; Liang, Fuxun; Li, Bijun; Zang, Yufu

2017-08-01

3D local surface description is now at the core of many computer vision technologies, such as 3D object recognition, intelligent driving, and 3D model reconstruction. However, most of the existing 3D feature descriptors still suffer from low descriptiveness, weak robustness, and inefficiency in both time and memory. To overcome these challenges, this paper presents a robust and descriptive 3D Binary Shape Context (BSC) descriptor with high efficiency in both time and memory. First, a novel BSC descriptor is generated for 3D local surface description, and the performance of the BSC descriptor under different settings of its parameters is analyzed. Next, the descriptiveness, robustness, and efficiency in both time and memory of the BSC descriptor are evaluated and compared to those of several state-of-the-art 3D feature descriptors. Finally, the performance of the BSC descriptor for 3D object recognition is also evaluated on a number of popular benchmark datasets, and an urban-scene dataset is collected by a terrestrial laser scanner system. Comprehensive experiments demonstrate that the proposed BSC descriptor obtained high descriptiveness, strong robustness, and high efficiency in both time and memory and achieved high recognition rates of 94.8%, 94.1% and 82.1% on the considered UWA, Queen, and WHU datasets, respectively.
Margined winner-take-all: New learning rule for pattern recognition.

PubMed

Fukushima, Kunihiko

2018-01-01

The neocognitron is a deep (multi-layered) convolutional neural network that can be trained to recognize visual patterns robustly. In the intermediate layers of the neocognitron, local features are extracted from input patterns. In the deepest layer, based on the features extracted in the intermediate layers, input patterns are classified into classes. A method called IntVec (interpolating-vector) is used for this purpose. This paper proposes a new learning rule called margined Winner-Take-All (mWTA) for training the deepest layer. Every time when a training pattern is presented during the learning, if the result of recognition by WTA (Winner-Take-All) is an error, a new cell is generated in the deepest layer. Here we put a certain amount of margin to the WTA. In other words, only during the learning, a certain amount of handicap is given to cells of classes other than that of the training vector, and the winner is chosen under this handicap. By introducing the margin to the WTA, we can generate a compact set of cells, with which a high recognition rate can be obtained with a small computational cost. The ability of this mWTA is demonstrated by computer simulation. Copyright © 2017 Elsevier Ltd. All rights reserved.
Membership generation using multilayer neural network

NASA Technical Reports Server (NTRS)

Kim, Jaeseok

1992-01-01

There has been intensive research in neural network applications to pattern recognition problems. Particularly, the back-propagation network has attracted many researchers because of its outstanding performance in pattern recognition applications. In this section, we describe a new method to generate membership functions from training data using a multilayer neural network. The basic idea behind the approach is as follows. The output values of a sigmoid activation function of a neuron bear remarkable resemblance to membership values. Therefore, we can regard the sigmoid activation values as the membership values in fuzzy set theory. Thus, in order to generate class membership values, we first train a suitable multilayer network using a training algorithm such as the back-propagation algorithm. After the training procedure converges, the resulting network can be treated as a membership generation network, where the inputs are feature values and the outputs are membership values in the different classes. This method allows fairly complex membership functions to be generated because the network is highly nonlinear in general. Also, it is to be noted that the membership functions are generated from a classification point of view. For pattern recognition applications, this is highly desirable, although the membership values may not be indicative of the degree of typicality of a feature value in a particular class.
An ERP investigation of visual word recognition in syllabary scripts.

PubMed

Okano, Kana; Grainger, Jonathan; Holcomb, Phillip J

2013-06-01

The bimodal interactive-activation model has been successfully applied to understanding the neurocognitive processes involved in reading words in alphabetic scripts, as reflected in the modulation of ERP components in masked repetition priming. In order to test the generalizability of this approach, in the present study we examined word recognition in a different writing system, the Japanese syllabary scripts hiragana and katakana. Native Japanese participants were presented with repeated or unrelated pairs of Japanese words in which the prime and target words were both in the same script (within-script priming, Exp. 1) or were in the opposite script (cross-script priming, Exp. 2). As in previous studies with alphabetic scripts, in both experiments the N250 (sublexical processing) and N400 (lexical-semantic processing) components were modulated by priming, although the time course was somewhat delayed. The earlier N/P150 effect (visual feature processing) was present only in "Experiment 1: Within-script priming", in which the prime and target words shared visual features. Overall, the results provide support for the hypothesis that visual word recognition involves a generalizable set of neurocognitive processes that operate in similar manners across different writing systems and languages, as well as pointing to the viability of the bimodal interactive-activation framework for modeling such processes.
An ERP Investigation of Visual Word Recognition in Syllabary Scripts

PubMed Central

Okano, Kana; Grainger, Jonathan; Holcomb, Phillip J.

2013-01-01

The bi-modal interactive-activation model has been successfully applied to understanding the neuro-cognitive processes involved in reading words in alphabetic scripts, as reflected in the modulation of ERP components in masked repetition priming. In order to test the generalizability of this approach, the current study examined word recognition in a different writing system, the Japanese syllabary scripts Hiragana and Katakana. Native Japanese participants were presented with repeated or unrelated pairs of Japanese words where the prime and target words were both in the same script (within-script priming, Experiment 1) or were in the opposite script (cross-script priming, Experiment 2). As in previous studies with alphabetic scripts, in both experiments the N250 (sub-lexical processing) and N400 (lexical-semantic processing) components were modulated by priming, although the time-course was somewhat delayed. The earlier N/P150 effect (visual feature processing) was present only in Experiment 1 where prime and target words shared visual features. Overall, the results provide support for the hypothesis that visual word recognition involves a generalizable set of neuro-cognitive processes that operate in a similar manner across different writing systems and languages, as well as pointing to the viability of the bi-modal interactive activation framework for modeling such processes. PMID:23378278
Sparse Feature Extraction for Pose-Tolerant Face Recognition.

PubMed

Abiantun, Ramzi; Prabhu, Utsav; Savvides, Marios

2014-10-01

Automatic face recognition performance has been steadily improving over years of research, however it remains significantly affected by a number of factors such as illumination, pose, expression, resolution and other factors that can impact matching scores. The focus of this paper is the pose problem which remains largely overlooked in most real-world applications. Specifically, we focus on one-to-one matching scenarios where a query face image of a random pose is matched against a set of gallery images. We propose a method that relies on two fundamental components: (a) A 3D modeling step to geometrically correct the viewpoint of the face. For this purpose, we extend a recent technique for efficient synthesis of 3D face models called 3D Generic Elastic Model. (b) A sparse feature extraction step using subspace modeling and ℓ1-minimization to induce pose-tolerance in coefficient space. This in return enables the synthesis of an equivalent frontal-looking face, which can be used towards recognition. We show significant performance improvements in verification rates compared to commercial matchers, and also demonstrate the resilience of the proposed method with respect to degrading input quality. We find that the proposed technique is able to match non-frontal images to other non-frontal images of varying angles.
Formal implementation of a performance evaluation model for the face recognition system.

PubMed

Shin, Yong-Nyuo; Kim, Jason; Lee, Yong-Jun; Shin, Woochang; Choi, Jin-Young

2008-01-01

Due to usability features, practical applications, and its lack of intrusiveness, face recognition technology, based on information, derived from individuals' facial features, has been attracting considerable attention recently. Reported recognition rates of commercialized face recognition systems cannot be admitted as official recognition rates, as they are based on assumptions that are beneficial to the specific system and face database. Therefore, performance evaluation methods and tools are necessary to objectively measure the accuracy and performance of any face recognition system. In this paper, we propose and formalize a performance evaluation model for the biometric recognition system, implementing an evaluation tool for face recognition systems based on the proposed model. Furthermore, we performed evaluations objectively by providing guidelines for the design and implementation of a performance evaluation system, formalizing the performance test process.
Recognizing Whispered Speech Produced by an Individual with Surgically Reconstructed Larynx Using Articulatory Movement Data

PubMed Central

Cao, Beiming; Kim, Myungjong; Mau, Ted; Wang, Jun

2017-01-01

Individuals with larynx (vocal folds) impaired have problems in controlling their glottal vibration, producing whispered speech with extreme hoarseness. Standard automatic speech recognition using only acoustic cues is typically ineffective for whispered speech because the corresponding spectral characteristics are distorted. Articulatory cues such as the tongue and lip motion may help in recognizing whispered speech since articulatory motion patterns are generally not affected. In this paper, we investigated whispered speech recognition for patients with reconstructed larynx using articulatory movement data. A data set with both acoustic and articulatory motion data was collected from a patient with surgically reconstructed larynx using an electromagnetic articulograph. Two speech recognition systems, Gaussian mixture model-hidden Markov model (GMM-HMM) and deep neural network-HMM (DNN-HMM), were used in the experiments. Experimental results showed adding either tongue or lip motion data to acoustic features such as mel-frequency cepstral coefficient (MFCC) significantly reduced the phone error rates on both speech recognition systems. Adding both tongue and lip data achieved the best performance. PMID:29423453
Variogram-based feature extraction for neural network recognition of logos

NASA Astrophysics Data System (ADS)

Pham, Tuan D.

2003-03-01

This paper presents a new approach for extracting spatial features of images based on the theory of regionalized variables. These features can be effectively used for automatic recognition of logo images using neural networks. Experimental results on a public-domain logo database show the effectiveness of the proposed approach.
The Word Shape Hypothesis Re-Examined: Evidence for an External Feature Advantage in Visual Word Recognition

ERIC Educational Resources Information Center

Beech, John R.; Mayall, Kate A.

2005-01-01

This study investigates the relative roles of internal and external letter features in word recognition. In Experiment 1 the efficacy of outer word fragments (words with all their horizontal internal features removed) was compared with inner word fragments (words with their outer features removed) as primes in a forward masking paradigm. These…
Atypical development of configural face recognition in children with autism, Down syndrome and Williams syndrome.

PubMed

Dimitriou, D; Leonard, H C; Karmiloff-Smith, A; Johnson, M H; Thomas, M S C

2015-05-01

Configural processing in face recognition is a sensitivity to the spacing between facial features. It has been argued both that its presence represents a high level of expertise in face recognition, and also that it is a developmentally vulnerable process. We report a cross-syndrome investigation of the development of configural face recognition in school-aged children with autism, Down syndrome and Williams syndrome compared with a typically developing comparison group. Cross-sectional trajectory analyses were used to compare configural and featural face recognition utilising the 'Jane faces' task. Trajectories were constructed linking featural and configural performance either to chronological age or to different measures of mental age (receptive vocabulary, visuospatial construction), as well as the Benton face recognition task. An emergent inversion effect across age for detecting configural but not featural changes in faces was established as the marker of typical development. Children from clinical groups displayed atypical profiles that differed across all groups. We discuss the implications for the nature of face processing within the respective developmental disorders, and how the cross-sectional syndrome comparison informs the constraints that shape the typical development of face recognition. © 2014 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
Tumor recognition in wireless capsule endoscopy images using textural features and SVM-based feature selection.

PubMed

Li, Baopu; Meng, Max Q-H

2012-05-01

Tumor in digestive tract is a common disease and wireless capsule endoscopy (WCE) is a relatively new technology to examine diseases for digestive tract especially for small intestine. This paper addresses the problem of automatic recognition of tumor for WCE images. Candidate color texture feature that integrates uniform local binary pattern and wavelet is proposed to characterize WCE images. The proposed features are invariant to illumination change and describe multiresolution characteristics of WCE images. Two feature selection approaches based on support vector machine, sequential forward floating selection and recursive feature elimination, are further employed to refine the proposed features for improving the detection accuracy. Extensive experiments validate that the proposed computer-aided diagnosis system achieves a promising tumor recognition accuracy of 92.4% in WCE images on our collected data.
Machine-assisted verification of latent fingerprints: first results for nondestructive contact-less optical acquisition techniques with a CWL sensor

NASA Astrophysics Data System (ADS)

Hildebrandt, Mario; Kiltz, Stefan; Krapyvskyy, Dmytro; Dittmann, Jana; Vielhauer, Claus; Leich, Marcus

2011-11-01

A machine-assisted analysis of traces from crime scenes might be possible with the advent of new high-resolution non-destructive contact-less acquisition techniques for latent fingerprints. This requires reliable techniques for the automatic extraction of fingerprint features from latent and exemplar fingerprints for matching purposes using pattern recognition approaches. Therefore, we evaluate the NIST Biometric Image Software for the feature extraction and verification of contact-lessly acquired latent fingerprints to determine potential error rates. Our exemplary test setup includes 30 latent fingerprints from 5 people in two test sets that are acquired from different surfaces using a chromatic white light sensor. The first test set includes 20 fingerprints on two different surfaces. It is used to determine the feature extraction performance. The second test set includes one latent fingerprint on 10 different surfaces and an exemplar fingerprint to determine the verification performance. This utilized sensing technique does not require a physical or chemical visibility enhancement of the fingerprint residue, thus the original trace remains unaltered for further investigations. No particular feature extraction and verification techniques have been applied to such data, yet. Hence, we see the need for appropriate algorithms that are suitable to support forensic investigations.
Convex Hull Aided Registration Method (CHARM).

PubMed

Fan, Jingfan; Yang, Jian; Zhao, Yitian; Ai, Danni; Liu, Yonghuai; Wang, Ge; Wang, Yongtian

2017-09-01

Non-rigid registration finds many applications such as photogrammetry, motion tracking, model retrieval, and object recognition. In this paper we propose a novel convex hull aided registration method (CHARM) to match two point sets subject to a non-rigid transformation. First, two convex hulls are extracted from the source and target respectively. Then, all points of the point sets are projected onto the reference plane through each triangular facet of the hulls. From these projections, invariant features are extracted and matched optimally. The matched feature point pairs are mapped back onto the triangular facets of the convex hulls to remove outliers that are outside any relevant triangular facet. The rigid transformation from the source to the target is robustly estimated by the random sample consensus (RANSAC) scheme through minimizing the distance between the matched feature point pairs. Finally, these feature points are utilized as the control points to achieve non-rigid deformation in the form of thin-plate spline of the entire source point set towards the target one. The experimental results based on both synthetic and real data show that the proposed algorithm outperforms several state-of-the-art ones with respect to sampling, rotational angle, and data noise. In addition, the proposed CHARM algorithm also shows higher computational efficiency compared to these methods.
On the recognition of emotional vocal expressions: motivations for a holistic approach.

PubMed

Esposito, Anna; Esposito, Antonietta M

2012-10-01

Human beings seem to be able to recognize emotions from speech very well and information communication technology aims to implement machines and agents that can do the same. However, to be able to automatically recognize affective states from speech signals, it is necessary to solve two main technological problems. The former concerns the identification of effective and efficient processing algorithms capable of capturing emotional acoustic features from speech sentences. The latter focuses on finding computational models able to classify, with an approximation as good as human listeners, a given set of emotional states. This paper will survey these topics and provide some insights for a holistic approach to the automatic analysis, recognition and synthesis of affective states.
A distinguishing method of printed and handwritten legal amount on Chinese bank check

NASA Astrophysics Data System (ADS)

Zhu, Ningbo; Lou, Zhen; Yang, Jingyu

2003-09-01

While carrying out Optical Chinese Character Recognition, distinguishing the font between printed and handwritten characters at the early phase is necessary, because there is so much difference between the methods on recognizing these two types of characters. In this paper, we proposed a good method on how to banish seals and its relative standards that can judge whether they should be banished. Meanwhile, an approach on clearing up scattered noise shivers after image segmentation is presented. Four sets of classifying features that show discrimination between printed and handwritten characters are well adopted. The proposed approach was applied to an automatic check processing system and tested on about 9031 checks. The recognition rate is more than 99.5%.
Three-dimensional fingerprint recognition by using convolution neural network

NASA Astrophysics Data System (ADS)

Tian, Qianyu; Gao, Nan; Zhang, Zonghua

2018-01-01

With the development of science and technology and the improvement of social information, fingerprint recognition technology has become a hot research direction and been widely applied in many actual fields because of its feasibility and reliability. The traditional two-dimensional (2D) fingerprint recognition method relies on matching feature points. This method is not only time-consuming, but also lost three-dimensional (3D) information of fingerprint, with the fingerprint rotation, scaling, damage and other issues, a serious decline in robustness. To solve these problems, 3D fingerprint has been used to recognize human being. Because it is a new research field, there are still lots of challenging problems in 3D fingerprint recognition. This paper presents a new 3D fingerprint recognition method by using a convolution neural network (CNN). By combining 2D fingerprint and fingerprint depth map into CNN, and then through another CNN feature fusion, the characteristics of the fusion complete 3D fingerprint recognition after classification. This method not only can preserve 3D information of fingerprints, but also solves the problem of CNN input. Moreover, the recognition process is simpler than traditional feature point matching algorithm. 3D fingerprint recognition rate by using CNN is compared with other fingerprint recognition algorithms. The experimental results show that the proposed 3D fingerprint recognition method has good recognition rate and robustness.
A face and palmprint recognition approach based on discriminant DCT feature extraction.

PubMed

Jing, Xiao-Yuan; Zhang, David

2004-12-01

In the field of image processing and recognition, discrete cosine transform (DCT) and linear discrimination are two widely used techniques. Based on them, we present a new face and palmprint recognition approach in this paper. It first uses a two-dimensional separability judgment to select the DCT frequency bands with favorable linear separability. Then from the selected bands, it extracts the linear discriminative features by an improved Fisherface method and performs the classification by the nearest neighbor classifier. We detailedly analyze theoretical advantages of our approach in feature extraction. The experiments on face databases and palmprint database demonstrate that compared to the state-of-the-art linear discrimination methods, our approach obtains better classification performance. It can significantly improve the recognition rates for face and palmprint data and effectively reduce the dimension of feature space.

Joint and collaborative representation with local Volterra kernels convolution feature for face recognition

NASA Astrophysics Data System (ADS)

Feng, Guang; Li, Hengjian; Dong, Jiwen; Chen, Xi; Yang, Huiru

2018-04-01

In this paper, we proposed a joint and collaborative representation with Volterra kernel convolution feature (JCRVK) for face recognition. Firstly, the candidate face images are divided into sub-blocks in the equal size. The blocks are extracted feature using the two-dimensional Voltera kernels discriminant analysis, which can better capture the discrimination information from the different faces. Next, the proposed joint and collaborative representation is employed to optimize and classify the local Volterra kernels features (JCR-VK) individually. JCR-VK is very efficiently for its implementation only depending on matrix multiplication. Finally, recognition is completed by using the majority voting principle. Extensive experiments on the Extended Yale B and AR face databases are conducted, and the results show that the proposed approach can outperform other recently presented similar dictionary algorithms on recognition accuracy.
Breaking object correspondence across saccades impairs object recognition: The role of color and luminance.

PubMed

Poth, Christian H; Schneider, Werner X

2016-09-01

Rapid saccadic eye movements bring the foveal region of the eye's retina onto objects for high-acuity vision. Saccades change the location and resolution of objects' retinal images. To perceive objects as visually stable across saccades, correspondence between the objects before and after the saccade must be established. We have previously shown that breaking object correspondence across the saccade causes a decrement in object recognition (Poth, Herwig, & Schneider, 2015). Color and luminance can establish object correspondence, but it is unknown how these surface features contribute to transsaccadic visual processing. Here, we investigated whether changing the surface features color-and-luminance and color alone across saccades impairs postsaccadic object recognition. Participants made saccades to peripheral objects, which either maintained or changed their surface features across the saccade. After the saccade, participants briefly viewed a letter within the saccade target object (terminated by a pattern mask). Postsaccadic object recognition was assessed as participants' accuracy in reporting the letter. Experiment A used the colors green and red with different luminances as surface features, Experiment B blue and yellow with approximately the same luminances. Changing the surface features across the saccade deteriorated postsaccadic object recognition in both experiments. These findings reveal a link between object recognition and object correspondence relying on the surface features colors and luminance, which is currently not addressed in theories of transsaccadic perception. We interpret the findings within a recent theory ascribing this link to visual attention (Schneider, 2013).
Quantitative analysis and feature recognition in 3-D microstructural data sets

NASA Astrophysics Data System (ADS)

Lewis, A. C.; Suh, C.; Stukowski, M.; Geltmacher, A. B.; Spanos, G.; Rajan, K.

2006-12-01

A three-dimensional (3-D) reconstruction of an austenitic stainless-steel microstructure was used as input for an image-based finite-element model to simulate the anisotropic elastic mechanical response of the microstructure. The quantitative data-mining and data-warehousing techniques used to correlate regions of high stress with critical microstructural features are discussed. Initial analysis of elastic stresses near grain boundaries due to mechanical loading revealed low overall correlation with their location in the microstructure. However, the use of data-mining and feature-tracking techniques to identify high-stress outliers revealed that many of these high-stress points are generated near grain boundaries and grain edges (triple junctions). These techniques also allowed for the differentiation between high stresses due to boundary conditions of the finite volume reconstructed, and those due to 3-D microstructural features.
Iris Matching Based on Personalized Weight Map.

PubMed

Dong, Wenbo; Sun, Zhenan; Tan, Tieniu

2011-09-01

Iris recognition typically involves three steps, namely, iris image preprocessing, feature extraction, and feature matching. The first two steps of iris recognition have been well studied, but the last step is less addressed. Each human iris has its unique visual pattern and local image features also vary from region to region, which leads to significant differences in robustness and distinctiveness among the feature codes derived from different iris regions. However, most state-of-the-art iris recognition methods use a uniform matching strategy, where features extracted from different regions of the same person or the same region for different individuals are considered to be equally important. This paper proposes a personalized iris matching strategy using a class-specific weight map learned from the training images of the same iris class. The weight map can be updated online during the iris recognition procedure when the successfully recognized iris images are regarded as the new training data. The weight map reflects the robustness of an encoding algorithm on different iris regions by assigning an appropriate weight to each feature code for iris matching. Such a weight map trained by sufficient iris templates is convergent and robust against various noise. Extensive and comprehensive experiments demonstrate that the proposed personalized iris matching strategy achieves much better iris recognition performance than uniform strategies, especially for poor quality iris images.
Recognition during recall failure: Semantic feature matching as a mechanism for recognition of semantic cues when recall fails.

PubMed

Cleary, Anne M; Ryals, Anthony J; Wagner, Samantha R

2016-01-01

Research suggests that a feature-matching process underlies cue familiarity-detection when cued recall with graphemic cues fails. When a test cue (e.g., potchbork) overlaps in graphemic features with multiple unrecalled studied items (e.g., patchwork, pitchfork, pocketbook, pullcork), higher cue familiarity ratings are given during recall failure of all of the targets than when the cue overlaps in graphemic features with only one studied target and that target fails to be recalled (e.g., patchwork). The present study used semantic feature production norms (McRae et al., Behavior Research Methods, Instruments, & Computers, 37, 547-559, 2005) to examine whether the same holds true when the cues are semantic in nature (e.g., jaguar is used to cue cheetah). Indeed, test cues (e.g., cedar) that overlapped in semantic features (e.g., a_tree, has_bark, etc.) with four unretrieved studied items (e.g., birch, oak, pine, willow) received higher cue familiarity ratings during recall failure than test cues that overlapped in semantic features with only two (also unretrieved) studied items (e.g., birch, oak), which in turn received higher familiarity ratings during recall failure than cues that did not overlap in semantic features with any studied items. These findings suggest that the feature-matching theory of recognition during recall failure can accommodate recognition of semantic cues during recall failure, providing a potential mechanism for conceptually-based forms of cue recognition during target retrieval failure. They also provide converging evidence for the existence of the semantic features envisaged in feature-based models of semantic knowledge representation and for those more concretely specified by the production norms of McRae et al. (Behavior Research Methods, Instruments, & Computers, 37, 547-559, 2005).
Salient Feature Identification and Analysis using Kernel-Based Classification Techniques for Synthetic Aperture Radar Automatic Target Recognition

DTIC Science & Technology

2014-03-27

and machine learning for a range of research including such topics as medical imaging [10] and handwriting recognition [11]. The type of feature...1989. [11] C. Bahlmann, B. Haasdonk, and H. Burkhardt, “Online handwriting recognition with support vector machines-a kernel approach,” in Eighth...International Workshop on Frontiers in Handwriting Recognition, pp. 49–54, IEEE, 2002. [12] C. Cortes and V. Vapnik, “Support-vector networks,” Machine
A new method for recognizing quadric surfaces from range data and its application to telerobotics and automation, final phase

NASA Technical Reports Server (NTRS)

Mielke, Roland; Dcunha, Ivan; Alvertos, Nicolas

1994-01-01

In the final phase of the proposed research a complete top to down three dimensional object recognition scheme has been proposed. The various three dimensional objects included spheres, cones, cylinders, ellipsoids, paraboloids, and hyperboloids. Utilizing a newly developed blob determination technique, a given range scene with several non-cluttered quadric surfaces is segmented. Next, using the earlier (phase 1) developed alignment scheme, each of the segmented objects are then aligned in a desired coordinate system. For each of the quadric surfaces based upon their intersections with certain pre-determined planes, a set of distinct features (curves) are obtained. A database with entities such as the equations of the planes and angular bounds of these planes has been created for each of the quadric surfaces. Real range data of spheres, cones, cylinders, and parallelpipeds have been utilized for the recognition process. The developed algorithm gave excellent results for the real data as well as for several sets of simulated range data.
Insights from Classifying Visual Concepts with Multiple Kernel Learning

PubMed Central

Binder, Alexander; Nakajima, Shinichi; Kloft, Marius; Müller, Christina; Samek, Wojciech; Brefeld, Ulf; Müller, Klaus-Robert; Kawanabe, Motoaki

2012-01-01

Combining information from various image features has become a standard technique in concept recognition tasks. However, the optimal way of fusing the resulting kernel functions is usually unknown in practical applications. Multiple kernel learning (MKL) techniques allow to determine an optimal linear combination of such similarity matrices. Classical approaches to MKL promote sparse mixtures. Unfortunately, 1-norm regularized MKL variants are often observed to be outperformed by an unweighted sum kernel. The main contributions of this paper are the following: we apply a recently developed non-sparse MKL variant to state-of-the-art concept recognition tasks from the application domain of computer vision. We provide insights on benefits and limits of non-sparse MKL and compare it against its direct competitors, the sum-kernel SVM and sparse MKL. We report empirical results for the PASCAL VOC 2009 Classification and ImageCLEF2010 Photo Annotation challenge data sets. Data sets (kernel matrices) as well as further information are available at http://doc.ml.tu-berlin.de/image_mkl/(Accessed 2012 Jun 25). PMID:22936970
Incorporating domain knowledge in chemical and biomedical named entity recognition with word representations.

PubMed

Munkhdalai, Tsendsuren; Li, Meijing; Batsuren, Khuyagbaatar; Park, Hyeon Ah; Choi, Nak Hyeon; Ryu, Keun Ho

2015-01-01

Chemical and biomedical Named Entity Recognition (NER) is an essential prerequisite task before effective text mining can begin for biochemical-text data. Exploiting unlabeled text data to leverage system performance has been an active and challenging research topic in text mining due to the recent growth in the amount of biomedical literature. We present a semi-supervised learning method that efficiently exploits unlabeled data in order to incorporate domain knowledge into a named entity recognition model and to leverage system performance. The proposed method includes Natural Language Processing (NLP) tasks for text preprocessing, learning word representation features from a large amount of text data for feature extraction, and conditional random fields for token classification. Other than the free text in the domain, the proposed method does not rely on any lexicon nor any dictionary in order to keep the system applicable to other NER tasks in bio-text data. We extended BANNER, a biomedical NER system, with the proposed method. This yields an integrated system that can be applied to chemical and drug NER or biomedical NER. We call our branch of the BANNER system BANNER-CHEMDNER, which is scalable over millions of documents, processing about 530 documents per minute, is configurable via XML, and can be plugged into other systems by using the BANNER Unstructured Information Management Architecture (UIMA) interface. BANNER-CHEMDNER achieved an 85.68% and an 86.47% F-measure on the testing sets of CHEMDNER Chemical Entity Mention (CEM) and Chemical Document Indexing (CDI) subtasks, respectively, and achieved an 87.04% F-measure on the official testing set of the BioCreative II gene mention task, showing remarkable performance in both chemical and biomedical NER. BANNER-CHEMDNER system is available at: https://bitbucket.org/tsendeemts/banner-chemdner.
Reading Faces: From Features to Recognition.

PubMed

Guntupalli, J Swaroop; Gobbini, M Ida

2017-12-01

Chang and Tsao recently reported that the monkey face patch system encodes facial identity in a space of facial features as opposed to exemplars. Here, we discuss how such coding might contribute to face recognition, emphasizing the critical role of learning and interactions with other brain areas for optimizing the recognition of familiar faces. Copyright © 2017 Elsevier Ltd. All rights reserved.
Tensor Rank Preserving Discriminant Analysis for Facial Recognition.

PubMed

Tao, Dapeng; Guo, Yanan; Li, Yaotang; Gao, Xinbo

2017-10-12

Facial recognition, one of the basic topics in computer vision and pattern recognition, has received substantial attention in recent years. However, for those traditional facial recognition algorithms, the facial images are reshaped to a long vector, thereby losing part of the original spatial constraints of each pixel. In this paper, a new tensor-based feature extraction algorithm termed tensor rank preserving discriminant analysis (TRPDA) for facial image recognition is proposed; the proposed method involves two stages: in the first stage, the low-dimensional tensor subspace of the original input tensor samples was obtained; in the second stage, discriminative locality alignment was utilized to obtain the ultimate vector feature representation for subsequent facial recognition. On the one hand, the proposed TRPDA algorithm fully utilizes the natural structure of the input samples, and it applies an optimization criterion that can directly handle the tensor spectral analysis problem, thereby decreasing the computation cost compared those traditional tensor-based feature selection algorithms. On the other hand, the proposed TRPDA algorithm extracts feature by finding a tensor subspace that preserves most of the rank order information of the intra-class input samples. Experiments on the three facial databases are performed here to determine the effectiveness of the proposed TRPDA algorithm.
Multi-channel feature dictionaries for RGB-D object recognition

NASA Astrophysics Data System (ADS)

Lan, Xiaodong; Li, Qiming; Chong, Mina; Song, Jian; Li, Jun

2018-04-01

Hierarchical matching pursuit (HMP) is a popular feature learning method for RGB-D object recognition. However, the feature representation with only one dictionary for RGB channels in HMP does not capture sufficient visual information. In this paper, we propose multi-channel feature dictionaries based feature learning method for RGB-D object recognition. The process of feature extraction in the proposed method consists of two layers. The K-SVD algorithm is used to learn dictionaries in sparse coding of these two layers. In the first-layer, we obtain features by performing max pooling on sparse codes of pixels in a cell. And the obtained features of cells in a patch are concatenated to generate patch jointly features. Then, patch jointly features in the first-layer are used to learn the dictionary and sparse codes in the second-layer. Finally, spatial pyramid pooling can be applied to the patch jointly features of any layer to generate the final object features in our method. Experimental results show that our method with first or second-layer features can obtain a comparable or better performance than some published state-of-the-art methods.
Learning representation hierarchies by sharing visual features: a computational investigation of Persian character recognition with unsupervised deep learning.

PubMed

Sadeghi, Zahra; Testolin, Alberto

2017-08-01

In humans, efficient recognition of written symbols is thought to rely on a hierarchical processing system, where simple features are progressively combined into more abstract, high-level representations. Here, we present a computational model of Persian character recognition based on deep belief networks, where increasingly more complex visual features emerge in a completely unsupervised manner by fitting a hierarchical generative model to the sensory data. Crucially, high-level internal representations emerging from unsupervised deep learning can be easily read out by a linear classifier, achieving state-of-the-art recognition accuracy. Furthermore, we tested the hypothesis that handwritten digits and letters share many common visual features: A generative model that captures the statistical structure of the letters distribution should therefore also support the recognition of written digits. To this aim, deep networks trained on Persian letters were used to build high-level representations of Persian digits, which were indeed read out with high accuracy. Our simulations show that complex visual features, such as those mediating the identification of Persian symbols, can emerge from unsupervised learning in multilayered neural networks and can support knowledge transfer across related domains.
Multimodal biometric method that combines veins, prints, and shape of a finger

NASA Astrophysics Data System (ADS)

Kang, Byung Jun; Park, Kang Ryoung; Yoo, Jang-Hee; Kim, Jeong Nyeo

2011-01-01

Multimodal biometrics provides high recognition accuracy and population coverage by using various biometric features. A single finger contains finger veins, fingerprints, and finger geometry features; by using multimodal biometrics, information on these multiple features can be simultaneously obtained in a short time and their fusion can outperform the use of a single feature. This paper proposes a new finger recognition method based on the score-level fusion of finger veins, fingerprints, and finger geometry features. This research is novel in the following four ways. First, the performances of the finger-vein and fingerprint recognition are improved by using a method based on a local derivative pattern. Second, the accuracy of the finger geometry recognition is greatly increased by combining a Fourier descriptor with principal component analysis. Third, a fuzzy score normalization method is introduced; its performance is better than the conventional Z-score normalization method. Fourth, finger-vein, fingerprint, and finger geometry recognitions are combined by using three support vector machines and a weighted SUM rule. Experimental results showed that the equal error rate of the proposed method was 0.254%, which was lower than those of the other methods.
Incoherent optical generalized Hough transform: pattern recognition and feature extraction applications

NASA Astrophysics Data System (ADS)

Fernández, Ariel; Ferrari, José A.

2017-05-01

Pattern recognition and feature extraction are image processing applications of great interest in defect inspection and robot vision among others. In comparison to purely digital methods, the attractiveness of optical processors for pattern recognition lies in their highly parallel operation and real-time processing capability. This work presents an optical implementation of the generalized Hough transform (GHT), a well-established technique for recognition of geometrical features in binary images. Detection of a geometric feature under the GHT is accomplished by mapping the original image to an accumulator space; the large computational requirements for this mapping make the optical implementation an attractive alternative to digital-only methods. We explore an optical setup where the transformation is obtained, and the size and orientation parameters can be controlled, allowing for dynamic scale and orientation-variant pattern recognition. A compact system for the above purposes results from the use of an electrically tunable lens for scale control and a pupil mask implemented on a high-contrast spatial light modulator for orientation/shape variation of the template. Real-time can also be achieved. In addition, by thresholding of the GHT and optically inverse transforming, the previously detected features of interest can be extracted.
An adaptive deep Q-learning strategy for handwritten digit recognition.

PubMed

Qiao, Junfei; Wang, Gongming; Li, Wenjing; Chen, Min

2018-02-22

Handwritten digits recognition is a challenging problem in recent years. Although many deep learning-based classification algorithms are studied for handwritten digits recognition, the recognition accuracy and running time still need to be further improved. In this paper, an adaptive deep Q-learning strategy is proposed to improve accuracy and shorten running time for handwritten digit recognition. The adaptive deep Q-learning strategy combines the feature-extracting capability of deep learning and the decision-making of reinforcement learning to form an adaptive Q-learning deep belief network (Q-ADBN). First, Q-ADBN extracts the features of original images using an adaptive deep auto-encoder (ADAE), and the extracted features are considered as the current states of Q-learning algorithm. Second, Q-ADBN receives Q-function (reward signal) during recognition of the current states, and the final handwritten digits recognition is implemented by maximizing the Q-function using Q-learning algorithm. Finally, experimental results from the well-known MNIST dataset show that the proposed Q-ADBN has a superiority to other similar methods in terms of accuracy and running time. Copyright © 2018 Elsevier Ltd. All rights reserved.
Modeling of skin cancer dermatoscopy images

NASA Astrophysics Data System (ADS)

Iralieva, Malica B.; Myakinin, Oleg O.; Bratchenko, Ivan A.; Zakharov, Valery P.

2018-04-01

An early identified cancer is more likely to effective respond to treatment and has a less expensive treatment as well. Dermatoscopy is one of general diagnostic techniques for skin cancer early detection that allows us in vivo evaluation of colors and microstructures on skin lesions. Digital phantoms with known properties are required during new instrument developing to compare sample's features with data from the instrument. An algorithm for image modeling of skin cancer is proposed in the paper. Steps of the algorithm include setting shape, texture generation, adding texture and normal skin background setting. The Gaussian represents the shape, and then the texture generation based on a fractal noise algorithm is responsible for spatial chromophores distributions, while the colormap applied to the values corresponds to spectral properties. Finally, a normal skin image simulated by mixed Monte Carlo method using a special online tool is added as a background. Varying of Asymmetry, Borders, Colors and Diameter settings is shown to be fully matched to the ABCD clinical recognition algorithm. The asymmetry is specified by setting different standard deviation values of Gaussian in different parts of image. The noise amplitude is increased to set the irregular borders score. Standard deviation is changed to determine size of the lesion. Colors are set by colormap changing. The algorithm for simulating different structural elements is required to match with others recognition algorithms.
Method to assess the temporal persistence of potential biometric features: Application to oculomotor, gait, face and brain structure databases

PubMed Central

Nixon, Mark S.; Komogortsev, Oleg V.

2017-01-01

We introduce the intraclass correlation coefficient (ICC) to the biometric community as an index of the temporal persistence, or stability, of a single biometric feature. It requires, as input, a feature on an interval or ratio scale, and which is reasonably normally distributed, and it can only be calculated if each subject is tested on 2 or more occasions. For a biometric system, with multiple features available for selection, the ICC can be used to measure the relative stability of each feature. We show, for 14 distinct data sets (1 synthetic, 8 eye-movement-related, 2 gait-related, and 2 face-recognition-related, and one brain-structure-related), that selecting the most stable features, based on the ICC, resulted in the best biometric performance generally. Analyses based on using only the most stable features produced superior Rank-1-Identification Rate (Rank-1-IR) performance in 12 of 14 databases (p = 0.0065, one-tailed), when compared to other sets of features, including the set of all features. For Equal Error Rate (EER), using a subset of only high-ICC features also produced superior performance in 12 of 14 databases (p = 0. 0065, one-tailed). In general, then, for our databases, prescreening potential biometric features, and choosing only highly reliable features yields better performance than choosing lower ICC features or than choosing all features combined. We also determined that, as the ICC of a group of features increases, the median of the genuine similarity score distribution increases and the spread of this distribution decreases. There was no statistically significant similar relationships for the impostor distributions. We believe that the ICC will find many uses in biometric research. In case of the eye movement-driven biometrics, the use of reliable features, as measured by ICC, allowed to us achieve the authentication performance with EER = 2.01%, which was not possible before. PMID:28575030
Method to assess the temporal persistence of potential biometric features: Application to oculomotor, gait, face and brain structure databases.

PubMed

Friedman, Lee; Nixon, Mark S; Komogortsev, Oleg V

2017-01-01

We introduce the intraclass correlation coefficient (ICC) to the biometric community as an index of the temporal persistence, or stability, of a single biometric feature. It requires, as input, a feature on an interval or ratio scale, and which is reasonably normally distributed, and it can only be calculated if each subject is tested on 2 or more occasions. For a biometric system, with multiple features available for selection, the ICC can be used to measure the relative stability of each feature. We show, for 14 distinct data sets (1 synthetic, 8 eye-movement-related, 2 gait-related, and 2 face-recognition-related, and one brain-structure-related), that selecting the most stable features, based on the ICC, resulted in the best biometric performance generally. Analyses based on using only the most stable features produced superior Rank-1-Identification Rate (Rank-1-IR) performance in 12 of 14 databases (p = 0.0065, one-tailed), when compared to other sets of features, including the set of all features. For Equal Error Rate (EER), using a subset of only high-ICC features also produced superior performance in 12 of 14 databases (p = 0. 0065, one-tailed). In general, then, for our databases, prescreening potential biometric features, and choosing only highly reliable features yields better performance than choosing lower ICC features or than choosing all features combined. We also determined that, as the ICC of a group of features increases, the median of the genuine similarity score distribution increases and the spread of this distribution decreases. There was no statistically significant similar relationships for the impostor distributions. We believe that the ICC will find many uses in biometric research. In case of the eye movement-driven biometrics, the use of reliable features, as measured by ICC, allowed to us achieve the authentication performance with EER = 2.01%, which was not possible before.
The effects of TIS and MI on the texture features in ultrasonic fatty liver images

NASA Astrophysics Data System (ADS)

Zhao, Yuan; Cheng, Xinyao; Ding, Mingyue

2017-03-01

Nonalcoholic fatty liver disease (NAFLD) is prevalent and has a worldwide distribution now. Although ultrasound imaging technology has been deemed as the common method to diagnose fatty liver, it is not able to detect NAFLD in its early stage and limited by the diagnostic instruments and some other factors. B-scan image feature extraction of fatty liver can assist doctor to analyze the patient's situation and enhance the efficiency and accuracy of clinical diagnoses. However, some uncertain factors in ultrasonic diagnoses are often been ignored during feature extraction. In this study, the nonalcoholic fatty liver rabbit model was made and its liver ultrasound images were collected by setting different Thermal index of soft tissue (TIS) and mechanical index (MI). Then, texture features were calculated based on gray level co-occurrence matrix (GLCM) and the impacts of TIS and MI on these features were analyzed and discussed. Furthermore, the receiver operating characteristic (ROC) curve was used to evaluate whether each feature was effective or not when TIS and MI were given. The results showed that TIS and MI do affect the features extracted from the healthy liver, while the texture features of fatty liver are relatively stable. In addition, TIS set to 0.3 and MI equal to 0.9 might be a better choice when using a computer aided diagnosis (CAD) method for fatty liver recognition.

Research on improving image recognition robustness by combining multiple features with associative memory

NASA Astrophysics Data System (ADS)

Guo, Dongwei; Wang, Zhe

2018-05-01

Convolutional neural networks (CNN) achieve great success in computer vision, it can learn hierarchical representation from raw pixels and has outstanding performance in various image recognition tasks [1]. However, CNN is easy to be fraudulent in terms of it is possible to produce images totally unrecognizable to human eyes that CNNs believe with near certainty are familiar objects. [2]. In this paper, an associative memory model based on multiple features is proposed. Within this model, feature extraction and classification are carried out by CNN, T-SNE and exponential bidirectional associative memory neural network (EBAM). The geometric features extracted from CNN and the digital features extracted from T-SNE are associated by EBAM. Thus we ensure the recognition of robustness by a comprehensive assessment of the two features. In our model, we can get only 8% error rate with fraudulent data. In systems that require a high safety factor or some key areas, strong robustness is extremely important, if we can ensure the image recognition robustness, network security will be greatly improved and the social production efficiency will be extremely enhanced.
Feature extraction for face recognition via Active Shape Model (ASM) and Active Appearance Model (AAM)

NASA Astrophysics Data System (ADS)

Iqtait, M.; Mohamad, F. S.; Mamat, M.

2018-03-01

Biometric is a pattern recognition system which is used for automatic recognition of persons based on characteristics and features of an individual. Face recognition with high recognition rate is still a challenging task and usually accomplished in three phases consisting of face detection, feature extraction, and expression classification. Precise and strong location of trait point is a complicated and difficult issue in face recognition. Cootes proposed a Multi Resolution Active Shape Models (ASM) algorithm, which could extract specified shape accurately and efficiently. Furthermore, as the improvement of ASM, Active Appearance Models algorithm (AAM) is proposed to extracts both shape and texture of specified object simultaneously. In this paper we give more details about the two algorithms and give the results of experiments, testing their performance on one dataset of faces. We found that the ASM is faster and gains more accurate trait point location than the AAM, but the AAM gains a better match to the texture.
Automatic detection and recognition of traffic signs in stereo images based on features and probabilistic neural networks

NASA Astrophysics Data System (ADS)

Sheng, Yehua; Zhang, Ka; Ye, Chun; Liang, Cheng; Li, Jian

2008-04-01

Considering the problem of automatic traffic sign detection and recognition in stereo images captured under motion conditions, a new algorithm for traffic sign detection and recognition based on features and probabilistic neural networks (PNN) is proposed in this paper. Firstly, global statistical color features of left image are computed based on statistics theory. Then for red, yellow and blue traffic signs, left image is segmented to three binary images by self-adaptive color segmentation method. Secondly, gray-value projection and shape analysis are used to confirm traffic sign regions in left image. Then stereo image matching is used to locate the homonymy traffic signs in right image. Thirdly, self-adaptive image segmentation is used to extract binary inner core shapes of detected traffic signs. One-dimensional feature vectors of inner core shapes are computed by central projection transformation. Fourthly, these vectors are input to the trained probabilistic neural networks for traffic sign recognition. Lastly, recognition results in left image are compared with recognition results in right image. If results in stereo images are identical, these results are confirmed as final recognition results. The new algorithm is applied to 220 real images of natural scenes taken by the vehicle-borne mobile photogrammetry system in Nanjing at different time. Experimental results show a detection and recognition rate of over 92%. So the algorithm is not only simple, but also reliable and high-speed on real traffic sign detection and recognition. Furthermore, it can obtain geometrical information of traffic signs at the same time of recognizing their types.
Vehicle Color Recognition with Vehicle-Color Saliency Detection and Dual-Orientational Dimensionality Reduction of CNN Deep Features

NASA Astrophysics Data System (ADS)

Zhang, Qiang; Li, Jiafeng; Zhuo, Li; Zhang, Hui; Li, Xiaoguang

2017-12-01

Color is one of the most stable attributes of vehicles and often used as a valuable cue in some important applications. Various complex environmental factors, such as illumination, weather, noise and etc., result in the visual characteristics of the vehicle color being obvious diversity. Vehicle color recognition in complex environments has been a challenging task. The state-of-the-arts methods roughly take the whole image for color recognition, but many parts of the images such as car windows; wheels and background contain no color information, which will have negative impact on the recognition accuracy. In this paper, a novel vehicle color recognition method using local vehicle-color saliency detection and dual-orientational dimensionality reduction of convolutional neural network (CNN) deep features has been proposed. The novelty of the proposed method includes two parts: (1) a local vehicle-color saliency detection method has been proposed to determine the vehicle color region of the vehicle image and exclude the influence of non-color regions on the recognition accuracy; (2) dual-orientational dimensionality reduction strategy has been designed to greatly reduce the dimensionality of deep features that are learnt from CNN, which will greatly mitigate the storage and computational burden of the subsequent processing, while improving the recognition accuracy. Furthermore, linear support vector machine is adopted as the classifier to train the dimensionality reduced features to obtain the recognition model. The experimental results on public dataset demonstrate that the proposed method can achieve superior recognition performance over the state-of-the-arts methods.
Solving problems by interrogating sets of knowledge systems: Toward a theory of multiple knowledge systems

NASA Technical Reports Server (NTRS)

Dekorvin, Andre

1989-01-01

The main purpose is to develop a theory for multiple knowledge systems. A knowledge system could be a sensor or an expert system, but it must specialize in one feature. The problem is that we have an exhaustive list of possible answers to some query (such as what object is it). By collecting different feature values, in principle, it should be possible to give an answer to the query, or at least narrow down the list. Since a sensor, or for that matter an expert system, does not in most cases yield a precise value for the feature, uncertainty must be built into the model. Also, researchers must have a formal mechanism to be able to put the information together. Researchers chose to use the Dempster-Shafer approach to handle the problems mentioned above. Researchers introduce the concept of a state of recognition and point out that there is a relation between receiving updates and defining a set valued Markov Chain. Also, deciding what the value of the next set valued variable is can be phrased in terms of classical decision making theory such as minimizing the maximum regret. Other related problems are examined.
Fuzzy set methods for object recognition in space applications

NASA Technical Reports Server (NTRS)

Keller, James M.

1991-01-01

Progress on the following tasks is reported: (1) fuzzy set-based decision making methodologies; (2) feature calculation; (3) clustering for curve and surface fitting; and (4) acquisition of images. The general structure for networks based on fuzzy set connectives which are being used for information fusion and decision making in space applications is described. The structure and training techniques for such networks consisting of generalized means and gamma-operators are described. The use of other hybrid operators in multicriteria decision making is currently being examined. Numerous classical features on image regions such as gray level statistics, edge and curve primitives, texture measures from cooccurrance matrix, and size and shape parameters were implemented. Several fractal geometric features which may have a considerable impact on characterizing cluttered background, such as clouds, dense star patterns, or some planetary surfaces, were used. A new approach to a fuzzy C-shell algorithm is addressed. NASA personnel are in the process of acquiring suitable simulation data and hopefully videotaped actual shuttle imagery. Photographs have been digitized to use in the algorithms. Also, a model of the shuttle was assembled and a mechanism to orient this model in 3-D to digitize for experiments on pose estimation is being constructed.
Face-iris multimodal biometric scheme based on feature level fusion

NASA Astrophysics Data System (ADS)

Huo, Guang; Liu, Yuanning; Zhu, Xiaodong; Dong, Hongxing; He, Fei

2015-11-01

Unlike score level fusion, feature level fusion demands all the features extracted from unimodal traits with high distinguishability, as well as homogeneity and compatibility, which is difficult to achieve. Therefore, most multimodal biometric research focuses on score level fusion, whereas few investigate feature level fusion. We propose a face-iris recognition method based on feature level fusion. We build a special two-dimensional-Gabor filter bank to extract local texture features from face and iris images, and then transform them by histogram statistics into an energy-orientation variance histogram feature with lower dimensions and higher distinguishability. Finally, through a fusion-recognition strategy based on principal components analysis and support vector machine (FRSPS), feature level fusion and one-to-n identification are accomplished. The experimental results demonstrate that this method can not only effectively extract face and iris features but also provide higher recognition accuracy. Compared with some state-of-the-art fusion methods, the proposed method has a significant performance advantage.
Image preprocessing study on KPCA-based face recognition

NASA Astrophysics Data System (ADS)

Li, Xuan; Li, Dehua

2015-12-01

Face recognition as an important biometric identification method, with its friendly, natural, convenient advantages, has obtained more and more attention. This paper intends to research a face recognition system including face detection, feature extraction and face recognition, mainly through researching on related theory and the key technology of various preprocessing methods in face detection process, using KPCA method, focuses on the different recognition results in different preprocessing methods. In this paper, we choose YCbCr color space for skin segmentation and choose integral projection for face location. We use erosion and dilation of the opening and closing operation and illumination compensation method to preprocess face images, and then use the face recognition method based on kernel principal component analysis method for analysis and research, and the experiments were carried out using the typical face database. The algorithms experiment on MATLAB platform. Experimental results show that integration of the kernel method based on PCA algorithm under certain conditions make the extracted features represent the original image information better for using nonlinear feature extraction method, which can obtain higher recognition rate. In the image preprocessing stage, we found that images under various operations may appear different results, so as to obtain different recognition rate in recognition stage. At the same time, in the process of the kernel principal component analysis, the value of the power of the polynomial function can affect the recognition result.
Face recognition algorithm based on Gabor wavelet and locality preserving projections

NASA Astrophysics Data System (ADS)

Liu, Xiaojie; Shen, Lin; Fan, Honghui

2017-07-01

In order to solve the effects of illumination changes and differences of personal features on the face recognition rate, this paper presents a new face recognition algorithm based on Gabor wavelet and Locality Preserving Projections (LPP). The problem of the Gabor filter banks with high dimensions was solved effectively, and also the shortcoming of the LPP on the light illumination changes was overcome. Firstly, the features of global image information were achieved, which used the good spatial locality and orientation selectivity of Gabor wavelet filters. Then the dimensions were reduced by utilizing the LPP, which well-preserved the local information of the image. The experimental results shown that this algorithm can effectively extract the features relating to facial expressions, attitude and other information. Besides, it can reduce influence of the illumination changes and the differences in personal features effectively, which improves the face recognition rate to 99.2%.
The recognition of emotional expression in prosopagnosia: decoding whole and part faces.

PubMed

Stephan, Blossom Christa Maree; Breen, Nora; Caine, Diana

2006-11-01

Prosopagnosia is currently viewed within the constraints of two competing theories of face recognition, one highlighting the analysis of features, the other focusing on configural processing of the whole face. This study investigated the role of feature analysis versus whole face configural processing in the recognition of facial expression. A prosopagnosic patient, SC made expression decisions from whole and incomplete (eyes-only and mouth-only) faces where features had been obscured. SC was impaired at recognizing some (e.g., anger, sadness, and fear), but not all (e.g., happiness) emotional expressions from the whole face. Analyses of his performance on incomplete faces indicated that his recognition of some expressions actually improved relative to his performance on the whole face condition. We argue that in SC interference from damaged configural processes seem to override an intact ability to utilize part-based or local feature cues.
Tuberculosis disease diagnosis using artificial immune recognition system.

PubMed

Shamshirband, Shahaboddin; Hessam, Somayeh; Javidnia, Hossein; Amiribesheli, Mohsen; Vahdat, Shaghayegh; Petković, Dalibor; Gani, Abdullah; Kiah, Miss Laiha Mat

2014-01-01

There is a high risk of tuberculosis (TB) disease diagnosis among conventional methods. This study is aimed at diagnosing TB using hybrid machine learning approaches. Patient epicrisis reports obtained from the Pasteur Laboratory in the north of Iran were used. All 175 samples have twenty features. The features are classified based on incorporating a fuzzy logic controller and artificial immune recognition system. The features are normalized through a fuzzy rule based on a labeling system. The labeled features are categorized into normal and tuberculosis classes using the Artificial Immune Recognition Algorithm. Overall, the highest classification accuracy reached was for the 0.8 learning rate (α) values. The artificial immune recognition system (AIRS) classification approaches using fuzzy logic also yielded better diagnosis results in terms of detection accuracy compared to other empirical methods. Classification accuracy was 99.14%, sensitivity 87.00%, and specificity 86.12%.
Learning dictionaries of sparse codes of 3D movements of body joints for real-time human activity understanding.

PubMed

Qi, Jin; Yang, Zhiyong

2014-01-01

Real-time human activity recognition is essential for human-robot interactions for assisted healthy independent living. Most previous work in this area is performed on traditional two-dimensional (2D) videos and both global and local methods have been used. Since 2D videos are sensitive to changes of lighting condition, view angle, and scale, researchers begun to explore applications of 3D information in human activity understanding in recently years. Unfortunately, features that work well on 2D videos usually don't perform well on 3D videos and there is no consensus on what 3D features should be used. Here we propose a model of human activity recognition based on 3D movements of body joints. Our method has three steps, learning dictionaries of sparse codes of 3D movements of joints, sparse coding, and classification. In the first step, space-time volumes of 3D movements of body joints are obtained via dense sampling and independent component analysis is then performed to construct a dictionary of sparse codes for each activity. In the second step, the space-time volumes are projected to the dictionaries and a set of sparse histograms of the projection coefficients are constructed as feature representations of the activities. Finally, the sparse histograms are used as inputs to a support vector machine to recognize human activities. We tested this model on three databases of human activities and found that it outperforms the state-of-the-art algorithms. Thus, this model can be used for real-time human activity recognition in many applications.
Video2vec Embeddings Recognize Events When Examples Are Scarce.

PubMed

Habibian, Amirhossein; Mensink, Thomas; Snoek, Cees G M

2017-10-01

This paper aims for event recognition when video examples are scarce or even completely absent. The key in such a challenging setting is a semantic video representation. Rather than building the representation from individual attribute detectors and their annotations, we propose to learn the entire representation from freely available web videos and their descriptions using an embedding between video features and term vectors. In our proposed embedding, which we call Video2vec, the correlations between the words are utilized to learn a more effective representation by optimizing a joint objective balancing descriptiveness and predictability. We show how learning the Video2vec embedding using a multimodal predictability loss, including appearance, motion and audio features, results in a better predictable representation. We also propose an event specific variant of Video2vec to learn a more accurate representation for the words, which are indicative of the event, by introducing a term sensitive descriptiveness loss. Our experiments on three challenging collections of web videos from the NIST TRECVID Multimedia Event Detection and Columbia Consumer Videos datasets demonstrate: i) the advantages of Video2vec over representations using attributes or alternative embeddings, ii) the benefit of fusing video modalities by an embedding over common strategies, iii) the complementarity of term sensitive descriptiveness and multimodal predictability for event recognition. By its ability to improve predictability of present day audio-visual video features, while at the same time maximizing their semantic descriptiveness, Video2vec leads to state-of-the-art accuracy for both few- and zero-example recognition of events in video.
Development of Personalized Urination Recognition Technology Using Smart Bands.

PubMed

Eun, Sung-Jong; Whangbo, Taeg-Keun; Park, Dong Kyun; Kim, Khae-Hawn

2017-04-01

This study collected and analyzed activity data sensed through smart bands worn by patients in order to resolve the clinical issues posed by using voiding charts. By developing a smart band-based algorithm for recognizing urination activity in patients, this study aimed to explore the feasibility of urination monitoring systems. This study aimed to develop an algorithm that recognizes urination based on a patient's posture and changes in posture. Motion data was obtained from a smart band on the arm. An algorithm that recognizes the 3 stages of urination (forward movement, urination, backward movement) was developed based on data collected from a 3-axis accelerometer and from tilt angle data. Real-time data were acquired from the smart band, and for data corresponding to a certain duration, the absolute value of the signals was calculated and then compared with the set threshold value to determine the occurrence of vibration signals. In feature extraction, the most essential information describing each pattern was identified after analyzing the characteristics of the data. The results of the feature extraction process were sorted using a classifier to detect urination. An experiment was carried out to assess the performance of the recognition technology proposed in this study. The final accuracy of the algorithm was calculated based on clinical guidelines for urologists. The experiment showed a high average accuracy of 90.4%, proving the robustness of the proposed algorithm. The proposed urination recognition technology draws on acceleration data and tilt angle data collected via a smart band; these data were then analyzed using a classifier after comparative analyses with standardized feature patterns.
Machinery running state identification based on discriminant semi-supervised local tangent space alignment for feature fusion and extraction

NASA Astrophysics Data System (ADS)

Su, Zuqiang; Xiao, Hong; Zhang, Yi; Tang, Baoping; Jiang, Yonghua

2017-04-01

Extraction of sensitive features is a challenging but key task in data-driven machinery running state identification. Aimed at solving this problem, a method for machinery running state identification that applies discriminant semi-supervised local tangent space alignment (DSS-LTSA) for feature fusion and extraction is proposed. Firstly, in order to extract more distinct features, the vibration signals are decomposed by wavelet packet decomposition WPD, and a mixed-domain feature set consisted of statistical features, autoregressive (AR) model coefficients, instantaneous amplitude Shannon entropy and WPD energy spectrum is extracted to comprehensively characterize the properties of machinery running state(s). Then, the mixed-dimension feature set is inputted into DSS-LTSA for feature fusion and extraction to eliminate redundant information and interference noise. The proposed DSS-LTSA can extract intrinsic structure information of both labeled and unlabeled state samples, and as a result the over-fitting problem of supervised manifold learning and blindness problem of unsupervised manifold learning are overcome. Simultaneously, class discrimination information is integrated within the dimension reduction process in a semi-supervised manner to improve sensitivity of the extracted fusion features. Lastly, the extracted fusion features are inputted into a pattern recognition algorithm to achieve the running state identification. The effectiveness of the proposed method is verified by a running state identification case in a gearbox, and the results confirm the improved accuracy of the running state identification.
Own- and Other-Race Face Identity Recognition in Children: The Effects of Pose and Feature Composition

ERIC Educational Resources Information Center

Anzures, Gizelle; Kelly, David J.; Pascalis, Olivier; Quinn, Paul C.; Slater, Alan M.; de Viviés, Xavier; Lee, Kang

2014-01-01

We used a matching-to-sample task and manipulated facial pose and feature composition to examine the other-race effect (ORE) in face identity recognition between 5 and 10 years of age. Overall, the present findings provide a genuine measure of own- and other-race face identity recognition in children that is independent of photographic and image…
Recognition of handwritten similar Chinese characters by self-growing probabilistic decision-based neural network.

PubMed

Fu, H C; Xu, Y Y; Chang, H Y

1999-12-01

Recognition of similar (confusion) characters is a difficult problem in optical character recognition (OCR). In this paper, we introduce a neural network solution that is capable of modeling minor differences among similar characters, and is robust to various personal handwriting styles. The Self-growing Probabilistic Decision-based Neural Network (SPDNN) is a probabilistic type neural network, which adopts a hierarchical network structure with nonlinear basis functions and a competitive credit-assignment scheme. Based on the SPDNN model, we have constructed a three-stage recognition system. First, a coarse classifier determines a character to be input to one of the pre-defined subclasses partitioned from a large character set, such as Chinese mixed with alphanumerics. Then a character recognizer determines the input image which best matches the reference character in the subclass. Lastly, the third module is a similar character recognizer, which can further enhance the recognition accuracy among similar or confusing characters. The prototype system has demonstrated a successful application of SPDNN to similar handwritten Chinese recognition for the public database CCL/HCCR1 (5401 characters x200 samples). Regarding performance, experiments on the CCL/HCCR1 database produced 90.12% recognition accuracy with no rejection, and 94.11% accuracy with 6.7% rejection, respectively. This recognition accuracy represents about 4% improvement on the previously announced performance. As to processing speed, processing before recognition (including image preprocessing, segmentation, and feature extraction) requires about one second for an A4 size character image, and recognition consumes approximately 0.27 second per character on a Pentium-100 based personal computer, without use of any hardware accelerator or co-processor.
Multi-resolution analysis for ear recognition using wavelet features

NASA Astrophysics Data System (ADS)

Shoaib, M.; Basit, A.; Faye, I.

2016-11-01

Security is very important and in order to avoid any physical contact, identification of human when they are moving is necessary. Ear biometric is one of the methods by which a person can be identified using surveillance cameras. Various techniques have been proposed to increase the ear based recognition systems. In this work, a feature extraction method for human ear recognition based on wavelet transforms is proposed. The proposed features are approximation coefficients and specific details of level two after applying various types of wavelet transforms. Different wavelet transforms are applied to find the suitable wavelet. Minimum Euclidean distance is used as a matching criterion. Results achieved by the proposed method are promising and can be used in real time ear recognition system.
Recurrent neural networks with specialized word embeddings for health-domain named-entity recognition.

PubMed

Jauregi Unanue, Iñigo; Zare Borzeshi, Ehsan; Piccardi, Massimo

2017-12-01

Previous state-of-the-art systems on Drug Name Recognition (DNR) and Clinical Concept Extraction (CCE) have focused on a combination of text "feature engineering" and conventional machine learning algorithms such as conditional random fields and support vector machines. However, developing good features is inherently heavily time-consuming. Conversely, more modern machine learning approaches such as recurrent neural networks (RNNs) have proved capable of automatically learning effective features from either random assignments or automated word "embeddings". (i) To create a highly accurate DNR and CCE system that avoids conventional, time-consuming feature engineering. (ii) To create richer, more specialized word embeddings by using health domain datasets such as MIMIC-III. (iii) To evaluate our systems over three contemporary datasets. Two deep learning methods, namely the Bidirectional LSTM and the Bidirectional LSTM-CRF, are evaluated. A CRF model is set as the baseline to compare the deep learning systems to a traditional machine learning approach. The same features are used for all the models. We have obtained the best results with the Bidirectional LSTM-CRF model, which has outperformed all previously proposed systems. The specialized embeddings have helped to cover unusual words in DrugBank and MedLine, but not in the i2b2/VA dataset. We present a state-of-the-art system for DNR and CCE. Automated word embeddings has allowed us to avoid costly feature engineering and achieve higher accuracy. Nevertheless, the embeddings need to be retrained over datasets that are adequate for the domain, in order to adequately cover the domain-specific vocabulary. Copyright © 2017 Elsevier Inc. All rights reserved.
Integration trumps selection in object recognition.

PubMed

Saarela, Toni P; Landy, Michael S

2015-03-30

Finding and recognizing objects is a fundamental task of vision. Objects can be defined by several "cues" (color, luminance, texture, etc.), and humans can integrate sensory cues to improve detection and recognition [1-3]. Cortical mechanisms fuse information from multiple cues [4], and shape-selective neural mechanisms can display cue invariance by responding to a given shape independent of the visual cue defining it [5-8]. Selective attention, in contrast, improves recognition by isolating a subset of the visual information [9]. Humans can select single features (red or vertical) within a perceptual dimension (color or orientation), giving faster and more accurate responses to items having the attended feature [10, 11]. Attention elevates neural responses and sharpens neural tuning to the attended feature, as shown by studies in psychophysics and modeling [11, 12], imaging [13-16], and single-cell and neural population recordings [17, 18]. Besides single features, attention can select whole objects [19-21]. Objects are among the suggested "units" of attention because attention to a single feature of an object causes the selection of all of its features [19-21]. Here, we pit integration against attentional selection in object recognition. We find, first, that humans can integrate information near optimally from several perceptual dimensions (color, texture, luminance) to improve recognition. They cannot, however, isolate a single dimension even when the other dimensions provide task-irrelevant, potentially conflicting information. For object recognition, it appears that there is mandatory integration of information from multiple dimensions of visual experience. The advantage afforded by this integration, however, comes at the expense of attentional selection. Copyright © 2015 Elsevier Ltd. All rights reserved.

Integration trumps selection in object recognition

PubMed Central

Saarela, Toni P.; Landy, Michael S.

2015-01-01

Summary Finding and recognizing objects is a fundamental task of vision. Objects can be defined by several “cues” (color, luminance, texture etc.), and humans can integrate sensory cues to improve detection and recognition [1–3]. Cortical mechanisms fuse information from multiple cues [4], and shape-selective neural mechanisms can display cue-invariance by responding to a given shape independent of the visual cue defining it [5–8]. Selective attention, in contrast, improves recognition by isolating a subset of the visual information [9]. Humans can select single features (red or vertical) within a perceptual dimension (color or orientation), giving faster and more accurate responses to items having the attended feature [10,11]. Attention elevates neural responses and sharpens neural tuning to the attended feature, as shown by studies in psychophysics and modeling [11,12], imaging [13–16], and single-cell and neural population recordings [17,18]. Besides single features, attention can select whole objects [19–21]. Objects are among the suggested “units” of attention because attention to a single feature of an object causes the selection of all of its features [19–21]. Here, we pit integration against attentional selection in object recognition. We find, first, that humans can integrate information near-optimally from several perceptual dimensions (color, texture, luminance) to improve recognition. They cannot, however, isolate a single dimension even when the other dimensions provide task-irrelevant, potentially conflicting information. For object recognition, it appears that there is mandatory integration of information from multiple dimensions of visual experience. The advantage afforded by this integration, however, comes at the expense of attentional selection. PMID:25802154
Metric Learning for Hyperspectral Image Segmentation

NASA Technical Reports Server (NTRS)

Bue, Brian D.; Thompson, David R.; Gilmore, Martha S.; Castano, Rebecca

2011-01-01

We present a metric learning approach to improve the performance of unsupervised hyperspectral image segmentation. Unsupervised spatial segmentation can assist both user visualization and automatic recognition of surface features. Analysts can use spatially-continuous segments to decrease noise levels and/or localize feature boundaries. However, existing segmentation methods use tasks-agnostic measures of similarity. Here we learn task-specific similarity measures from training data, improving segment fidelity to classes of interest. Multiclass Linear Discriminate Analysis produces a linear transform that optimally separates a labeled set of training classes. The defines a distance metric that generalized to a new scenes, enabling graph-based segmentation that emphasizes key spectral features. We describe tests based on data from the Compact Reconnaissance Imaging Spectrometer (CRISM) in which learned metrics improve segment homogeneity with respect to mineralogical classes.
Face Recognition in 4- to 7-Year-Olds: Processing of Configural, Featural, and Paraphernalia Information.

ERIC Educational Resources Information Center

Freire, Alejo; Lee, Kang

2001-01-01

Tested in two studies 4- to 7-year-olds' face recognition by manipulating the faces' configural and featural information. Found that even with only a single 5-second exposure, most children could use configural and featural cues to make identity judgments. Repeated exposure and feedback improved others' performance. Even proficient memories were…
An Automatic Registration Algorithm for 3D Maxillofacial Model

NASA Astrophysics Data System (ADS)

Qiu, Luwen; Zhou, Zhongwei; Guo, Jixiang; Lv, Jiancheng

2016-09-01

3D image registration aims at aligning two 3D data sets in a common coordinate system, which has been widely used in computer vision, pattern recognition and computer assisted surgery. One challenging problem in 3D registration is that point-wise correspondences between two point sets are often unknown apriori. In this work, we develop an automatic algorithm for 3D maxillofacial models registration including facial surface model and skull model. Our proposed registration algorithm can achieve a good alignment result between partial and whole maxillofacial model in spite of ambiguous matching, which has a potential application in the oral and maxillofacial reparative and reconstructive surgery. The proposed algorithm includes three steps: (1) 3D-SIFT features extraction and FPFH descriptors construction; (2) feature matching using SAC-IA; (3) coarse rigid alignment and refinement by ICP. Experiments on facial surfaces and mandible skull models demonstrate the efficiency and robustness of our algorithm.
Local gradient Gabor pattern (LGGP) with applications in face recognition, cross-spectral matching, and soft biometrics

NASA Astrophysics Data System (ADS)

Chen, Cunjian; Ross, Arun

2013-05-01

Researchers in face recognition have been using Gabor filters for image representation due to their robustness to complex variations in expression and illumination. Numerous methods have been proposed to model the output of filter responses by employing either local or global descriptors. In this work, we propose a novel but simple approach for encoding Gradient information on Gabor-transformed images to represent the face, which can be used for identity, gender and ethnicity assessment. Extensive experiments on the standard face benchmark FERET (Visible versus Visible), as well as the heterogeneous face dataset HFB (Near-infrared versus Visible), suggest that the matching performance due to the proposed descriptor is comparable against state-of-the-art descriptor-based approaches in face recognition applications. Furthermore, the same feature set is used in the framework of a Collaborative Representation Classification (CRC) scheme for deducing soft biometric traits such as gender and ethnicity from face images in the AR, Morph and CAS-PEAL databases.
False match elimination for face recognition based on SIFT algorithm

NASA Astrophysics Data System (ADS)

Gu, Xuyuan; Shi, Ping; Shao, Meide

2011-06-01

The SIFT (Scale Invariant Feature Transform) is a well known algorithm used to detect and describe local features in images. It is invariant to image scale, rotation and robust to the noise and illumination. In this paper, a novel method used for face recognition based on SIFT is proposed, which combines the optimization of SIFT, mutual matching and Progressive Sample Consensus (PROSAC) together and can eliminate the false matches of face recognition effectively. Experiments on ORL face database show that many false matches can be eliminated and better recognition rate is achieved.
Object recognition of real targets using modelled SAR images

NASA Astrophysics Data System (ADS)

Zherdev, D. A.

2017-12-01

In this work the problem of recognition is studied using SAR images. The algorithm of recognition is based on the computation of conjugation indices with vectors of class. The support subspaces for each class are constructed by exception of the most and the less correlated vectors in a class. In the study we examine the ability of a significant feature vector size reduce that leads to recognition time decrease. The images of targets form the feature vectors that are transformed using pre-trained convolutional neural network (CNN).
Gait recognition based on integral outline

NASA Astrophysics Data System (ADS)

Ming, Guan; Fang, Lv

2017-02-01

Biometric identification technology replaces traditional security technology, which has become a trend, and gait recognition also has become a hot spot of research because its feature is difficult to imitate and theft. This paper presents a gait recognition system based on integral outline of human body. The system has three important aspects: the preprocessing of gait image, feature extraction and classification. Finally, using a method of polling to evaluate the performance of the system, and summarizing the problems existing in the gait recognition and the direction of development in the future.
Feature Extraction Using Attributed Scattering Center Models for Model-Based Automatic Target Recognition (ATR)

DTIC Science & Technology

2005-10-01

section of the coiled arm. Right: measured realized total gain for a square spiral in free space with inductive treatment. . . . . . . . 154 8.5 Initial...appreciable velocities can often be easily separated from clutter returns, slow moving targets of more moderate cross sections can be very difficult to detect...electromagnetic radiation and measuring the energy scattered back. The data obtained as a result of this process is a finite-extent, noisy set of
Minutia Tensor Matrix: A New Strategy for Fingerprint Matching

PubMed Central

Fu, Xiang; Feng, Jufu

2015-01-01

Establishing correspondences between two minutia sets is a fundamental issue in fingerprint recognition. This paper proposes a new tensor matching strategy. First, the concept of minutia tensor matrix (simplified as MTM) is proposed. It describes the first-order features and second-order features of a matching pair. In the MTM, the diagonal elements indicate similarities of minutia pairs and non-diagonal elements indicate pairwise compatibilities between minutia pairs. Correct minutia pairs are likely to establish both large similarities and large compatibilities, so they form a dense sub-block. Minutia matching is then formulated as recovering the dense sub-block in the MTM. This is a new tensor matching strategy for fingerprint recognition. Second, as fingerprint images show both local rigidity and global nonlinearity, we design two different kinds of MTMs: local MTM and global MTM. Meanwhile, a two-level matching algorithm is proposed. For local matching level, the local MTM is constructed and a novel local similarity calculation strategy is proposed. It makes full use of local rigidity in fingerprints. For global matching level, the global MTM is constructed to calculate similarities of entire minutia sets. It makes full use of global compatibility in fingerprints. Proposed method has stronger description ability and better robustness to noise and nonlinearity. Experiments conducted on Fingerprint Verification Competition databases (FVC2002 and FVC2004) demonstrate the effectiveness and the efficiency. PMID:25822489
Face recognition using total margin-based adaptive fuzzy support vector machines.

PubMed

Liu, Yi-Hung; Chen, Yen-Ting

2007-01-01

This paper presents a new classifier called total margin-based adaptive fuzzy support vector machines (TAF-SVM) that deals with several problems that may occur in support vector machines (SVMs) when applied to the face recognition. The proposed TAF-SVM not only solves the overfitting problem resulted from the outlier with the approach of fuzzification of the penalty, but also corrects the skew of the optimal separating hyperplane due to the very imbalanced data sets by using different cost algorithm. In addition, by introducing the total margin algorithm to replace the conventional soft margin algorithm, a lower generalization error bound can be obtained. Those three functions are embodied into the traditional SVM so that the TAF-SVM is proposed and reformulated in both linear and nonlinear cases. By using two databases, the Chung Yuan Christian University (CYCU) multiview and the facial recognition technology (FERET) face databases, and using the kernel Fisher's discriminant analysis (KFDA) algorithm to extract discriminating face features, experimental results show that the proposed TAF-SVM is superior to SVM in terms of the face-recognition accuracy. The results also indicate that the proposed TAF-SVM can achieve smaller error variances than SVM over a number of tests such that better recognition stability can be obtained.
Improving Protein Fold Recognition by Deep Learning Networks

NASA Astrophysics Data System (ADS)

Jo, Taeho; Hou, Jie; Eickholt, Jesse; Cheng, Jianlin

2015-12-01

For accurate recognition of protein folds, a deep learning network method (DN-Fold) was developed to predict if a given query-template protein pair belongs to the same structural fold. The input used stemmed from the protein sequence and structural features extracted from the protein pair. We evaluated the performance of DN-Fold along with 18 different methods on Lindahl’s benchmark dataset and on a large benchmark set extracted from SCOP 1.75 consisting of about one million protein pairs, at three different levels of fold recognition (i.e., protein family, superfamily, and fold) depending on the evolutionary distance between protein sequences. The correct recognition rate of ensembled DN-Fold for Top 1 predictions is 84.5%, 61.5%, and 33.6% and for Top 5 is 91.2%, 76.5%, and 60.7% at family, superfamily, and fold levels, respectively. We also evaluated the performance of single DN-Fold (DN-FoldS), which showed the comparable results at the level of family and superfamily, compared to ensemble DN-Fold. Finally, we extended the binary classification problem of fold recognition to real-value regression task, which also show a promising performance. DN-Fold is freely available through a web server at http://iris.rnet.missouri.edu/dnfold.
Automated Melanoma Recognition in Dermoscopy Images via Very Deep Residual Networks.

PubMed

Yu, Lequan; Chen, Hao; Dou, Qi; Qin, Jing; Heng, Pheng-Ann

2017-04-01

Automated melanoma recognition in dermoscopy images is a very challenging task due to the low contrast of skin lesions, the huge intraclass variation of melanomas, the high degree of visual similarity between melanoma and non-melanoma lesions, and the existence of many artifacts in the image. In order to meet these challenges, we propose a novel method for melanoma recognition by leveraging very deep convolutional neural networks (CNNs). Compared with existing methods employing either low-level hand-crafted features or CNNs with shallower architectures, our substantially deeper networks (more than 50 layers) can acquire richer and more discriminative features for more accurate recognition. To take full advantage of very deep networks, we propose a set of schemes to ensure effective training and learning under limited training data. First, we apply the residual learning to cope with the degradation and overfitting problems when a network goes deeper. This technique can ensure that our networks benefit from the performance gains achieved by increasing network depth. Then, we construct a fully convolutional residual network (FCRN) for accurate skin lesion segmentation, and further enhance its capability by incorporating a multi-scale contextual information integration scheme. Finally, we seamlessly integrate the proposed FCRN (for segmentation) and other very deep residual networks (for classification) to form a two-stage framework. This framework enables the classification network to extract more representative and specific features based on segmented results instead of the whole dermoscopy images, further alleviating the insufficiency of training data. The proposed framework is extensively evaluated on ISBI 2016 Skin Lesion Analysis Towards Melanoma Detection Challenge dataset. Experimental results demonstrate the significant performance gains of the proposed framework, ranking the first in classification and the second in segmentation among 25 teams and 28 teams, respectively. This study corroborates that very deep CNNs with effective training mechanisms can be employed to solve complicated medical image analysis tasks, even with limited training data.
An embedded system for face classification in infrared video using sparse representation

NASA Astrophysics Data System (ADS)

Saavedra M., Antonio; Pezoa, Jorge E.; Zarkesh-Ha, Payman; Figueroa, Miguel

2017-09-01

We propose a platform for robust face recognition in Infrared (IR) images using Compressive Sensing (CS). In line with CS theory, the classification problem is solved using a sparse representation framework, where test images are modeled by means of a linear combination of the training set. Because the training set constitutes an over-complete dictionary, we identify new images by finding their sparsest representation based on the training set, using standard l1-minimization algorithms. Unlike conventional face-recognition algorithms, we feature extraction is performed using random projections with a precomputed binary matrix, as proposed in the CS literature. This random sampling reduces the effects of noise and occlusions such as facial hair, eyeglasses, and disguises, which are notoriously challenging in IR images. Thus, the performance of our framework is robust to these noise and occlusion factors, achieving an average accuracy of approximately 90% when the UCHThermalFace database is used for training and testing purposes. We implemented our framework on a high-performance embedded digital system, where the computation of the sparse representation of IR images was performed by a dedicated hardware using a deeply pipelined architecture on an Field-Programmable Gate Array (FPGA).
Scattering features for lung cancer detection in fibered confocal fluorescence microscopy images.

PubMed

Rakotomamonjy, Alain; Petitjean, Caroline; Salaün, Mathieu; Thiberville, Luc

2014-06-01

To assess the feasibility of lung cancer diagnosis using fibered confocal fluorescence microscopy (FCFM) imaging technique and scattering features for pattern recognition. FCFM imaging technique is a new medical imaging technique for which interest has yet to be established for diagnosis. This paper addresses the problem of lung cancer detection using FCFM images and, as a first contribution, assesses the feasibility of computer-aided diagnosis through these images. Towards this aim, we have built a pattern recognition scheme which involves a feature extraction stage and a classification stage. The second contribution relies on the features used for discrimination. Indeed, we have employed the so-called scattering transform for extracting discriminative features, which are robust to small deformations in the images. We have also compared and combined these features with classical yet powerful features like local binary patterns (LBP) and their variants denoted as local quinary patterns (LQP). We show that scattering features yielded to better recognition performances than classical features like LBP and their LQP variants for the FCFM image classification problems. Another finding is that LBP-based and scattering-based features provide complementary discriminative information and, in some situations, we empirically establish that performance can be improved when jointly using LBP, LQP and scattering features. In this work we analyze the joint capability of FCFM images and scattering features for lung cancer diagnosis. The proposed method achieves a good recognition rate for such a diagnosis problem. It also performs well when used in conjunction with other features for other classical medical imaging classification problems. Copyright © 2014 Elsevier B.V. All rights reserved.
Northeast Artificial Intelligence Consortium Annual Report. Volume 7. 1988 Research in Automated Photointerpretation

DTIC Science & Technology

1989-10-01

weight based on how powerful the corresponding feature is for object recognition and discrimination. For example, consider an arbitrary weight, denoted...quality of the segmentation, how powerful the features and spatial constraints in the knowledge base are (as far as object recognition is concern...that are powerful for object recognition and discrimination. At this point, this selection is performed heuristically through trial-and-error. As a
Signal recognition and parameter estimation of BPSK-LFM combined modulation

NASA Astrophysics Data System (ADS)

Long, Chao; Zhang, Lin; Liu, Yu

2015-07-01

Intra-pulse analysis plays an important role in electronic warfare. Intra-pulse feature abstraction focuses on primary parameters such as instantaneous frequency, modulation, and symbol rate. In this paper, automatic modulation recognition and feature extraction for combined BPSK-LFM modulation signals based on decision theoretic approach is studied. The simulation results show good recognition effect and high estimation precision, and the system is easy to be realized.
Facial emotion recognition and borderline personality pathology.

PubMed

Meehan, Kevin B; De Panfilis, Chiara; Cain, Nicole M; Antonucci, Camilla; Soliani, Antonio; Clarkin, John F; Sambataro, Fabio

2017-09-01

The impact of borderline personality pathology on facial emotion recognition has been in dispute; with impaired, comparable, and enhanced accuracy found in high borderline personality groups. Discrepancies are likely driven by variations in facial emotion recognition tasks across studies (stimuli type/intensity) and heterogeneity in borderline personality pathology. This study evaluates facial emotion recognition for neutral and negative emotions (fear/sadness/disgust/anger) presented at varying intensities. Effortful control was evaluated as a moderator of facial emotion recognition in borderline personality. Non-clinical multicultural undergraduates (n = 132) completed a morphed facial emotion recognition task of neutral and negative emotional expressions across different intensities (100% Neutral; 25%/50%/75% Emotion) and self-reported borderline personality features and effortful control. Greater borderline personality features related to decreased accuracy in detecting neutral faces, but increased accuracy in detecting negative emotion faces, particularly at low-intensity thresholds. This pattern was moderated by effortful control; for individuals with low but not high effortful control, greater borderline personality features related to misattributions of emotion to neutral expressions, and enhanced detection of low-intensity emotional expressions. Individuals with high borderline personality features may therefore exhibit a bias toward detecting negative emotions that are not or barely present; however, good self-regulatory skills may protect against this potential social-cognitive vulnerability. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.
Particle Swarm Optimization Based Feature Enhancement and Feature Selection for Improved Emotion Recognition in Speech and Glottal Signals

PubMed Central

Muthusamy, Hariharan; Polat, Kemal; Yaacob, Sazali

2015-01-01

In the recent years, many research works have been published using speech related features for speech emotion recognition, however, recent studies show that there is a strong correlation between emotional states and glottal features. In this work, Mel-frequency cepstralcoefficients (MFCCs), linear predictive cepstral coefficients (LPCCs), perceptual linear predictive (PLP) features, gammatone filter outputs, timbral texture features, stationary wavelet transform based timbral texture features and relative wavelet packet energy and entropy features were extracted from the emotional speech (ES) signals and its glottal waveforms(GW). Particle swarm optimization based clustering (PSOC) and wrapper based particle swarm optimization (WPSO) were proposed to enhance the discerning ability of the features and to select the discriminating features respectively. Three different emotional speech databases were utilized to gauge the proposed method. Extreme learning machine (ELM) was employed to classify the different types of emotions. Different experiments were conducted and the results show that the proposed method significantly improves the speech emotion recognition performance compared to previous works published in the literature. PMID:25799141
A method of 3D object recognition and localization in a cloud of points

NASA Astrophysics Data System (ADS)

Bielicki, Jerzy; Sitnik, Robert

2013-12-01

The proposed method given in this article is prepared for analysis of data in the form of cloud of points directly from 3D measurements. It is designed for use in the end-user applications that can directly be integrated with 3D scanning software. The method utilizes locally calculated feature vectors (FVs) in point cloud data. Recognition is based on comparison of the analyzed scene with reference object library. A global descriptor in the form of a set of spatially distributed FVs is created for each reference model. During the detection process, correlation of subsets of reference FVs with FVs calculated in the scene is computed. Features utilized in the algorithm are based on parameters, which qualitatively estimate mean and Gaussian curvatures. Replacement of differentiation with averaging in the curvatures estimation makes the algorithm more resistant to discontinuities and poor quality of the input data. Utilization of the FV subsets allows to detect partially occluded and cluttered objects in the scene, while additional spatial information maintains false positive rate at a reasonably low level.

Heuristic algorithm for optical character recognition of Arabic script

NASA Astrophysics Data System (ADS)

Yarman-Vural, Fatos T.; Atici, A.

1996-02-01

In this paper, a heuristic method is developed for segmentation, feature extraction and recognition of the Arabic script. The study is part of a large project for the transcription of the documents in Ottoman Archives. A geometrical and topological feature analysis method is developed for segmentation and feature extraction stages. Chain code transformation is applied to main strokes of the characters which are then classified by the hidden Markov model (HMM) in the recognition stage. Experimental results indicate that the performance of the proposed method is impressive, provided that the thinning process does not yield spurious branches.
Pose Invariant Face Recognition Based on Hybrid Dominant Frequency Features

NASA Astrophysics Data System (ADS)

Wijaya, I. Gede Pasek Suta; Uchimura, Keiichi; Hu, Zhencheng

Face recognition is one of the most active research areas in pattern recognition, not only because the face is a human biometric characteristics of human being but also because there are many potential applications of the face recognition which range from human-computer interactions to authentication, security, and surveillance. This paper presents an approach to pose invariant human face image recognition. The proposed scheme is based on the analysis of discrete cosine transforms (DCT) and discrete wavelet transforms (DWT) of face images. From both the DCT and DWT domain coefficients, which describe the facial information, we build compact and meaningful features vector, using simple statistical measures and quantization. This feature vector is called as the hybrid dominant frequency features. Then, we apply a combination of the L2 and Lq metric to classify the hybrid dominant frequency features to a person's class. The aim of the proposed system is to overcome the high memory space requirement, the high computational load, and the retraining problems of previous methods. The proposed system is tested using several face databases and the experimental results are compared to a well-known Eigenface method. The proposed method shows good performance, robustness, stability, and accuracy without requiring geometrical normalization. Furthermore, the purposed method has low computational cost, requires little memory space, and can overcome retraining problem.
Robust Tomato Recognition for Robotic Harvesting Using Feature Images Fusion

PubMed Central

Zhao, Yuanshen; Gong, Liang; Huang, Yixiang; Liu, Chengliang

2016-01-01

Automatic recognition of mature fruits in a complex agricultural environment is still a challenge for an autonomous harvesting robot due to various disturbances existing in the background of the image. The bottleneck to robust fruit recognition is reducing influence from two main disturbances: illumination and overlapping. In order to recognize the tomato in the tree canopy using a low-cost camera, a robust tomato recognition algorithm based on multiple feature images and image fusion was studied in this paper. Firstly, two novel feature images, the a*-component image and the I-component image, were extracted from the L*a*b* color space and luminance, in-phase, quadrature-phase (YIQ) color space, respectively. Secondly, wavelet transformation was adopted to fuse the two feature images at the pixel level, which combined the feature information of the two source images. Thirdly, in order to segment the target tomato from the background, an adaptive threshold algorithm was used to get the optimal threshold. The final segmentation result was processed by morphology operation to reduce a small amount of noise. In the detection tests, 93% target tomatoes were recognized out of 200 overall samples. It indicates that the proposed tomato recognition method is available for robotic tomato harvesting in the uncontrolled environment with low cost. PMID:26840313
SSVEP recognition using common feature analysis in brain-computer interface.

PubMed

Zhang, Yu; Zhou, Guoxu; Jin, Jing; Wang, Xingyu; Cichocki, Andrzej

2015-04-15

Canonical correlation analysis (CCA) has been successfully applied to steady-state visual evoked potential (SSVEP) recognition for brain-computer interface (BCI) application. Although the CCA method outperforms the traditional power spectral density analysis through multi-channel detection, it requires additionally pre-constructed reference signals of sine-cosine waves. It is likely to encounter overfitting in using a short time window since the reference signals include no features from training data. We consider that a group of electroencephalogram (EEG) data trials recorded at a certain stimulus frequency on a same subject should share some common features that may bear the real SSVEP characteristics. This study therefore proposes a common feature analysis (CFA)-based method to exploit the latent common features as natural reference signals in using correlation analysis for SSVEP recognition. Good performance of the CFA method for SSVEP recognition is validated with EEG data recorded from ten healthy subjects, in contrast to CCA and a multiway extension of CCA (MCCA). Experimental results indicate that the CFA method significantly outperformed the CCA and the MCCA methods for SSVEP recognition in using a short time window (i.e., less than 1s). The superiority of the proposed CFA method suggests it is promising for the development of a real-time SSVEP-based BCI. Copyright © 2014 Elsevier B.V. All rights reserved.
Short memory fuzzy fusion image recognition schema employing spatial and Fourier descriptors

NASA Astrophysics Data System (ADS)

Raptis, Sotiris N.; Tzafestas, Spyros G.

2001-03-01

Single images quite often do not bear enough information for precise interpretation due to a variety of reasons. Multiple image fusion and adequate integration recently became the state of the art in the pattern recognition field. In this paper presented here and enhanced multiple observation schema is discussed investigating improvements to the baseline fuzzy- probabilistic image fusion methodology. The first innovation introduced consists in considering only a limited but seemingly ore effective part of the uncertainty information obtained by a certain time restricting older uncertainty dependencies and alleviating computational burden that is now needed for short sequence (stored into memory) of samples. The second innovation essentially grouping them into feature-blind object hypotheses. Experiment settings include a sequence of independent views obtained by camera being moved around the investigated object.
Suspicious activity recognition in infrared imagery using Hidden Conditional Random Fields for outdoor perimeter surveillance

NASA Astrophysics Data System (ADS)

Rogotis, Savvas; Ioannidis, Dimosthenis; Tzovaras, Dimitrios; Likothanassis, Spiros

2015-04-01

The aim of this work is to present a novel approach for automatic recognition of suspicious activities in outdoor perimeter surveillance systems based on infrared video processing. Through the combination of size, speed and appearance based features, like the Center-Symmetric Local Binary Patterns, short-term actions are identified and serve as input, along with user location, for modeling target activities using the theory of Hidden Conditional Random Fields. HCRFs are used to directly link a set of observations to the most appropriate activity label and as such to discriminate high risk activities (e.g. trespassing) from zero risk activities (e.g loitering outside the perimeter). Experimental results demonstrate the effectiveness of our approach in identifying suspicious activities for video surveillance systems.
Fractal and twin SVM-based handgrip recognition for healthy subjects and trans-radial amputees using myoelectric signal.

PubMed

Arjunan, Sridhar Poosapadi; Kumar, Dinesh Kant; Jayadeva J

2016-02-01

Identifying functional handgrip patterns using surface electromygram (sEMG) signal recorded from amputee residual muscle is required for controlling the myoelectric prosthetic hand. In this study, we have computed the signal fractal dimension (FD) and maximum fractal length (MFL) during different grip patterns performed by healthy and transradial amputee subjects. The FD and MFL of the sEMG, referred to as the fractal features, were classified using twin support vector machines (TSVM) to recognize the handgrips. TSVM requires fewer support vectors, is suitable for data sets with unbalanced distributions, and can simultaneously be trained for improving both sensitivity and specificity. When compared with other methods, this technique resulted in improved grip recognition accuracy, sensitivity, and specificity, and this improvement was significant (κ=0.91).
Person Recognition System Based on a Combination of Body Images from Visible Light and Thermal Cameras.

PubMed

Nguyen, Dat Tien; Hong, Hyung Gil; Kim, Ki Wan; Park, Kang Ryoung

2017-03-16

The human body contains identity information that can be used for the person recognition (verification/recognition) problem. In this paper, we propose a person recognition method using the information extracted from body images. Our research is novel in the following three ways compared to previous studies. First, we use the images of human body for recognizing individuals. To overcome the limitations of previous studies on body-based person recognition that use only visible light images for recognition, we use human body images captured by two different kinds of camera, including a visible light camera and a thermal camera. The use of two different kinds of body image helps us to reduce the effects of noise, background, and variation in the appearance of a human body. Second, we apply a state-of-the art method, called convolutional neural network (CNN) among various available methods, for image features extraction in order to overcome the limitations of traditional hand-designed image feature extraction methods. Finally, with the extracted image features from body images, the recognition task is performed by measuring the distance between the input and enrolled samples. The experimental results show that the proposed method is efficient for enhancing recognition accuracy compared to systems that use only visible light or thermal images of the human body.
Robust and Effective Component-based Banknote Recognition for the Blind

PubMed Central

Hasanuzzaman, Faiz M.; Yang, Xiaodong; Tian, YingLi

2012-01-01

We develop a novel camera-based computer vision technology to automatically recognize banknotes for assisting visually impaired people. Our banknote recognition system is robust and effective with the following features: 1) high accuracy: high true recognition rate and low false recognition rate, 2) robustness: handles a variety of currency designs and bills in various conditions, 3) high efficiency: recognizes banknotes quickly, and 4) ease of use: helps blind users to aim the target for image capture. To make the system robust to a variety of conditions including occlusion, rotation, scaling, cluttered background, illumination change, viewpoint variation, and worn or wrinkled bills, we propose a component-based framework by using Speeded Up Robust Features (SURF). Furthermore, we employ the spatial relationship of matched SURF features to detect if there is a bill in the camera view. This process largely alleviates false recognition and can guide the user to correctly aim at the bill to be recognized. The robustness and generalizability of the proposed system is evaluated on a dataset including both positive images (with U.S. banknotes) and negative images (no U.S. banknotes) collected under a variety of conditions. The proposed algorithm, achieves 100% true recognition rate and 0% false recognition rate. Our banknote recognition system is also tested by blind users. PMID:22661884
CAFÉ-Map: Context Aware Feature Mapping for mining high dimensional biomedical data.

PubMed

Minhas, Fayyaz Ul Amir Afsar; Asif, Amina; Arif, Muhammad

2016-12-01

Feature selection and ranking is of great importance in the analysis of biomedical data. In addition to reducing the number of features used in classification or other machine learning tasks, it allows us to extract meaningful biological and medical information from a machine learning model. Most existing approaches in this domain do not directly model the fact that the relative importance of features can be different in different regions of the feature space. In this work, we present a context aware feature ranking algorithm called CAFÉ-Map. CAFÉ-Map is a locally linear feature ranking framework that allows recognition of important features in any given region of the feature space or for any individual example. This allows for simultaneous classification and feature ranking in an interpretable manner. We have benchmarked CAFÉ-Map on a number of toy and real world biomedical data sets. Our comparative study with a number of published methods shows that CAFÉ-Map achieves better accuracies on these data sets. The top ranking features obtained through CAFÉ-Map in a gene profiling study correlate very well with the importance of different genes reported in the literature. Furthermore, CAFÉ-Map provides a more in-depth analysis of feature ranking at the level of individual examples. CAFÉ-Map Python code is available at: http://faculty.pieas.edu.pk/fayyaz/software.html#cafemap . The CAFÉ-Map package supports parallelization and sparse data and provides example scripts for classification. This code can be used to reconstruct the results given in this paper. Copyright © 2016 Elsevier Ltd. All rights reserved.
Accurate forced-choice recognition without awareness of memory retrieval.

PubMed

Voss, Joel L; Baym, Carol L; Paller, Ken A

2008-06-01

Recognition confidence and the explicit awareness of memory retrieval commonly accompany accurate responding in recognition tests. Memory performance in recognition tests is widely assumed to measure explicit memory, but the generality of this assumption is questionable. Indeed, whether recognition in nonhumans is always supported by explicit memory is highly controversial. Here we identified circumstances wherein highly accurate recognition was unaccompanied by hallmark features of explicit memory. When memory for kaleidoscopes was tested using a two-alternative forced-choice recognition test with similar foils, recognition was enhanced by an attentional manipulation at encoding known to degrade explicit memory. Moreover, explicit recognition was most accurate when the awareness of retrieval was absent. These dissociations between accuracy and phenomenological features of explicit memory are consistent with the notion that correct responding resulted from experience-dependent enhancements of perceptual fluency with specific stimuli--the putative mechanism for perceptual priming effects in implicit memory tests. This mechanism may contribute to recognition performance in a variety of frequently-employed testing circumstances. Our results thus argue for a novel view of recognition, in that analyses of its neurocognitive foundations must take into account the potential for both (1) recognition mechanisms allied with implicit memory and (2) recognition mechanisms allied with explicit memory.
Double-Windows-Based Motion Recognition in Multi-Floor Buildings Assisted by a Built-In Barometer.

PubMed

Liu, Maolin; Li, Huaiyu; Wang, Yuan; Li, Fei; Chen, Xiuwan

2018-04-01

Accelerometers, gyroscopes and magnetometers in smartphones are often used to recognize human motions. Since it is difficult to distinguish between vertical motions and horizontal motions in the data provided by these built-in sensors, the vertical motion recognition accuracy is relatively low. The emergence of a built-in barometer in smartphones improves the accuracy of motion recognition in the vertical direction. However, there is a lack of quantitative analysis and modelling of the barometer signals, which is the basis of barometer's application to motion recognition, and a problem of imbalanced data also exists. This work focuses on using the barometers inside smartphones for vertical motion recognition in multi-floor buildings through modelling and feature extraction of pressure signals. A novel double-windows pressure feature extraction method, which adopts two sliding time windows of different length, is proposed to balance recognition accuracy and response time. Then, a random forest classifier correlation rule is further designed to weaken the impact of imbalanced data on recognition accuracy. The results demonstrate that the recognition accuracy can reach 95.05% when pressure features and the improved random forest classifier are adopted. Specifically, the recognition accuracy of the stair and elevator motions is significantly improved with enhanced response time. The proposed approach proves effective and accurate, providing a robust strategy for increasing accuracy of vertical motions.
A natural approach to convey numerical digits using hand activity recognition based on hand shape features

NASA Astrophysics Data System (ADS)

Chidananda, H.; Reddy, T. Hanumantha

2017-06-01

This paper presents a natural representation of numerical digit(s) using hand activity analysis based on number of fingers out stretched for each numerical digit in sequence extracted from a video. The analysis is based on determining a set of six features from a hand image. The most important features used from each frame in a video are the first fingertip from top, palm-line, palm-center, valley points between the fingers exists above the palm-line. Using this work user can convey any number of numerical digits using right or left or both the hands naturally in a video. Each numerical digit ranges from 0 to9. Hands (right/left/both) used to convey digits can be recognized accurately using the valley points and with this recognition whether the user is a right / left handed person in practice can be analyzed. In this work, first the hand(s) and face parts are detected by using YCbCr color space and face part is removed by using ellipse based method. Then, the hand(s) are analyzed to recognize the activity that represents a series of numerical digits in a video. This work uses pixel continuity algorithm using 2D coordinate geometry system and does not use regular use of calculus, contours, convex hull and datasets.
An evaluation of open set recognition for FLIR images

NASA Astrophysics Data System (ADS)

Scherreik, Matthew; Rigling, Brian

2015-05-01

Typical supervised classification algorithms label inputs according to what was learned in a training phase. Thus, test inputs that were not seen in training are always given incorrect labels. Open set recognition algorithms address this issue by accounting for inputs that are not present in training and providing the classifier with an option to reject" unknown samples. A number of such techniques have been developed in the literature, many of which are based on support vector machines (SVMs). One approach, the 1-vs-set machine, constructs a slab" in feature space using the SVM hyperplane. Inputs falling on one side of the slab or within the slab belong to a training class, while inputs falling on the far side of the slab are rejected. We note that rejection of unknown inputs can be achieved by thresholding class posterior probabilities. Another recently developed approach, the Probabilistic Open Set SVM (POS-SVM), empirically determines good probability thresholds. We apply the 1-vs-set machine, POS-SVM, and closed set SVMs to FLIR images taken from the Comanche SIG dataset. Vehicles in the dataset are divided into three general classes: wheeled, armored personnel carrier (APC), and tank. For each class, a coarse pose estimate (front, rear, left, right) is taken. In a closed set sense, we analyze these algorithms for prediction of vehicle class and pose. To test open set performance, one or more vehicle classes are held out from training. By considering closed and open set performance separately, we may closely analyze both inter-class discrimination and threshold effectiveness.
Supervised neural network classification of pre-sliced cooked pork ham images using quaternionic singular values.

PubMed

Valous, Nektarios A; Mendoza, Fernando; Sun, Da-Wen; Allen, Paul

2010-03-01

The quaternionic singular value decomposition is a technique to decompose a quaternion matrix (representation of a colour image) into quaternion singular vector and singular value component matrices exposing useful properties. The objective of this study was to use a small portion of uncorrelated singular values, as robust features for the classification of sliced pork ham images, using a supervised artificial neural network classifier. Images were acquired from four qualities of sliced cooked pork ham typically consumed in Ireland (90 slices per quality), having similar appearances. Mahalanobis distances and Pearson product moment correlations were used for feature selection. Six highly discriminating features were used as input to train the neural network. An adaptive feedforward multilayer perceptron classifier was employed to obtain a suitable mapping from the input dataset. The overall correct classification performance for the training, validation and test set were 90.3%, 94.4%, and 86.1%, respectively. The results confirm that the classification performance was satisfactory. Extracting the most informative features led to the recognition of a set of different but visually quite similar textural patterns based on quaternionic singular values. Copyright 2009 Elsevier Ltd. All rights reserved.
Maximal likelihood correspondence estimation for face recognition across pose.

PubMed

Li, Shaoxin; Liu, Xin; Chai, Xiujuan; Zhang, Haihong; Lao, Shihong; Shan, Shiguang

2014-10-01

Due to the misalignment of image features, the performance of many conventional face recognition methods degrades considerably in across pose scenario. To address this problem, many image matching-based methods are proposed to estimate semantic correspondence between faces in different poses. In this paper, we aim to solve two critical problems in previous image matching-based correspondence learning methods: 1) fail to fully exploit face specific structure information in correspondence estimation and 2) fail to learn personalized correspondence for each probe image. To this end, we first build a model, termed as morphable displacement field (MDF), to encode face specific structure information of semantic correspondence from a set of real samples of correspondences calculated from 3D face models. Then, we propose a maximal likelihood correspondence estimation (MLCE) method to learn personalized correspondence based on maximal likelihood frontal face assumption. After obtaining the semantic correspondence encoded in the learned displacement, we can synthesize virtual frontal images of the profile faces for subsequent recognition. Using linear discriminant analysis method with pixel-intensity features, state-of-the-art performance is achieved on three multipose benchmarks, i.e., CMU-PIE, FERET, and MultiPIE databases. Owe to the rational MDF regularization and the usage of novel maximal likelihood objective, the proposed MLCE method can reliably learn correspondence between faces in different poses even in complex wild environment, i.e., labeled face in the wild database.
Composite Wavelet Filters for Enhanced Automated Target Recognition

NASA Technical Reports Server (NTRS)

Chiang, Jeffrey N.; Zhang, Yuhan; Lu, Thomas T.; Chao, Tien-Hsin

2012-01-01

Automated Target Recognition (ATR) systems aim to automate target detection, recognition, and tracking. The current project applies a JPL ATR system to low-resolution sonar and camera videos taken from unmanned vehicles. These sonar images are inherently noisy and difficult to interpret, and pictures taken underwater are unreliable due to murkiness and inconsistent lighting. The ATR system breaks target recognition into three stages: 1) Videos of both sonar and camera footage are broken into frames and preprocessed to enhance images and detect Regions of Interest (ROIs). 2) Features are extracted from these ROIs in preparation for classification. 3) ROIs are classified as true or false positives using a standard Neural Network based on the extracted features. Several preprocessing, feature extraction, and training methods are tested and discussed in this paper.
Younger and Older Users’ Recognition of Virtual Agent Facial Expressions

PubMed Central

Beer, Jenay M.; Smarr, Cory-Ann; Fisk, Arthur D.; Rogers, Wendy A.

2015-01-01

As technology advances, robots and virtual agents will be introduced into the home and healthcare settings to assist individuals, both young and old, with everyday living tasks. Understanding how users recognize an agent’s social cues is therefore imperative, especially in social interactions. Facial expression, in particular, is one of the most common non-verbal cues used to display and communicate emotion in on-screen agents (Cassell, Sullivan, Prevost, & Churchill, 2000). Age is important to consider because age-related differences in emotion recognition of human facial expression have been supported (Ruffman et al., 2008), with older adults showing a deficit for recognition of negative facial expressions. Previous work has shown that younger adults can effectively recognize facial emotions displayed by agents (Bartneck & Reichenbach, 2005; Courgeon et al. 2009; 2011; Breazeal, 2003); however, little research has compared in-depth younger and older adults’ ability to label a virtual agent’s facial emotions, an import consideration because social agents will be required to interact with users of varying ages. If such age-related differences exist for recognition of virtual agent facial expressions, we aim to understand if those age-related differences are influenced by the intensity of the emotion, dynamic formation of emotion (i.e., a neutral expression developing into an expression of emotion through motion), or the type of virtual character differing by human-likeness. Study 1 investigated the relationship between age-related differences, the implication of dynamic formation of emotion, and the role of emotion intensity in emotion recognition of the facial expressions of a virtual agent (iCat). Study 2 examined age-related differences in recognition expressed by three types of virtual characters differing by human-likeness (non-humanoid iCat, synthetic human, and human). Study 2 also investigated the role of configural and featural processing as a possible explanation for age-related differences in emotion recognition. First, our findings show age-related differences in the recognition of emotions expressed by a virtual agent, with older adults showing lower recognition for the emotions of anger, disgust, fear, happiness, sadness, and neutral. These age-related difference might be explained by older adults having difficulty discriminating similarity in configural arrangement of facial features for certain emotions; for example, older adults often mislabeled the similar emotions of fear as surprise. Second, our results did not provide evidence for the dynamic formation improving emotion recognition; but, in general, the intensity of the emotion improved recognition. Lastly, we learned that emotion recognition, for older and younger adults, differed by character type, from best to worst: human, synthetic human, and then iCat. Our findings provide guidance for design, as well as the development of a framework of age-related differences in emotion recognition. PMID:25705105
Contact-free palm-vein recognition based on local invariant features.

PubMed

Kang, Wenxiong; Liu, Yang; Wu, Qiuxia; Yue, Xishun

2014-01-01

Contact-free palm-vein recognition is one of the most challenging and promising areas in hand biometrics. In view of the existing problems in contact-free palm-vein imaging, including projection transformation, uneven illumination and difficulty in extracting exact ROIs, this paper presents a novel recognition approach for contact-free palm-vein recognition that performs feature extraction and matching on all vein textures distributed over the palm surface, including finger veins and palm veins, to minimize the loss of feature information. First, a hierarchical enhancement algorithm, which combines a DOG filter and histogram equalization, is adopted to alleviate uneven illumination and to highlight vein textures. Second, RootSIFT, a more stable local invariant feature extraction method in comparison to SIFT, is adopted to overcome the projection transformation in contact-free mode. Subsequently, a novel hierarchical mismatching removal algorithm based on neighborhood searching and LBP histograms is adopted to improve the accuracy of feature matching. Finally, we rigorously evaluated the proposed approach using two different databases and obtained 0.996% and 3.112% Equal Error Rates (EERs), respectively, which demonstrate the effectiveness of the proposed approach.
An online handwriting recognition system for Turkish

NASA Astrophysics Data System (ADS)

Vural, Esra; Erdogan, Hakan; Oflazer, Kemal; Yanikoglu, Berrin A.

2004-12-01

Despite recent developments in Tablet PC technology, there has not been any applications for recognizing handwritings in Turkish. In this paper, we present an online handwritten text recognition system for Turkish, developed using the Tablet PC interface. However, even though the system is developed for Turkish, the addressed issues are common to online handwriting recognition systems in general. Several dynamic features are extracted from the handwriting data for each recorded point and Hidden Markov Models (HMM) are used to train letter and word models. We experimented with using various features and HMM model topologies, and report on the effects of these experiments. We started with first and second derivatives of the x and y coordinates and relative change in the pen pressure as initial features. We found that using two more additional features, that is, number of neighboring points and relative heights of each point with respect to the base-line improve the recognition rate. In addition, extracting features within strokes and using a skipping state topology improve the system performance as well. The improved system performance is 94% in recognizing handwritten words from a 1000-word lexicon.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.