Application of image recognition-based automatic hyphae detection in fungal keratitis.
Wu, Xuelian; Tao, Yuan; Qiu, Qingchen; Wu, Xinyi
2018-03-01
The purpose of this study is to evaluate the accuracy of two methods in diagnosis of fungal keratitis, whereby one method is automatic hyphae detection based on images recognition and the other method is corneal smear. We evaluate the sensitivity and specificity of the method in diagnosis of fungal keratitis, which is automatic hyphae detection based on image recognition. We analyze the consistency of clinical symptoms and the density of hyphae, and perform quantification using the method of automatic hyphae detection based on image recognition. In our study, 56 cases with fungal keratitis (just single eye) and 23 cases with bacterial keratitis were included. All cases underwent the routine inspection of slit lamp biomicroscopy, corneal smear examination, microorganism culture and the assessment of in vivo confocal microscopy images before starting medical treatment. Then, we recognize the hyphae images of in vivo confocal microscopy by using automatic hyphae detection based on image recognition to evaluate its sensitivity and specificity and compare with the method of corneal smear. The next step is to use the index of density to assess the severity of infection, and then find the correlation with the patients' clinical symptoms and evaluate consistency between them. The accuracy of this technology was superior to corneal smear examination (p < 0.05). The sensitivity of the technology of automatic hyphae detection of image recognition was 89.29%, and the specificity was 95.65%. The area under the ROC curve was 0.946. The correlation coefficient between the grading of the severity in the fungal keratitis by the automatic hyphae detection based on image recognition and the clinical grading is 0.87. The technology of automatic hyphae detection based on image recognition was with high sensitivity and specificity, able to identify fungal keratitis, which is better than the method of corneal smear examination. This technology has the advantages when compared with the conventional artificial identification of confocal microscope corneal images, of being accurate, stable and does not rely on human expertise. It was the most useful to the medical experts who are not familiar with fungal keratitis. The technology of automatic hyphae detection based on image recognition can quantify the hyphae density and grade this property. Being noninvasive, it can provide an evaluation criterion to fungal keratitis in a timely, accurate, objective and quantitative manner.
Target recognition based on convolutional neural network
NASA Astrophysics Data System (ADS)
Wang, Liqiang; Wang, Xin; Xi, Fubiao; Dong, Jian
2017-11-01
One of the important part of object target recognition is the feature extraction, which can be classified into feature extraction and automatic feature extraction. The traditional neural network is one of the automatic feature extraction methods, while it causes high possibility of over-fitting due to the global connection. The deep learning algorithm used in this paper is a hierarchical automatic feature extraction method, trained with the layer-by-layer convolutional neural network (CNN), which can extract the features from lower layers to higher layers. The features are more discriminative and it is beneficial to the object target recognition.
Practical automatic Arabic license plate recognition system
NASA Astrophysics Data System (ADS)
Mohammad, Khader; Agaian, Sos; Saleh, Hani
2011-02-01
Since 1970's, the need of an automatic license plate recognition system, sometimes referred as Automatic License Plate Recognition system, has been increasing. A license plate recognition system is an automatic system that is able to recognize a license plate number, extracted from image sensors. In specific, Automatic License Plate Recognition systems are being used in conjunction with various transportation systems in application areas such as law enforcement (e.g. speed limit enforcement) and commercial usages such as parking enforcement and automatic toll payment private and public entrances, border control, theft and vandalism control. Vehicle license plate recognition has been intensively studied in many countries. Due to the different types of license plates being used, the requirement of an automatic license plate recognition system is different for each country. [License plate detection using cluster run length smoothing algorithm ].Generally, an automatic license plate localization and recognition system is made up of three modules; license plate localization, character segmentation and optical character recognition modules. This paper presents an Arabic license plate recognition system that is insensitive to character size, font, shape and orientation with extremely high accuracy rate. The proposed system is based on a combination of enhancement, license plate localization, morphological processing, and feature vector extraction using the Haar transform. The performance of the system is fast due to classification of alphabet and numerals based on the license plate organization. Experimental results for license plates of two different Arab countries show an average of 99 % successful license plate localization and recognition in a total of more than 20 different images captured from a complex outdoor environment. The results run times takes less time compared to conventional and many states of art methods.
Automatic face recognition in HDR imaging
NASA Astrophysics Data System (ADS)
Pereira, Manuela; Moreno, Juan-Carlos; Proença, Hugo; Pinheiro, António M. G.
2014-05-01
The gaining popularity of the new High Dynamic Range (HDR) imaging systems is raising new privacy issues caused by the methods used for visualization. HDR images require tone mapping methods for an appropriate visualization on conventional and non-expensive LDR displays. These visualization methods might result in completely different visualization raising several issues on privacy intrusion. In fact, some visualization methods result in a perceptual recognition of the individuals, while others do not even show any identity. Although perceptual recognition might be possible, a natural question that can rise is how computer based recognition will perform using tone mapping generated images? In this paper, a study where automatic face recognition using sparse representation is tested with images that result from common tone mapping operators applied to HDR images. Its ability for the face identity recognition is described. Furthermore, typical LDR images are used for the face recognition training.
Offline Arabic handwriting recognition: a survey.
Lorigo, Liana M; Govindaraju, Venu
2006-05-01
The automatic recognition of text on scanned images has enabled many applications such as searching for words in large volumes of documents, automatic sorting of postal mail, and convenient editing of previously printed documents. The domain of handwriting in the Arabic script presents unique technical challenges and has been addressed more recently than other domains. Many different methods have been proposed and applied to various types of images. This paper provides a comprehensive review of these methods. It is the first survey to focus on Arabic handwriting recognition and the first Arabic character recognition survey to provide recognition rates and descriptions of test data for the approaches discussed. It includes background on the field, discussion of the methods, and future research directions.
Application of automatic threshold in dynamic target recognition with low contrast
NASA Astrophysics Data System (ADS)
Miao, Hua; Guo, Xiaoming; Chen, Yu
2014-11-01
Hybrid photoelectric joint transform correlator can realize automatic real-time recognition with high precision through the combination of optical devices and electronic devices. When recognizing targets with low contrast using photoelectric joint transform correlator, because of the difference of attitude, brightness and grayscale between target and template, only four to five frames of dynamic targets can be recognized without any processing. CCD camera is used to capture the dynamic target images and the capturing speed of CCD is 25 frames per second. Automatic threshold has many advantages like fast processing speed, effectively shielding noise interference, enhancing diffraction energy of useful information and better reserving outline of target and template, so this method plays a very important role in target recognition with optical correlation method. However, the automatic obtained threshold by program can not achieve the best recognition results for dynamic targets. The reason is that outline information is broken to some extent. Optimal threshold is obtained by manual intervention in most cases. Aiming at the characteristics of dynamic targets, the processing program of improved automatic threshold is finished by multiplying OTSU threshold of target and template by scale coefficient of the processed image, and combining with mathematical morphology. The optimal threshold can be achieved automatically by improved automatic threshold processing for dynamic low contrast target images. The recognition rate of dynamic targets is improved through decreased background noise effect and increased correlation information. A series of dynamic tank images with the speed about 70 km/h are adapted as target images. The 1st frame of this series of tanks can correlate only with the 3rd frame without any processing. Through OTSU threshold, the 80th frame can be recognized. By automatic threshold processing of the joint images, this number can be increased to 89 frames. Experimental results show that the improved automatic threshold processing has special application value for the recognition of dynamic target with low contrast.
Statistical Evaluation of Biometric Evidence in Forensic Automatic Speaker Recognition
NASA Astrophysics Data System (ADS)
Drygajlo, Andrzej
Forensic speaker recognition is the process of determining if a specific individual (suspected speaker) is the source of a questioned voice recording (trace). This paper aims at presenting forensic automatic speaker recognition (FASR) methods that provide a coherent way of quantifying and presenting recorded voice as biometric evidence. In such methods, the biometric evidence consists of the quantified degree of similarity between speaker-dependent features extracted from the trace and speaker-dependent features extracted from recorded speech of a suspect. The interpretation of recorded voice as evidence in the forensic context presents particular challenges, including within-speaker (within-source) variability and between-speakers (between-sources) variability. Consequently, FASR methods must provide a statistical evaluation which gives the court an indication of the strength of the evidence given the estimated within-source and between-sources variabilities. This paper reports on the first ENFSI evaluation campaign through a fake case, organized by the Netherlands Forensic Institute (NFI), as an example, where an automatic method using the Gaussian mixture models (GMMs) and the Bayesian interpretation (BI) framework were implemented for the forensic speaker recognition task.
Automatic concept extraction from spoken medical reports.
Happe, André; Pouliquen, Bruno; Burgun, Anita; Cuggia, Marc; Le Beux, Pierre
2003-07-01
The objective of this project is to investigate methods whereby a combination of speech recognition and automated indexing methods substitute for current transcription and indexing practices. We based our study on existing speech recognition software programs and on NOMINDEX, a tool that extracts MeSH concepts from medical text in natural language and that is mainly based on a French medical lexicon and on the UMLS. For each document, the process consists of three steps: (1) dictation and digital audio recording, (2) speech recognition, (3) automatic indexing. The evaluation consisted of a comparison between the set of concepts extracted by NOMINDEX after the speech recognition phase and the set of keywords manually extracted from the initial document. The method was evaluated on a set of 28 patient discharge summaries extracted from the MENELAS corpus in French, corresponding to in-patients admitted for coronarography. The overall precision was 73% and the overall recall was 90%. Indexing errors were mainly due to word sense ambiguity and abbreviations. A specific issue was the fact that the standard French translation of MeSH terms lacks diacritics. A preliminary evaluation of speech recognition tools showed that the rate of accurate recognition was higher than 98%. Only 3% of the indexing errors were generated by inadequate speech recognition. We discuss several areas to focus on to improve this prototype. However, the very low rate of indexing errors due to speech recognition errors highlights the potential benefits of combining speech recognition techniques and automatic indexing.
Cross spectral, active and passive approach to face recognition for improved performance
NASA Astrophysics Data System (ADS)
Grudzien, A.; Kowalski, M.; Szustakowski, M.
2017-08-01
Biometrics is a technique for automatic recognition of a person based on physiological or behavior characteristics. Since the characteristics used are unique, biometrics can create a direct link between a person and identity, based on variety of characteristics. The human face is one of the most important biometric modalities for automatic authentication. The most popular method of face recognition which relies on processing of visual information seems to be imperfect. Thermal infrared imagery may be a promising alternative or complement to visible range imaging due to its several reasons. This paper presents an approach of combining both methods.
Ben Younes, Lassad; Nakajima, Yoshikazu; Saito, Toki
2014-03-01
Femur segmentation is well established and widely used in computer-assisted orthopedic surgery. However, most of the robust segmentation methods such as statistical shape models (SSM) require human intervention to provide an initial position for the SSM. In this paper, we propose to overcome this problem and provide a fully automatic femur segmentation method for CT images based on primitive shape recognition and SSM. Femur segmentation in CT scans was performed using primitive shape recognition based on a robust algorithm such as the Hough transform and RANdom SAmple Consensus. The proposed method is divided into 3 steps: (1) detection of the femoral head as sphere and the femoral shaft as cylinder in the SSM and the CT images, (2) rigid registration between primitives of SSM and CT image to initialize the SSM into the CT image, and (3) fitting of the SSM to the CT image edge using an affine transformation followed by a nonlinear fitting. The automated method provided good results even with a high number of outliers. The difference of segmentation error between the proposed automatic initialization method and a manual initialization method is less than 1 mm. The proposed method detects primitive shape position to initialize the SSM into the target image. Based on primitive shapes, this method overcomes the problem of inter-patient variability. Moreover, the results demonstrate that our method of primitive shape recognition can be used for 3D SSM initialization to achieve fully automatic segmentation of the femur.
Parametric Representation of the Speaker's Lips for Multimodal Sign Language and Speech Recognition
NASA Astrophysics Data System (ADS)
Ryumin, D.; Karpov, A. A.
2017-05-01
In this article, we propose a new method for parametric representation of human's lips region. The functional diagram of the method is described and implementation details with the explanation of its key stages and features are given. The results of automatic detection of the regions of interest are illustrated. A speed of the method work using several computers with different performances is reported. This universal method allows applying parametrical representation of the speaker's lipsfor the tasks of biometrics, computer vision, machine learning, and automatic recognition of face, elements of sign languages, and audio-visual speech, including lip-reading.
Automatic Recognition of Fetal Facial Standard Plane in Ultrasound Image via Fisher Vector.
Lei, Baiying; Tan, Ee-Leng; Chen, Siping; Zhuo, Liu; Li, Shengli; Ni, Dong; Wang, Tianfu
2015-01-01
Acquisition of the standard plane is the prerequisite of biometric measurement and diagnosis during the ultrasound (US) examination. In this paper, a new algorithm is developed for the automatic recognition of the fetal facial standard planes (FFSPs) such as the axial, coronal, and sagittal planes. Specifically, densely sampled root scale invariant feature transform (RootSIFT) features are extracted and then encoded by Fisher vector (FV). The Fisher network with multi-layer design is also developed to extract spatial information to boost the classification performance. Finally, automatic recognition of the FFSPs is implemented by support vector machine (SVM) classifier based on the stochastic dual coordinate ascent (SDCA) algorithm. Experimental results using our dataset demonstrate that the proposed method achieves an accuracy of 93.27% and a mean average precision (mAP) of 99.19% in recognizing different FFSPs. Furthermore, the comparative analyses reveal the superiority of the proposed method based on FV over the traditional methods.
Automatic event recognition and anomaly detection with attribute grammar by learning scene semantics
NASA Astrophysics Data System (ADS)
Qi, Lin; Yao, Zhenyu; Li, Li; Dong, Junyu
2007-11-01
In this paper we present a novel framework for automatic event recognition and abnormal behavior detection with attribute grammar by learning scene semantics. This framework combines learning scene semantics by trajectory analysis and constructing attribute grammar-based event representation. The scene and event information is learned automatically. Abnormal behaviors that disobey scene semantics or event grammars rules are detected. By this method, an approach to understanding video scenes is achieved. Further more, with this prior knowledge, the accuracy of abnormal event detection is increased.
Automatic lip reading by using multimodal visual features
NASA Astrophysics Data System (ADS)
Takahashi, Shohei; Ohya, Jun
2013-12-01
Since long time ago, speech recognition has been researched, though it does not work well in noisy places such as in the car or in the train. In addition, people with hearing-impaired or difficulties in hearing cannot receive benefits from speech recognition. To recognize the speech automatically, visual information is also important. People understand speeches from not only audio information, but also visual information such as temporal changes in the lip shape. A vision based speech recognition method could work well in noisy places, and could be useful also for people with hearing disabilities. In this paper, we propose an automatic lip-reading method for recognizing the speech by using multimodal visual information without using any audio information such as speech recognition. First, the ASM (Active Shape Model) is used to track and detect the face and lip in a video sequence. Second, the shape, optical flow and spatial frequencies of the lip features are extracted from the lip detected by ASM. Next, the extracted multimodal features are ordered chronologically so that Support Vector Machine is performed in order to learn and classify the spoken words. Experiments for classifying several words show promising results of this proposed method.
Speech recognition for embedded automatic positioner for laparoscope
NASA Astrophysics Data System (ADS)
Chen, Xiaodong; Yin, Qingyun; Wang, Yi; Yu, Daoyin
2014-07-01
In this paper a novel speech recognition methodology based on Hidden Markov Model (HMM) is proposed for embedded Automatic Positioner for Laparoscope (APL), which includes a fixed point ARM processor as the core. The APL system is designed to assist the doctor in laparoscopic surgery, by implementing the specific doctor's vocal control to the laparoscope. Real-time respond to the voice commands asks for more efficient speech recognition algorithm for the APL. In order to reduce computation cost without significant loss in recognition accuracy, both arithmetic and algorithmic optimizations are applied in the method presented. First, depending on arithmetic optimizations most, a fixed point frontend for speech feature analysis is built according to the ARM processor's character. Then the fast likelihood computation algorithm is used to reduce computational complexity of the HMM-based recognition algorithm. The experimental results show that, the method shortens the recognition time within 0.5s, while the accuracy higher than 99%, demonstrating its ability to achieve real-time vocal control to the APL.
Automatic Facial Expression Recognition and Operator Functional State
NASA Technical Reports Server (NTRS)
Blanson, Nina
2012-01-01
The prevalence of human error in safety-critical occupations remains a major challenge to mission success despite increasing automation in control processes. Although various methods have been proposed to prevent incidences of human error, none of these have been developed to employ the detection and regulation of Operator Functional State (OFS), or the optimal condition of the operator while performing a task, in work environments due to drawbacks such as obtrusiveness and impracticality. A video-based system with the ability to infer an individual's emotional state from facial feature patterning mitigates some of the problems associated with other methods of detecting OFS, like obtrusiveness and impracticality in integration with the mission environment. This paper explores the utility of facial expression recognition as a technology for inferring OFS by first expounding on the intricacies of OFS and the scientific background behind emotion and its relationship with an individual's state. Then, descriptions of the feedback loop and the emotion protocols proposed for the facial recognition program are explained. A basic version of the facial expression recognition program uses Haar classifiers and OpenCV libraries to automatically locate key facial landmarks during a live video stream. Various methods of creating facial expression recognition software are reviewed to guide future extensions of the program. The paper concludes with an examination of the steps necessary in the research of emotion and recommendations for the creation of an automatic facial expression recognition program for use in real-time, safety-critical missions
Automatic Facial Expression Recognition and Operator Functional State
NASA Technical Reports Server (NTRS)
Blanson, Nina
2011-01-01
The prevalence of human error in safety-critical occupations remains a major challenge to mission success despite increasing automation in control processes. Although various methods have been proposed to prevent incidences of human error, none of these have been developed to employ the detection and regulation of Operator Functional State (OFS), or the optimal condition of the operator while performing a task, in work environments due to drawbacks such as obtrusiveness and impracticality. A video-based system with the ability to infer an individual's emotional state from facial feature patterning mitigates some of the problems associated with other methods of detecting OFS, like obtrusiveness and impracticality in integration with the mission environment. This paper explores the utility of facial expression recognition as a technology for inferring OFS by first expounding on the intricacies of OFS and the scientific background behind emotion and its relationship with an individual's state. Then, descriptions of the feedback loop and the emotion protocols proposed for the facial recognition program are explained. A basic version of the facial expression recognition program uses Haar classifiers and OpenCV libraries to automatically locate key facial landmarks during a live video stream. Various methods of creating facial expression recognition software are reviewed to guide future extensions of the program. The paper concludes with an examination of the steps necessary in the research of emotion and recommendations for the creation of an automatic facial expression recognition program for use in real-time, safety-critical missions.
3D automatic anatomy recognition based on iterative graph-cut-ASM
NASA Astrophysics Data System (ADS)
Chen, Xinjian; Udupa, Jayaram K.; Bagci, Ulas; Alavi, Abass; Torigian, Drew A.
2010-02-01
We call the computerized assistive process of recognizing, delineating, and quantifying organs and tissue regions in medical imaging, occurring automatically during clinical image interpretation, automatic anatomy recognition (AAR). The AAR system we are developing includes five main parts: model building, object recognition, object delineation, pathology detection, and organ system quantification. In this paper, we focus on the delineation part. For the modeling part, we employ the active shape model (ASM) strategy. For recognition and delineation, we integrate several hybrid strategies of combining purely image based methods with ASM. In this paper, an iterative Graph-Cut ASM (IGCASM) method is proposed for object delineation. An algorithm called GC-ASM was presented at this symposium last year for object delineation in 2D images which attempted to combine synergistically ASM and GC. Here, we extend this method to 3D medical image delineation. The IGCASM method effectively combines the rich statistical shape information embodied in ASM with the globally optimal delineation capability of the GC method. We propose a new GC cost function, which effectively integrates the specific image information with the ASM shape model information. The proposed methods are tested on a clinical abdominal CT data set. The preliminary results show that: (a) it is feasible to explicitly bring prior 3D statistical shape information into the GC framework; (b) the 3D IGCASM delineation method improves on ASM and GC and can provide practical operational time on clinical images.
Han, Guanghui; Liu, Xiabi; Zheng, Guangyuan; Wang, Murong; Huang, Shan
2018-06-06
Ground-glass opacity (GGO) is a common CT imaging sign on high-resolution CT, which means the lesion is more likely to be malignant compared to common solid lung nodules. The automatic recognition of GGO CT imaging signs is of great importance for early diagnosis and possible cure of lung cancers. The present GGO recognition methods employ traditional low-level features and system performance improves slowly. Considering the high-performance of CNN model in computer vision field, we proposed an automatic recognition method of 3D GGO CT imaging signs through the fusion of hybrid resampling and layer-wise fine-tuning CNN models in this paper. Our hybrid resampling is performed on multi-views and multi-receptive fields, which reduces the risk of missing small or large GGOs by adopting representative sampling panels and processing GGOs with multiple scales simultaneously. The layer-wise fine-tuning strategy has the ability to obtain the optimal fine-tuning model. Multi-CNN models fusion strategy obtains better performance than any single trained model. We evaluated our method on the GGO nodule samples in publicly available LIDC-IDRI dataset of chest CT scans. The experimental results show that our method yields excellent results with 96.64% sensitivity, 71.43% specificity, and 0.83 F1 score. Our method is a promising approach to apply deep learning method to computer-aided analysis of specific CT imaging signs with insufficient labeled images. Graphical abstract We proposed an automatic recognition method of 3D GGO CT imaging signs through the fusion of hybrid resampling and layer-wise fine-tuning CNN models in this paper. Our hybrid resampling reduces the risk of missing small or large GGOs by adopting representative sampling panels and processing GGOs with multiple scales simultaneously. The layer-wise fine-tuning strategy has ability to obtain the optimal fine-tuning model. Our method is a promising approach to apply deep learning method to computer-aided analysis of specific CT imaging signs with insufficient labeled images.
Container-code recognition system based on computer vision and deep neural networks
NASA Astrophysics Data System (ADS)
Liu, Yi; Li, Tianjian; Jiang, Li; Liang, Xiaoyao
2018-04-01
Automatic container-code recognition system becomes a crucial requirement for ship transportation industry in recent years. In this paper, an automatic container-code recognition system based on computer vision and deep neural networks is proposed. The system consists of two modules, detection module and recognition module. The detection module applies both algorithms based on computer vision and neural networks, and generates a better detection result through combination to avoid the drawbacks of the two methods. The combined detection results are also collected for online training of the neural networks. The recognition module exploits both character segmentation and end-to-end recognition, and outputs the recognition result which passes the verification. When the recognition module generates false recognition, the result will be corrected and collected for online training of the end-to-end recognition sub-module. By combining several algorithms, the system is able to deal with more situations, and the online training mechanism can improve the performance of the neural networks at runtime. The proposed system is able to achieve 93% of overall recognition accuracy.
Unvoiced Speech Recognition Using Tissue-Conductive Acoustic Sensor
NASA Astrophysics Data System (ADS)
Heracleous, Panikos; Kaino, Tomomi; Saruwatari, Hiroshi; Shikano, Kiyohiro
2006-12-01
We present the use of stethoscope and silicon NAM (nonaudible murmur) microphones in automatic speech recognition. NAM microphones are special acoustic sensors, which are attached behind the talker's ear and can capture not only normal (audible) speech, but also very quietly uttered speech (nonaudible murmur). As a result, NAM microphones can be applied in automatic speech recognition systems when privacy is desired in human-machine communication. Moreover, NAM microphones show robustness against noise and they might be used in special systems (speech recognition, speech transform, etc.) for sound-impaired people. Using adaptation techniques and a small amount of training data, we achieved for a 20 k dictation task a[InlineEquation not available: see fulltext.] word accuracy for nonaudible murmur recognition in a clean environment. In this paper, we also investigate nonaudible murmur recognition in noisy environments and the effect of the Lombard reflex on nonaudible murmur recognition. We also propose three methods to integrate audible speech and nonaudible murmur recognition using a stethoscope NAM microphone with very promising results.
NASA Astrophysics Data System (ADS)
Sanger, Demas S.; Haneishi, Hideaki; Miyake, Yoichi
1995-08-01
This paper proposed a simple and automatic method for recognizing the light sources from various color negative film brands by means of digital image processing. First, we stretched the image obtained from a negative based on the standardized scaling factors, then extracted the dominant color component among red, green, and blue components of the stretched image. The dominant color component became the discriminator for the recognition. The experimental results verified that any one of the three techniques could recognize the light source from negatives of any film brands and all brands greater than 93.2 and 96.6% correct recognitions, respectively. This method is significant for the automation of color quality control in color reproduction from color negative film in mass processing and printing machine.
NASA Astrophysics Data System (ADS)
Zhang, Shijun; Jing, Zhongliang; Li, Jianxun
2005-01-01
The rotation invariant feature of the target is obtained using the multi-direction feature extraction property of the steerable filter. Combining the morphological operation top-hat transform with the self-organizing feature map neural network, the adaptive topological region is selected. Using the erosion operation, the topological region shrinkage is achieved. The steerable filter based morphological self-organizing feature map neural network is applied to automatic target recognition of binary standard patterns and real-world infrared sequence images. Compared with Hamming network and morphological shared-weight networks respectively, the higher recognition correct rate, robust adaptability, quick training, and better generalization of the proposed method are achieved.
The Automatic Recognition of the Abnormal Sky-subtraction Spectra Based on Hadoop
NASA Astrophysics Data System (ADS)
An, An; Pan, Jingchang
2017-10-01
The skylines, superimposing on the target spectrum as a main noise, If the spectrum still contains a large number of high strength skylight residuals after sky-subtraction processing, it will not be conducive to the follow-up analysis of the target spectrum. At the same time, the LAMOST can observe a quantity of spectroscopic data in every night. We need an efficient platform to proceed the recognition of the larger numbers of abnormal sky-subtraction spectra quickly. Hadoop, as a distributed parallel data computing platform, can deal with large amounts of data effectively. In this paper, we conduct the continuum normalization firstly and then a simple and effective method will be presented to automatic recognize the abnormal sky-subtraction spectra based on Hadoop platform. Obtain through the experiment, the Hadoop platform can implement the recognition with more speed and efficiency, and the simple method can recognize the abnormal sky-subtraction spectra and find the abnormal skyline positions of different residual strength effectively, can be applied to the automatic detection of abnormal sky-subtraction of large number of spectra.
Automatic detection and recognition of signs from natural scenes.
Chen, Xilin; Yang, Jie; Zhang, Jing; Waibel, Alex
2004-01-01
In this paper, we present an approach to automatic detection and recognition of signs from natural scenes, and its application to a sign translation task. The proposed approach embeds multiresolution and multiscale edge detection, adaptive searching, color analysis, and affine rectification in a hierarchical framework for sign detection, with different emphases at each phase to handle the text in different sizes, orientations, color distributions and backgrounds. We use affine rectification to recover deformation of the text regions caused by an inappropriate camera view angle. The procedure can significantly improve text detection rate and optical character recognition (OCR) accuracy. Instead of using binary information for OCR, we extract features from an intensity image directly. We propose a local intensity normalization method to effectively handle lighting variations, followed by a Gabor transform to obtain local features, and finally a linear discriminant analysis (LDA) method for feature selection. We have applied the approach in developing a Chinese sign translation system, which can automatically detect and recognize Chinese signs as input from a camera, and translate the recognized text into English.
Human Activity Recognition in AAL Environments Using Random Projections.
Damaševičius, Robertas; Vasiljevas, Mindaugas; Šalkevičius, Justas; Woźniak, Marcin
2016-01-01
Automatic human activity recognition systems aim to capture the state of the user and its environment by exploiting heterogeneous sensors attached to the subject's body and permit continuous monitoring of numerous physiological signals reflecting the state of human actions. Successful identification of human activities can be immensely useful in healthcare applications for Ambient Assisted Living (AAL), for automatic and intelligent activity monitoring systems developed for elderly and disabled people. In this paper, we propose the method for activity recognition and subject identification based on random projections from high-dimensional feature space to low-dimensional projection space, where the classes are separated using the Jaccard distance between probability density functions of projected data. Two HAR domain tasks are considered: activity identification and subject identification. The experimental results using the proposed method with Human Activity Dataset (HAD) data are presented.
Human Activity Recognition in AAL Environments Using Random Projections
Damaševičius, Robertas; Vasiljevas, Mindaugas; Šalkevičius, Justas; Woźniak, Marcin
2016-01-01
Automatic human activity recognition systems aim to capture the state of the user and its environment by exploiting heterogeneous sensors attached to the subject's body and permit continuous monitoring of numerous physiological signals reflecting the state of human actions. Successful identification of human activities can be immensely useful in healthcare applications for Ambient Assisted Living (AAL), for automatic and intelligent activity monitoring systems developed for elderly and disabled people. In this paper, we propose the method for activity recognition and subject identification based on random projections from high-dimensional feature space to low-dimensional projection space, where the classes are separated using the Jaccard distance between probability density functions of projected data. Two HAR domain tasks are considered: activity identification and subject identification. The experimental results using the proposed method with Human Activity Dataset (HAD) data are presented. PMID:27413392
NASA Technical Reports Server (NTRS)
Tescher, Andrew G. (Editor)
1989-01-01
Various papers on image compression and automatic target recognition are presented. Individual topics addressed include: target cluster detection in cluttered SAR imagery, model-based target recognition using laser radar imagery, Smart Sensor front-end processor for feature extraction of images, object attitude estimation and tracking from a single video sensor, symmetry detection in human vision, analysis of high resolution aerial images for object detection, obscured object recognition for an ATR application, neural networks for adaptive shape tracking, statistical mechanics and pattern recognition, detection of cylinders in aerial range images, moving object tracking using local windows, new transform method for image data compression, quad-tree product vector quantization of images, predictive trellis encoding of imagery, reduced generalized chain code for contour description, compact architecture for a real-time vision system, use of human visibility functions in segmentation coding, color texture analysis and synthesis using Gibbs random fields.
Automatic speech recognition technology development at ITT Defense Communications Division
NASA Technical Reports Server (NTRS)
White, George M.
1977-01-01
An assessment of the applications of automatic speech recognition to defense communication systems is presented. Future research efforts include investigations into the following areas: (1) dynamic programming; (2) recognition of speech degraded by noise; (3) speaker independent recognition; (4) large vocabulary recognition; (5) word spotting and continuous speech recognition; and (6) isolated word recognition.
Military applications of automatic speech recognition and future requirements
NASA Technical Reports Server (NTRS)
Beek, Bruno; Cupples, Edward J.
1977-01-01
An updated summary of the state-of-the-art of automatic speech recognition and its relevance to military applications is provided. A number of potential systems for military applications are under development. These include: (1) digital narrowband communication systems; (2) automatic speech verification; (3) on-line cartographic processing unit; (4) word recognition for militarized tactical data system; and (5) voice recognition and synthesis for aircraft cockpit.
Multiclassifier information fusion methods for microarray pattern recognition
NASA Astrophysics Data System (ADS)
Braun, Jerome J.; Glina, Yan; Judson, Nicholas; Herzig-Marx, Rachel
2004-04-01
This paper addresses automatic recognition of microarray patterns, a capability that could have a major significance for medical diagnostics, enabling development of diagnostic tools for automatic discrimination of specific diseases. The paper presents multiclassifier information fusion methods for microarray pattern recognition. The input space partitioning approach based on fitness measures that constitute an a-priori gauging of classification efficacy for each subspace is investigated. Methods for generation of fitness measures, generation of input subspaces and their use in the multiclassifier fusion architecture are presented. In particular, two-level quantification of fitness that accounts for the quality of each subspace as well as the quality of individual neighborhoods within the subspace is described. Individual-subspace classifiers are Support Vector Machine based. The decision fusion stage fuses the information from mulitple SVMs along with the multi-level fitness information. Final decision fusion stage techniques, including weighted fusion as well as Dempster-Shafer theory based fusion are investigated. It should be noted that while the above methods are discussed in the context of microarray pattern recognition, they are applicable to a broader range of discrimination problems, in particular to problems involving a large number of information sources irreducible to a low-dimensional feature space.
Method for automatic detection of wheezing in lung sounds.
Riella, R J; Nohama, P; Maia, J M
2009-07-01
The present report describes the development of a technique for automatic wheezing recognition in digitally recorded lung sounds. This method is based on the extraction and processing of spectral information from the respiratory cycle and the use of these data for user feedback and automatic recognition. The respiratory cycle is first pre-processed, in order to normalize its spectral information, and its spectrogram is then computed. After this procedure, the spectrogram image is processed by a two-dimensional convolution filter and a half-threshold in order to increase the contrast and isolate its highest amplitude components, respectively. Thus, in order to generate more compressed data to automatic recognition, the spectral projection from the processed spectrogram is computed and stored as an array. The higher magnitude values of the array and its respective spectral values are then located and used as inputs to a multi-layer perceptron artificial neural network, which results an automatic indication about the presence of wheezes. For validation of the methodology, lung sounds recorded from three different repositories were used. The results show that the proposed technique achieves 84.82% accuracy in the detection of wheezing for an isolated respiratory cycle and 92.86% accuracy for the detection of wheezes when detection is carried out using groups of respiratory cycles obtained from the same person. Also, the system presents the original recorded sound and the post-processed spectrogram image for the user to draw his own conclusions from the data.
NASA Astrophysics Data System (ADS)
El Bekri, Nadia; Angele, Susanne; Ruckhäberle, Martin; Peinsipp-Byma, Elisabeth; Haelke, Bruno
2015-10-01
This paper introduces an interactive recognition assistance system for imaging reconnaissance. This system supports aerial image analysts on missions during two main tasks: Object recognition and infrastructure analysis. Object recognition concentrates on the classification of one single object. Infrastructure analysis deals with the description of the components of an infrastructure and the recognition of the infrastructure type (e.g. military airfield). Based on satellite or aerial images, aerial image analysts are able to extract single object features and thereby recognize different object types. It is one of the most challenging tasks in the imaging reconnaissance. Currently, there are no high potential ATR (automatic target recognition) applications available, as consequence the human observer cannot be replaced entirely. State-of-the-art ATR applications cannot assume in equal measure human perception and interpretation. Why is this still such a critical issue? First, cluttered and noisy images make it difficult to automatically extract, classify and identify object types. Second, due to the changed warfare and the rise of asymmetric threats it is nearly impossible to create an underlying data set containing all features, objects or infrastructure types. Many other reasons like environmental parameters or aspect angles compound the application of ATR supplementary. Due to the lack of suitable ATR procedures, the human factor is still important and so far irreplaceable. In order to use the potential benefits of the human perception and computational methods in a synergistic way, both are unified in an interactive assistance system. RecceMan® (Reconnaissance Manual) offers two different modes for aerial image analysts on missions: the object recognition mode and the infrastructure analysis mode. The aim of the object recognition mode is to recognize a certain object type based on the object features that originated from the image signatures. The infrastructure analysis mode pursues the goal to analyze the function of the infrastructure. The image analyst extracts visually certain target object signatures, assigns them to corresponding object features and is finally able to recognize the object type. The system offers him the possibility to assign the image signatures to features given by sample images. The underlying data set contains a wide range of objects features and object types for different domains like ships or land vehicles. Each domain has its own feature tree developed by aerial image analyst experts. By selecting the corresponding features, the possible solution set of objects is automatically reduced and matches only the objects that contain the selected features. Moreover, we give an outlook of current research in the field of ground target analysis in which we deal with partly automated methods to extract image signatures and assign them to the corresponding features. This research includes methods for automatically determining the orientation of an object and geometric features like width and length of the object. This step enables to reduce automatically the possible object types offered to the image analyst by the interactive recognition assistance system.
Terminologies for text-mining; an experiment in the lipoprotein metabolism domain
Alexopoulou, Dimitra; Wächter, Thomas; Pickersgill, Laura; Eyre, Cecilia; Schroeder, Michael
2008-01-01
Background The engineering of ontologies, especially with a view to a text-mining use, is still a new research field. There does not yet exist a well-defined theory and technology for ontology construction. Many of the ontology design steps remain manual and are based on personal experience and intuition. However, there exist a few efforts on automatic construction of ontologies in the form of extracted lists of terms and relations between them. Results We share experience acquired during the manual development of a lipoprotein metabolism ontology (LMO) to be used for text-mining. We compare the manually created ontology terms with the automatically derived terminology from four different automatic term recognition (ATR) methods. The top 50 predicted terms contain up to 89% relevant terms. For the top 1000 terms the best method still generates 51% relevant terms. In a corpus of 3066 documents 53% of LMO terms are contained and 38% can be generated with one of the methods. Conclusions Given high precision, automatic methods can help decrease development time and provide significant support for the identification of domain-specific vocabulary. The coverage of the domain vocabulary depends strongly on the underlying documents. Ontology development for text mining should be performed in a semi-automatic way; taking ATR results as input and following the guidelines we described. Availability The TFIDF term recognition is available as Web Service, described at PMID:18460175
Background feature descriptor for offline handwritten numeral recognition
NASA Astrophysics Data System (ADS)
Ming, Delie; Wang, Hao; Tian, Tian; Jie, Feiran; Lei, Bo
2011-11-01
This paper puts forward an offline handwritten numeral recognition method based on background structural descriptor (sixteen-value numerical background expression). Through encoding the background pixels in the image according to a certain rule, 16 different eigenvalues were generated, which reflected the background condition of every digit, then reflected the structural features of the digits. Through pattern language description of images by these features, automatic segmentation of overlapping digits and numeral recognition can be realized. This method is characterized by great deformation resistant ability, high recognition speed and easy realization. Finally, the experimental results and conclusions are presented. The experimental results of recognizing datasets from various practical application fields reflect that with this method, a good recognition effect can be achieved.
ERIC Educational Resources Information Center
Young, Victoria; Mihailidis, Alex
2010-01-01
Despite their growing presence in home computer applications and various telephony services, commercial automatic speech recognition technologies are still not easily employed by everyone; especially individuals with speech disorders. In addition, relatively little research has been conducted on automatic speech recognition performance with older…
NASA Astrophysics Data System (ADS)
Yan, Fengxia; Udupa, Jayaram K.; Tong, Yubing; Xu, Guoping; Odhner, Dewey; Torigian, Drew A.
2018-03-01
The recently developed body-wide Automatic Anatomy Recognition (AAR) methodology depends on fuzzy modeling of individual objects, hierarchically arranging objects, constructing an anatomy ensemble of these models, and a dichotomous object recognition-delineation process. The parent-to-offspring spatial relationship in the object hierarchy is crucial in the AAR method. We have found this relationship to be quite complex, and as such any improvement in capturing this relationship information in the anatomy model will improve the process of recognition itself. Currently, the method encodes this relationship based on the layout of the geometric centers of the objects. Motivated by the concept of virtual landmarks (VLs), this paper presents a new one-shot AAR recognition method that utilizes the VLs to learn object relationships by training a neural network to predict the pose and the VLs of an offspring object given the VLs of the parent object in the hierarchy. We set up two neural networks for each parent-offspring object pair in a body region, one for predicting the VLs and another for predicting the pose parameters. The VL-based learning/prediction method is evaluated on two object hierarchies involving 14 objects. We utilize 54 computed tomography (CT) image data sets of head and neck cancer patients and the associated object contours drawn by dosimetrists for routine radiation therapy treatment planning. The VL neural network method is found to yield more accurate object localization than the currently used simple AAR method.
Support vector machine for automatic pain recognition
NASA Astrophysics Data System (ADS)
Monwar, Md Maruf; Rezaei, Siamak
2009-02-01
Facial expressions are a key index of emotion and the interpretation of such expressions of emotion is critical to everyday social functioning. In this paper, we present an efficient video analysis technique for recognition of a specific expression, pain, from human faces. We employ an automatic face detector which detects face from the stored video frame using skin color modeling technique. For pain recognition, location and shape features of the detected faces are computed. These features are then used as inputs to a support vector machine (SVM) for classification. We compare the results with neural network based and eigenimage based automatic pain recognition systems. The experiment results indicate that using support vector machine as classifier can certainly improve the performance of automatic pain recognition system.
Automatic Cataloguing and Searching for Retrospective Data by Use of OCR Text.
ERIC Educational Resources Information Center
Tseng, Yuen-Hsien
2001-01-01
Describes efforts in supporting information retrieval from OCR (optical character recognition) degraded text. Reports on approaches used in an automatic cataloging and searching contest for books in multiple languages, including a vector space retrieval model, an n-gram indexing method, and a weighting scheme; and discusses problems of Asian…
RFID: A Revolution in Automatic Data Recognition
ERIC Educational Resources Information Center
Deal, Walter F., III
2004-01-01
Radio frequency identification, or RFID, is a generic term for technologies that use radio waves to automatically identify people or objects. There are several methods of identification, but the most common is to store a serial number that identifies a person or object, and perhaps other information, on a microchip that is attached to an antenna…
Terrain type recognition using ERTS-1 MSS images
NASA Technical Reports Server (NTRS)
Gramenopoulos, N.
1973-01-01
For the automatic recognition of earth resources from ERTS-1 digital tapes, both multispectral and spatial pattern recognition techniques are important. Recognition of terrain types is based on spatial signatures that become evident by processing small portions of an image through selected algorithms. An investigation of spatial signatures that are applicable to ERTS-1 MSS images is described. Artifacts in the spatial signatures seem to be related to the multispectral scanner. A method for suppressing such artifacts is presented. Finally, results of terrain type recognition for one ERTS-1 image are presented.
Automatic anatomy recognition via multiobject oriented active shape models.
Chen, Xinjian; Udupa, Jayaram K; Alavi, Abass; Torigian, Drew A
2010-12-01
This paper studies the feasibility of developing an automatic anatomy recognition (AAR) system in clinical radiology and demonstrates its operation on clinical 2D images. The anatomy recognition method described here consists of two main components: (a) multiobject generalization of OASM and (b) object recognition strategies. The OASM algorithm is generalized to multiple objects by including a model for each object and assigning a cost structure specific to each object in the spirit of live wire. The delineation of multiobject boundaries is done in MOASM via a three level dynamic programming algorithm, wherein the first level is at pixel level which aims to find optimal oriented boundary segments between successive landmarks, the second level is at landmark level which aims to find optimal location for the landmarks, and the third level is at the object level which aims to find optimal arrangement of object boundaries over all objects. The object recognition strategy attempts to find that pose vector (consisting of translation, rotation, and scale component) for the multiobject model that yields the smallest total boundary cost for all objects. The delineation and recognition accuracies were evaluated separately utilizing routine clinical chest CT, abdominal CT, and foot MRI data sets. The delineation accuracy was evaluated in terms of true and false positive volume fractions (TPVF and FPVF). The recognition accuracy was assessed (1) in terms of the size of the space of the pose vectors for the model assembly that yielded high delineation accuracy, (2) as a function of the number of objects and objects' distribution and size in the model, (3) in terms of the interdependence between delineation and recognition, and (4) in terms of the closeness of the optimum recognition result to the global optimum. When multiple objects are included in the model, the delineation accuracy in terms of TPVF can be improved to 97%-98% with a low FPVF of 0.1%-0.2%. Typically, a recognition accuracy of > or = 90% yielded a TPVF > or = 95% and FPVF < or = 0.5%. Over the three data sets and over all tested objects, in 97% of the cases, the optimal solutions found by the proposed method constituted the true global optimum. The experimental results showed the feasibility and efficacy of the proposed automatic anatomy recognition system. Increasing the number of objects in the model can significantly improve both recognition and delineation accuracy. More spread out arrangement of objects in the model can lead to improved recognition and delineation accuracy. Including larger objects in the model also improved recognition and delineation. The proposed method almost always finds globally optimum solutions.
NASA Astrophysics Data System (ADS)
Kuznetsov, Michael V.
2006-05-01
For reliable teamwork of various systems of automatic telecommunication including transferring systems of optical communication networks it is necessary authentic recognition of signals for one- or two-frequency service signal system. The analysis of time parameters of an accepted signal allows increasing reliability of detection and recognition of the service signal system on a background of speech.
ERIC Educational Resources Information Center
Wigmore, Angela; Hunter, Gordon; Pflugel, Eckhard; Denholm-Price, James; Binelli, Vincent
2009-01-01
Speech technology--especially automatic speech recognition--has now advanced to a level where it can be of great benefit both to able-bodied people and those with various disabilities. In this paper we describe an application "TalkMaths" which, using the output from a commonly-used conventional automatic speech recognition system,…
Yang, Cheng-Huei; Luo, Ching-Hsing; Yang, Cheng-Hong; Chuang, Li-Yeh
2004-01-01
Morse code is now being harnessed for use in rehabilitation applications of augmentative-alternative communication and assistive technology, including mobility, environmental control and adapted worksite access. In this paper, Morse code is selected as a communication adaptive device for disabled persons who suffer from muscle atrophy, cerebral palsy or other severe handicaps. A stable typing rate is strictly required for Morse code to be effective as a communication tool. This restriction is a major hindrance. Therefore, a switch adaptive automatic recognition method with a high recognition rate is needed. The proposed system combines counter-propagation networks with a variable degree variable step size LMS algorithm. It is divided into five stages: space recognition, tone recognition, learning process, adaptive processing, and character recognition. Statistical analyses demonstrated that the proposed method elicited a better recognition rate in comparison to alternative methods in the literature.
Definition and automatic anatomy recognition of lymph node zones in the pelvis on CT images
NASA Astrophysics Data System (ADS)
Liu, Yu; Udupa, Jayaram K.; Odhner, Dewey; Tong, Yubing; Guo, Shuxu; Attor, Rosemary; Reinicke, Danica; Torigian, Drew A.
2016-03-01
Currently, unlike IALSC-defined thoracic lymph node zones, no explicitly provided definitions for lymph nodes in other body regions are available. Yet, definitions are critical for standardizing the recognition, delineation, quantification, and reporting of lymphadenopathy in other body regions. Continuing from our previous work in the thorax, this paper proposes a standardized definition of the grouping of pelvic lymph nodes into 10 zones. We subsequently employ our earlier Automatic Anatomy Recognition (AAR) framework designed for body-wide organ modeling, recognition, and delineation to actually implement these zonal definitions where the zones are treated as anatomic objects. First, all 10 zones and key anatomic organs used as anchors are manually delineated under expert supervision for constructing fuzzy anatomy models of the assembly of organs together with the zones. Then, optimal hierarchical arrangement of these objects is constructed for the purpose of achieving the best zonal recognition. For actual localization of the objects, two strategies are used -- optimal thresholded search for organs and one-shot method for the zones where the known relationship of the zones to key organs is exploited. Based on 50 computed tomography (CT) image data sets for the pelvic body region and an equal division into training and test subsets, automatic zonal localization within 1-3 voxels is achieved.
Automatic recognition of fundamental tissues on histology images of the human cardiovascular system.
Mazo, Claudia; Trujillo, Maria; Alegre, Enrique; Salazar, Liliana
2016-10-01
Cardiovascular disease is the leading cause of death worldwide. Therefore, techniques for improving diagnosis and treatment in this field have become key areas for research. In particular, approaches for tissue image processing may support education system and medical practice. In this paper, an approach to automatic recognition and classification of fundamental tissues, using morphological information is presented. Taking a 40× or 10× histological image as input, three clusters are created with the k-means algorithm using a structural tensor and the red and the green channels. Loose connective tissue, light regions and cell nuclei are recognised on 40× images. Then, the cell nuclei's features - shape and spatial projection - and light regions are used to recognise and classify epithelial cells and tissue into flat, cubic and cylindrical. In a similar way, light regions, loose connective and muscle tissues are recognised on 10× images. Finally, the tissue's function and composition are used to refine muscle tissue recognition. Experimental validation is then carried out by histologist following expert criteria, along with manually annotated images that are used as a ground-truth. The results revealed that the proposed approach classified the fundamental tissues in a similar way to the conventional method employed by histologists. The proposed automatic recognition approach provides for epithelial tissues a sensitivity of 0.79 for cubic, 0.85 for cylindrical and 0.91 for flat. Furthermore, the experts gave our method an average score of 4.85 out of 5 in the recognition of loose connective tissue and 4.82 out of 5 for muscle tissue recognition. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Fang, Leyuan; Yang, Liumao; Li, Shutao; Rabbani, Hossein; Liu, Zhimin; Peng, Qinghua; Chen, Xiangdong
2017-06-01
Detection and recognition of macular lesions in optical coherence tomography (OCT) are very important for retinal diseases diagnosis and treatment. As one kind of retinal disease (e.g., diabetic retinopathy) may contain multiple lesions (e.g., edema, exudates, and microaneurysms) and eye patients may suffer from multiple retinal diseases, multiple lesions often coexist within one retinal image. Therefore, one single-lesion-based detector may not support the diagnosis of clinical eye diseases. To address this issue, we propose a multi-instance multilabel-based lesions recognition (MIML-LR) method for the simultaneous detection and recognition of multiple lesions. The proposed MIML-LR method consists of the following steps: (1) segment the regions of interest (ROIs) for different lesions, (2) compute descriptive instances (features) for each lesion region, (3) construct multilabel detectors, and (4) recognize each ROI with the detectors. The proposed MIML-LR method was tested on 823 clinically labeled OCT images with normal macular and macular with three common lesions: epiretinal membrane, edema, and drusen. For each input OCT image, our MIML-LR method can automatically identify the number of lesions and assign the class labels, achieving the average accuracy of 88.72% for the cases with multiple lesions, which better assists macular disease diagnosis and treatment.
Can a CNN recognize Catalan diet?
NASA Astrophysics Data System (ADS)
Herruzo, P.; Bolaños, M.; Radeva, P.
2016-10-01
Nowadays, we can find several diseases related to the unhealthy diet habits of the population, such as diabetes, obesity, anemia, bulimia and anorexia. In many cases, these diseases are related to the food consumption of people. Mediterranean diet is scientifically known as a healthy diet that helps to prevent many metabolic diseases. In particular, our work focuses on the recognition of Mediterranean food and dishes. The development of this methodology would allow to analise the daily habits of users with wearable cameras, within the topic of lifelogging. By using automatic mechanisms we could build an objective tool for the analysis of the patient's behavior, allowing specialists to discover unhealthy food patterns and understand the user's lifestyle. With the aim to automatically recognize a complete diet, we introduce a challenging multi-labeled dataset related to Mediter-ranean diet called FoodCAT. The first type of label provided consists of 115 food classes with an average of 400 images per dish, and the second one consists of 12 food categories with an average of 3800 pictures per class. This dataset will serve as a basis for the development of automatic diet recognition. In this context, deep learning and more specifically, Convolutional Neural Networks (CNNs), currently are state-of-the-art methods for automatic food recognition. In our work, we compare several architectures for image classification, with the purpose of diet recognition. Applying the best model for recognising food categories, we achieve a top-1 accuracy of 72.29%, and top-5 of 97.07%. In a complete diet recognition of dishes from Mediterranean diet, enlarged with the Food-101 dataset for international dishes recognition, we achieve a top-1 accuracy of 68.07%, and top-5 of 89.53%, for a total of 115+101 food classes.
Vieira, Manuel; Fonseca, Paulo J; Amorim, M Clara P; Teixeira, Carlos J C
2015-12-01
The study of acoustic communication in animals often requires not only the recognition of species specific acoustic signals but also the identification of individual subjects, all in a complex acoustic background. Moreover, when very long recordings are to be analyzed, automatic recognition and identification processes are invaluable tools to extract the relevant biological information. A pattern recognition methodology based on hidden Markov models is presented inspired by successful results obtained in the most widely known and complex acoustical communication signal: human speech. This methodology was applied here for the first time to the detection and recognition of fish acoustic signals, specifically in a stream of round-the-clock recordings of Lusitanian toadfish (Halobatrachus didactylus) in their natural estuarine habitat. The results show that this methodology is able not only to detect the mating sounds (boatwhistles) but also to identify individual male toadfish, reaching an identification rate of ca. 95%. Moreover this method also proved to be a powerful tool to assess signal durations in large data sets. However, the system failed in recognizing other sound types.
Improved automatic adjustment of density and contrast in FCR system using neural network
NASA Astrophysics Data System (ADS)
Takeo, Hideya; Nakajima, Nobuyoshi; Ishida, Masamitsu; Kato, Hisatoyo
1994-05-01
FCR system has an automatic adjustment of image density and contrast by analyzing the histogram of image data in the radiation field. Advanced image recognition methods proposed in this paper can improve the automatic adjustment performance, in which neural network technology is used. There are two methods. Both methods are basically used 3-layer neural network with back propagation. The image data are directly input to the input-layer in one method and the histogram data is input in the other method. The former is effective to the imaging menu such as shoulder joint in which the position of interest region occupied on the histogram changes by difference of positioning and the latter is effective to the imaging menu such as chest-pediatrics in which the histogram shape changes by difference of positioning. We experimentally confirm the validity of these methods (about the automatic adjustment performance) as compared with the conventional histogram analysis methods.
Automatic 2.5-D Facial Landmarking and Emotion Annotation for Social Interaction Assistance.
Zhao, Xi; Zou, Jianhua; Li, Huibin; Dellandrea, Emmanuel; Kakadiaris, Ioannis A; Chen, Liming
2016-09-01
People with low vision, Alzheimer's disease, and autism spectrum disorder experience difficulties in perceiving or interpreting facial expression of emotion in their social lives. Though automatic facial expression recognition (FER) methods on 2-D videos have been extensively investigated, their performance was constrained by challenges in head pose and lighting conditions. The shape information in 3-D facial data can reduce or even overcome these challenges. However, high expenses of 3-D cameras prevent their widespread use. Fortunately, 2.5-D facial data from emerging portable RGB-D cameras provide a good balance for this dilemma. In this paper, we propose an automatic emotion annotation solution on 2.5-D facial data collected from RGB-D cameras. The solution consists of a facial landmarking method and a FER method. Specifically, we propose building a deformable partial face model and fit the model to a 2.5-D face for localizing facial landmarks automatically. In FER, a novel action unit (AU) space-based FER method has been proposed. Facial features are extracted using landmarks and further represented as coordinates in the AU space, which are classified into facial expressions. Evaluated on three publicly accessible facial databases, namely EURECOM, FRGC, and Bosphorus databases, the proposed facial landmarking and expression recognition methods have achieved satisfactory results. Possible real-world applications using our algorithms have also been discussed.
Activity Recognition for Personal Time Management
NASA Astrophysics Data System (ADS)
Prekopcsák, Zoltán; Soha, Sugárka; Henk, Tamás; Gáspár-Papanek, Csaba
We describe an accelerometer based activity recognition system for mobile phones with a special focus on personal time management. We compare several data mining algorithms for the automatic recognition task in the case of single user and multiuser scenario, and improve accuracy with heuristics and advanced data mining methods. The results show that daily activities can be recognized with high accuracy and the integration with the RescueTime software can give good insights for personal time management.
Niioka, Hirohiko; Asatani, Satoshi; Yoshimura, Aina; Ohigashi, Hironori; Tagawa, Seiichi; Miyake, Jun
2018-01-01
In the field of regenerative medicine, tremendous numbers of cells are necessary for tissue/organ regeneration. Today automatic cell-culturing system has been developed. The next step is constructing a non-invasive method to monitor the conditions of cells automatically. As an image analysis method, convolutional neural network (CNN), one of the deep learning method, is approaching human recognition level. We constructed and applied the CNN algorithm for automatic cellular differentiation recognition of myogenic C2C12 cell line. Phase-contrast images of cultured C2C12 are prepared as input dataset. In differentiation process from myoblasts to myotubes, cellular morphology changes from round shape to elongated tubular shape due to fusion of the cells. CNN abstract the features of the shape of the cells and classify the cells depending on the culturing days from when differentiation is induced. Changes in cellular shape depending on the number of days of culture (Day 0, Day 3, Day 6) are classified with 91.3% accuracy. Image analysis with CNN has a potential to realize regenerative medicine industry.
Intelligent form removal with character stroke preservation
NASA Astrophysics Data System (ADS)
Garris, Michael D.
1996-03-01
A new technique for intelligent form removal has been developed along with a new method for evaluating its impact on optical character recognition (OCR). All the dominant lines in the image are automatically detected using the Hough line transform and intelligently erased while simultaneously preserving overlapping character strokes by computing line width statistics and keying off of certain visual cues. This new method of form removal operates on loosely defined zones with no image deskewing. Any field in which the writer is provided a horizontal line to enter a response can be processed by this method. Several examples of processed fields are provided, including a comparison of results between the new method and a commercially available forms removal package. Even if this new form removal method did not improve character recognition accuracy, it is still a significant improvement to the technology because the requirement of a priori knowledge of the form's geometric details has been greatly reduced. This relaxes the recognition system's dependence on rigid form design, printing, and reproduction by automatically detecting and removing some of the physical structures (lines) on the form. Using the National Institute of Standards and Technology (NIST) public domain form-based handprint recognition system, the technique was tested on a large number of fields containing randomly ordered handprinted lowercase alphabets, as these letters (especially those with descenders) frequently touch and extend through the line along which they are written. Preserving character strokes improves overall lowercase recognition performance by 3%, which is a net improvement, but a single performance number like this doesn't communicate how the recognition process was really influenced. There is expected to be trade- offs with the introduction of any new technique into a complex recognition system. To understand both the improvements and the trade-offs, a new analysis was designed to compare the statistical distributions of individual confusion pairs between two systems. As OCR technology continues to improve, sophisticated analyses like this are necessary to reduce the errors remaining in complex recognition problems.
Arrhythmia Classification Based on Multi-Domain Feature Extraction for an ECG Recognition System.
Li, Hongqiang; Yuan, Danyang; Wang, Youxi; Cui, Dianyin; Cao, Lu
2016-10-20
Automatic recognition of arrhythmias is particularly important in the diagnosis of heart diseases. This study presents an electrocardiogram (ECG) recognition system based on multi-domain feature extraction to classify ECG beats. An improved wavelet threshold method for ECG signal pre-processing is applied to remove noise interference. A novel multi-domain feature extraction method is proposed; this method employs kernel-independent component analysis in nonlinear feature extraction and uses discrete wavelet transform to extract frequency domain features. The proposed system utilises a support vector machine classifier optimized with a genetic algorithm to recognize different types of heartbeats. An ECG acquisition experimental platform, in which ECG beats are collected as ECG data for classification, is constructed to demonstrate the effectiveness of the system in ECG beat classification. The presented system, when applied to the MIT-BIH arrhythmia database, achieves a high classification accuracy of 98.8%. Experimental results based on the ECG acquisition experimental platform show that the system obtains a satisfactory classification accuracy of 97.3% and is able to classify ECG beats efficiently for the automatic identification of cardiac arrhythmias.
Arrhythmia Classification Based on Multi-Domain Feature Extraction for an ECG Recognition System
Li, Hongqiang; Yuan, Danyang; Wang, Youxi; Cui, Dianyin; Cao, Lu
2016-01-01
Automatic recognition of arrhythmias is particularly important in the diagnosis of heart diseases. This study presents an electrocardiogram (ECG) recognition system based on multi-domain feature extraction to classify ECG beats. An improved wavelet threshold method for ECG signal pre-processing is applied to remove noise interference. A novel multi-domain feature extraction method is proposed; this method employs kernel-independent component analysis in nonlinear feature extraction and uses discrete wavelet transform to extract frequency domain features. The proposed system utilises a support vector machine classifier optimized with a genetic algorithm to recognize different types of heartbeats. An ECG acquisition experimental platform, in which ECG beats are collected as ECG data for classification, is constructed to demonstrate the effectiveness of the system in ECG beat classification. The presented system, when applied to the MIT-BIH arrhythmia database, achieves a high classification accuracy of 98.8%. Experimental results based on the ECG acquisition experimental platform show that the system obtains a satisfactory classification accuracy of 97.3% and is able to classify ECG beats efficiently for the automatic identification of cardiac arrhythmias. PMID:27775596
Automatic Artifact Removal from Electroencephalogram Data Based on A Priori Artifact Information.
Zhang, Chi; Tong, Li; Zeng, Ying; Jiang, Jingfang; Bu, Haibing; Yan, Bin; Li, Jianxin
2015-01-01
Electroencephalogram (EEG) is susceptible to various nonneural physiological artifacts. Automatic artifact removal from EEG data remains a key challenge for extracting relevant information from brain activities. To adapt to variable subjects and EEG acquisition environments, this paper presents an automatic online artifact removal method based on a priori artifact information. The combination of discrete wavelet transform and independent component analysis (ICA), wavelet-ICA, was utilized to separate artifact components. The artifact components were then automatically identified using a priori artifact information, which was acquired in advance. Subsequently, signal reconstruction without artifact components was performed to obtain artifact-free signals. The results showed that, using this automatic online artifact removal method, there were statistical significant improvements of the classification accuracies in both two experiments, namely, motor imagery and emotion recognition.
Automatic Artifact Removal from Electroencephalogram Data Based on A Priori Artifact Information
Zhang, Chi; Tong, Li; Zeng, Ying; Jiang, Jingfang; Bu, Haibing; Li, Jianxin
2015-01-01
Electroencephalogram (EEG) is susceptible to various nonneural physiological artifacts. Automatic artifact removal from EEG data remains a key challenge for extracting relevant information from brain activities. To adapt to variable subjects and EEG acquisition environments, this paper presents an automatic online artifact removal method based on a priori artifact information. The combination of discrete wavelet transform and independent component analysis (ICA), wavelet-ICA, was utilized to separate artifact components. The artifact components were then automatically identified using a priori artifact information, which was acquired in advance. Subsequently, signal reconstruction without artifact components was performed to obtain artifact-free signals. The results showed that, using this automatic online artifact removal method, there were statistical significant improvements of the classification accuracies in both two experiments, namely, motor imagery and emotion recognition. PMID:26380294
Semi-automatic mapping of cultural heritage from airborne laser scanning using deep learning
NASA Astrophysics Data System (ADS)
Due Trier, Øivind; Salberg, Arnt-Børre; Holger Pilø, Lars; Tonning, Christer; Marius Johansen, Hans; Aarsten, Dagrun
2016-04-01
This paper proposes to use deep learning to improve semi-automatic mapping of cultural heritage from airborne laser scanning (ALS) data. Automatic detection methods, based on traditional pattern recognition, have been applied in a number of cultural heritage mapping projects in Norway for the past five years. Automatic detection of pits and heaps have been combined with visual interpretation of the ALS data for the mapping of deer hunting systems, iron production sites, grave mounds and charcoal kilns. However, the performance of the automatic detection methods varies substantially between ALS datasets. For the mapping of deer hunting systems on flat gravel and sand sediment deposits, the automatic detection results were almost perfect. However, some false detections appeared in the terrain outside of the sediment deposits. These could be explained by other pit-like landscape features, like parts of river courses, spaces between boulders, and modern terrain modifications. However, these were easy to spot during visual interpretation, and the number of missed individual pitfall traps was still low. For the mapping of grave mounds, the automatic method produced a large number of false detections, reducing the usefulness of the semi-automatic approach. The mound structure is a very common natural terrain feature, and the grave mounds are less distinct in shape than the pitfall traps. Still, applying automatic mound detection on an entire municipality did lead to a new discovery of an Iron Age grave field with more than 15 individual mounds. Automatic mound detection also proved to be useful for a detailed re-mapping of Norway's largest Iron Age grave yard, which contains almost 1000 individual graves. Combined pit and mound detection has been applied to the mapping of more than 1000 charcoal kilns that were used by an iron work 350-200 years ago. The majority of charcoal kilns were indirectly detected as either pits on the circumference, a central mound, or both. However, kilns with a flat interior and a shallow ditch along the circumference were often missed by the automatic detection method. The successfulness of automatic detection seems to depend on two factors: (1) the density of ALS ground hits on the cultural heritage structures being sought, and (2) to what extent these structures stand out from natural terrain structures. The first factor may, to some extent, be improved by using a higher number of ALS pulses per square meter. The second factor is difficult to change, and also highlights another challenge: how to make a general automatic method that is applicable in all types of terrain within a country. The mixed experience with traditional pattern recognition for semi-automatic mapping of cultural heritage led us to consider deep learning as an alternative approach. The main principle is that a general feature detector has been trained on a large image database. The feature detector is then tailored to a specific task by using a modest number of images of true and false examples of the features being sought. Results of using deep learning are compared with previous results using traditional pattern recognition.
A multilingual gold-standard corpus for biomedical concept recognition: the Mantra GSC
Clematide, Simon; Akhondi, Saber A; van Mulligen, Erik M; Rebholz-Schuhmann, Dietrich
2015-01-01
Objective To create a multilingual gold-standard corpus for biomedical concept recognition. Materials and methods We selected text units from different parallel corpora (Medline abstract titles, drug labels, biomedical patent claims) in English, French, German, Spanish, and Dutch. Three annotators per language independently annotated the biomedical concepts, based on a subset of the Unified Medical Language System and covering a wide range of semantic groups. To reduce the annotation workload, automatically generated preannotations were provided. Individual annotations were automatically harmonized and then adjudicated, and cross-language consistency checks were carried out to arrive at the final annotations. Results The number of final annotations was 5530. Inter-annotator agreement scores indicate good agreement (median F-score 0.79), and are similar to those between individual annotators and the gold standard. The automatically generated harmonized annotation set for each language performed equally well as the best annotator for that language. Discussion The use of automatic preannotations, harmonized annotations, and parallel corpora helped to keep the manual annotation efforts manageable. The inter-annotator agreement scores provide a reference standard for gauging the performance of automatic annotation techniques. Conclusion To our knowledge, this is the first gold-standard corpus for biomedical concept recognition in languages other than English. Other distinguishing features are the wide variety of semantic groups that are being covered, and the diversity of text genres that were annotated. PMID:25948699
Facial recognition in education system
NASA Astrophysics Data System (ADS)
Krithika, L. B.; Venkatesh, K.; Rathore, S.; Kumar, M. Harish
2017-11-01
Human beings exploit emotions comprehensively for conveying messages and their resolution. Emotion detection and face recognition can provide an interface between the individuals and technologies. The most successful applications of recognition analysis are recognition of faces. Many different techniques have been used to recognize the facial expressions and emotion detection handle varying poses. In this paper, we approach an efficient method to recognize the facial expressions to track face points and distances. This can automatically identify observer face movements and face expression in image. This can capture different aspects of emotion and facial expressions.
Probst, Yasmine; Nguyen, Duc Thanh; Tran, Minh Khoi; Li, Wanqing
2015-07-27
Dietary assessment, while traditionally based on pen-and-paper, is rapidly moving towards automatic approaches. This study describes an Australian automatic food record method and its prototype for dietary assessment via the use of a mobile phone and techniques of image processing and pattern recognition. Common visual features including scale invariant feature transformation (SIFT), local binary patterns (LBP), and colour are used for describing food images. The popular bag-of-words (BoW) model is employed for recognizing the images taken by a mobile phone for dietary assessment. Technical details are provided together with discussions on the issues and future work.
Automatic Speech Recognition in Air Traffic Control: a Human Factors Perspective
NASA Technical Reports Server (NTRS)
Karlsson, Joakim
1990-01-01
The introduction of Automatic Speech Recognition (ASR) technology into the Air Traffic Control (ATC) system has the potential to improve overall safety and efficiency. However, because ASR technology is inherently a part of the man-machine interface between the user and the system, the human factors issues involved must be addressed. Here, some of the human factors problems are identified and related methods of investigation are presented. Research at M.I.T.'s Flight Transportation Laboratory is being conducted from a human factors perspective, focusing on intelligent parser design, presentation of feedback, error correction strategy design, and optimal choice of input modalities.
NASA Astrophysics Data System (ADS)
Scharenborg, Odette; ten Bosch, Louis; Boves, Lou; Norris, Dennis
2003-12-01
This letter evaluates potential benefits of combining human speech recognition (HSR) and automatic speech recognition by building a joint model of an automatic phone recognizer (APR) and a computational model of HSR, viz., Shortlist [Norris, Cognition 52, 189-234 (1994)]. Experiments based on ``real-life'' speech highlight critical limitations posed by some of the simplifying assumptions made in models of human speech recognition. These limitations could be overcome by avoiding hard phone decisions at the output side of the APR, and by using a match between the input and the internal lexicon that flexibly copes with deviations from canonical phonemic representations.
Automatically Detecting Likely Edits in Clinical Notes Created Using Automatic Speech Recognition
Lybarger, Kevin; Ostendorf, Mari; Yetisgen, Meliha
2017-01-01
The use of automatic speech recognition (ASR) to create clinical notes has the potential to reduce costs associated with note creation for electronic medical records, but at current system accuracy levels, post-editing by practitioners is needed to ensure note quality. Aiming to reduce the time required to edit ASR transcripts, this paper investigates novel methods for automatic detection of edit regions within the transcripts, including both putative ASR errors but also regions that are targets for cleanup or rephrasing. We create detection models using logistic regression and conditional random field models, exploring a variety of text-based features that consider the structure of clinical notes and exploit the medical context. Different medical text resources are used to improve feature extraction. Experimental results on a large corpus of practitioner-edited clinical notes show that 67% of sentence-level edits and 45% of word-level edits can be detected with a false detection rate of 15%. PMID:29854187
A Limited-Vocabulary, Multi-Speaker Automatic Isolated Word Recognition System.
ERIC Educational Resources Information Center
Paul, James E., Jr.
Techniques for automatic recognition of isolated words are investigated, and a computer simulation of a word recognition system is effected. Considered in detail are data acquisition and digitizing, word detection, amplitude and time normalization, short-time spectral estimation including spectral windowing, spectral envelope approximation,…
Development of an automated ultrasonic testing system
NASA Astrophysics Data System (ADS)
Shuxiang, Jiao; Wong, Brian Stephen
2005-04-01
Non-Destructive Testing is necessary in areas where defects in structures emerge over time due to wear and tear and structural integrity is necessary to maintain its usability. However, manual testing results in many limitations: high training cost, long training procedure, and worse, the inconsistent test results. A prime objective of this project is to develop an automatic Non-Destructive testing system for a shaft of the wheel axle of a railway carriage. Various methods, such as the neural network, pattern recognition methods and knowledge-based system are used for the artificial intelligence problem. In this paper, a statistical pattern recognition approach, Classification Tree is applied. Before feature selection, a thorough study on the ultrasonic signals produced was carried out. Based on the analysis of the ultrasonic signals, three signal processing methods were developed to enhance the ultrasonic signals: Cross-Correlation, Zero-Phase filter and Averaging. The target of this step is to reduce the noise and make the signal character more distinguishable. Four features: 1. The Auto Regressive Model Coefficients. 2. Standard Deviation. 3. Pearson Correlation 4. Dispersion Uniformity Degree are selected. And then a Classification Tree is created and applied to recognize the peak positions and amplitudes. Searching local maximum is carried out before feature computing. This procedure reduces much computation time in the real-time testing. Based on this algorithm, a software package called SOFRA was developed to recognize the peaks, calibrate automatically and test a simulated shaft automatically. The automatic calibration procedure and the automatic shaft testing procedure are developed.
Chen, Yibing; Ogata, Taiki; Ueyama, Tsuyoshi; Takada, Toshiyuki; Ota, Jun
2018-01-01
Machine vision is playing an increasingly important role in industrial applications, and the automated design of image recognition systems has been a subject of intense research. This study has proposed a system for automatically designing the field-of-view (FOV) of a camera, the illumination strength and the parameters in a recognition algorithm. We formulated the design problem as an optimisation problem and used an experiment based on a hierarchical algorithm to solve it. The evaluation experiments using translucent plastics objects showed that the use of the proposed system resulted in an effective solution with a wide FOV, recognition of all objects and 0.32 mm and 0.4° maximal positional and angular errors when all the RGB (red, green and blue) for illumination and R channel image for recognition were used. Though all the RGB illumination and grey scale images also provided recognition of all the objects, only a narrow FOV was selected. Moreover, full recognition was not achieved by using only G illumination and a grey-scale image. The results showed that the proposed method can automatically design the FOV, illumination and parameters in the recognition algorithm and that tuning all the RGB illumination is desirable even when single-channel or grey-scale images are used for recognition. PMID:29786665
Chen, Yibing; Ogata, Taiki; Ueyama, Tsuyoshi; Takada, Toshiyuki; Ota, Jun
2018-05-22
Machine vision is playing an increasingly important role in industrial applications, and the automated design of image recognition systems has been a subject of intense research. This study has proposed a system for automatically designing the field-of-view (FOV) of a camera, the illumination strength and the parameters in a recognition algorithm. We formulated the design problem as an optimisation problem and used an experiment based on a hierarchical algorithm to solve it. The evaluation experiments using translucent plastics objects showed that the use of the proposed system resulted in an effective solution with a wide FOV, recognition of all objects and 0.32 mm and 0.4° maximal positional and angular errors when all the RGB (red, green and blue) for illumination and R channel image for recognition were used. Though all the RGB illumination and grey scale images also provided recognition of all the objects, only a narrow FOV was selected. Moreover, full recognition was not achieved by using only G illumination and a grey-scale image. The results showed that the proposed method can automatically design the FOV, illumination and parameters in the recognition algorithm and that tuning all the RGB illumination is desirable even when single-channel or grey-scale images are used for recognition.
A Vocal-Based Analytical Method for Goose Behaviour Recognition
Steen, Kim Arild; Therkildsen, Ole Roland; Karstoft, Henrik; Green, Ole
2012-01-01
Since human-wildlife conflicts are increasing, the development of cost-effective methods for reducing damage or conflict levels is important in wildlife management. A wide range of devices to detect and deter animals causing conflict are used for this purpose, although their effectiveness is often highly variable, due to habituation to disruptive or disturbing stimuli. Automated recognition of behaviours could form a critical component of a system capable of altering the disruptive stimuli to avoid this. In this paper we present a novel method to automatically recognise goose behaviour based on vocalisations from flocks of free-living barnacle geese (Branta leucopsis). The geese were observed and recorded in a natural environment, using a shielded shotgun microphone. The classification used Support Vector Machines (SVMs), which had been trained with labeled data. Greenwood Function Cepstral Coefficients (GFCC) were used as features for the pattern recognition algorithm, as they can be adjusted to the hearing capabilities of different species. Three behaviours are classified based in this approach, and the method achieves a good recognition of foraging behaviour (86–97% sensitivity, 89–98% precision) and a reasonable recognition of flushing (79–86%, 66–80%) and landing behaviour(73–91%, 79–92%). The Support Vector Machine has proven to be a robust classifier for this kind of classification, as generality and non-linear capabilities are important. We conclude that vocalisations can be used to automatically detect behaviour of conflict wildlife species, and as such, may be used as an integrated part of a wildlife management system. PMID:22737037
Speaker-Machine Interaction in Automatic Speech Recognition. Technical Report.
ERIC Educational Resources Information Center
Makhoul, John I.
The feasibility and limitations of speaker adaptation in improving the performance of a "fixed" (speaker-independent) automatic speech recognition system were examined. A fixed vocabulary of 55 syllables is used in the recognition system which contains 11 stops and fricatives and five tense vowels. The results of an experiment on speaker…
Corneanu, Ciprian Adrian; Simon, Marc Oliu; Cohn, Jeffrey F; Guerrero, Sergio Escalera
2016-08-01
Facial expressions are an important way through which humans interact socially. Building a system capable of automatically recognizing facial expressions from images and video has been an intense field of study in recent years. Interpreting such expressions remains challenging and much research is needed about the way they relate to human affect. This paper presents a general overview of automatic RGB, 3D, thermal and multimodal facial expression analysis. We define a new taxonomy for the field, encompassing all steps from face detection to facial expression recognition, and describe and classify the state of the art methods accordingly. We also present the important datasets and the bench-marking of most influential methods. We conclude with a general discussion about trends, important questions and future lines of research.
Probst, Yasmine; Nguyen, Duc Thanh; Tran, Minh Khoi; Li, Wanqing
2015-01-01
Dietary assessment, while traditionally based on pen-and-paper, is rapidly moving towards automatic approaches. This study describes an Australian automatic food record method and its prototype for dietary assessment via the use of a mobile phone and techniques of image processing and pattern recognition. Common visual features including scale invariant feature transformation (SIFT), local binary patterns (LBP), and colour are used for describing food images. The popular bag-of-words (BoW) model is employed for recognizing the images taken by a mobile phone for dietary assessment. Technical details are provided together with discussions on the issues and future work. PMID:26225994
The application of automatic recognition techniques in the Apollo 9 SO-65 experiment
NASA Technical Reports Server (NTRS)
Macdonald, R. B.
1970-01-01
A synoptic feature analysis is reported on Apollo 9 remote earth surface photographs that uses the methods of statistical pattern recognition to classify density points and clusterings in digital conversion of optical data. A computer derived geological map of a geological test site indicates that geological features of the range are separable, but that specific rock types are not identifiable.
ATR applications of minimax entropy models of texture and shape
NASA Astrophysics Data System (ADS)
Zhu, Song-Chun; Yuille, Alan L.; Lanterman, Aaron D.
2001-10-01
Concepts from information theory have recently found favor in both the mainstream computer vision community and the military automatic target recognition community. In the computer vision literature, the principles of minimax entropy learning theory have been used to generate rich probabilitistic models of texture and shape. In addition, the method of types and large deviation theory has permitted the difficulty of various texture and shape recognition tasks to be characterized by 'order parameters' that determine how fundamentally vexing a task is, independent of the particular algorithm used. These information-theoretic techniques have been demonstrated using traditional visual imagery in applications such as simulating cheetah skin textures and such as finding roads in aerial imagery. We discuss their application to problems in the specific application domain of automatic target recognition using infrared imagery. We also review recent theoretical and algorithmic developments which permit learning minimax entropy texture models for infrared textures in reasonable timeframes.
Automatic anatomy recognition in post-tonsillectomy MR images of obese children with OSAS
NASA Astrophysics Data System (ADS)
Tong, Yubing; Udupa, Jayaram K.; Odhner, Dewey; Sin, Sanghun; Arens, Raanan
2015-03-01
Automatic Anatomy Recognition (AAR) is a recently developed approach for the automatic whole body wide organ segmentation. We previously tested that methodology on image cases with some pathology where the organs were not distorted significantly. In this paper, we present an advancement of AAR to handle organs which may have been modified or resected by surgical intervention. We focus on MRI of the neck in pediatric Obstructive Sleep Apnea Syndrome (OSAS). The proposed method consists of an AAR step followed by support vector machine techniques to detect the presence/absence of organs. The AAR step employs a hierarchical organization of the organs for model building. For each organ, a fuzzy model over a population is built. The model of the body region is then described in terms of the fuzzy models and a host of other descriptors which include parent to offspring relationship estimated over the population. Organs are recognized following the organ hierarchy by using an optimal threshold based search. The SVM step subsequently checks for evidence of the presence of organs. Experimental results show that AAR techniques can be combined with machine learning strategies within the AAR recognition framework for good performance in recognizing missing organs, in our case missing tonsils in post-tonsillectomy images as well as in simulating tonsillectomy images. The previous recognition performance is maintained achieving an organ localization accuracy of within 1 voxel when the organ is actually not removed. To our knowledge, no methods have been reported to date for handling significantly deformed or missing organs, especially in neck MRI.
Li, Heng; Su, Xiaofan; Wang, Jing; Kan, Han; Han, Tingting; Zeng, Yajie; Chai, Xinyu
2018-01-01
Current retinal prostheses can only generate low-resolution visual percepts constituted of limited phosphenes which are elicited by an electrode array and with uncontrollable color and restricted grayscale. Under this visual perception, prosthetic recipients can just complete some simple visual tasks, but more complex tasks like face identification/object recognition are extremely difficult. Therefore, it is necessary to investigate and apply image processing strategies for optimizing the visual perception of the recipients. This study focuses on recognition of the object of interest employing simulated prosthetic vision. We used a saliency segmentation method based on a biologically plausible graph-based visual saliency model and a grabCut-based self-adaptive-iterative optimization framework to automatically extract foreground objects. Based on this, two image processing strategies, Addition of Separate Pixelization and Background Pixel Shrink, were further utilized to enhance the extracted foreground objects. i) The results showed by verification of psychophysical experiments that under simulated prosthetic vision, both strategies had marked advantages over Direct Pixelization in terms of recognition accuracy and efficiency. ii) We also found that recognition performance under two strategies was tied to the segmentation results and was affected positively by the paired-interrelated objects in the scene. The use of the saliency segmentation method and image processing strategies can automatically extract and enhance foreground objects, and significantly improve object recognition performance towards recipients implanted a high-density implant. Copyright © 2017 Elsevier B.V. All rights reserved.
Emotion Recognition from EEG Signals Using Multidimensional Information in EMD Domain.
Zhuang, Ning; Zeng, Ying; Tong, Li; Zhang, Chi; Zhang, Hanming; Yan, Bin
2017-01-01
This paper introduces a method for feature extraction and emotion recognition based on empirical mode decomposition (EMD). By using EMD, EEG signals are decomposed into Intrinsic Mode Functions (IMFs) automatically. Multidimensional information of IMF is utilized as features, the first difference of time series, the first difference of phase, and the normalized energy. The performance of the proposed method is verified on a publicly available emotional database. The results show that the three features are effective for emotion recognition. The role of each IMF is inquired and we find that high frequency component IMF1 has significant effect on different emotional states detection. The informative electrodes based on EMD strategy are analyzed. In addition, the classification accuracy of the proposed method is compared with several classical techniques, including fractal dimension (FD), sample entropy, differential entropy, and discrete wavelet transform (DWT). Experiment results on DEAP datasets demonstrate that our method can improve emotion recognition performance.
Li, Yanpeng; Li, Xiang; Wang, Hongqiang; Chen, Yiping; Zhuang, Zhaowen; Cheng, Yongqiang; Deng, Bin; Wang, Liandong; Zeng, Yonghu; Gao, Lei
2014-01-01
This paper offers a compacted mechanism to carry out the performance evaluation work for an automatic target recognition (ATR) system: (a) a standard description of the ATR system's output is suggested, a quantity to indicate the operating condition is presented based on the principle of feature extraction in pattern recognition, and a series of indexes to assess the output in different aspects are developed with the application of statistics; (b) performance of the ATR system is interpreted by a quality factor based on knowledge of engineering mathematics; (c) through a novel utility called “context-probability” estimation proposed based on probability, performance prediction for an ATR system is realized. The simulation result shows that the performance of an ATR system can be accounted for and forecasted by the above-mentioned measures. Compared to existing technologies, the novel method can offer more objective performance conclusions for an ATR system. These conclusions may be helpful in knowing the practical capability of the tested ATR system. At the same time, the generalization performance of the proposed method is good. PMID:24967605
Leveraging Automatic Speech Recognition Errors to Detect Challenging Speech Segments in TED Talks
ERIC Educational Resources Information Center
Mirzaei, Maryam Sadat; Meshgi, Kourosh; Kawahara, Tatsuya
2016-01-01
This study investigates the use of Automatic Speech Recognition (ASR) systems to epitomize second language (L2) listeners' problems in perception of TED talks. ASR-generated transcripts of videos often involve recognition errors, which may indicate difficult segments for L2 listeners. This paper aims to discover the root-causes of the ASR errors…
An image-based automatic recognition method for the flowering stage of maize
NASA Astrophysics Data System (ADS)
Yu, Zhenghong; Zhou, Huabing; Li, Cuina
2018-03-01
In this paper, we proposed an image-based approach for automatic recognizing the flowering stage of maize. A modified HOG/SVM detection framework is first adopted to detect the ears of maize. Then, we use low-rank matrix recovery technology to precisely extract the ears at pixel level. At last, a new feature called color gradient histogram, as an indicator, is proposed to determine the flowering stage. Comparing experiment has been carried out to testify the validity of our method and the results indicate that our method can meet the demand for practical observation.
Speech Processing and Recognition (SPaRe)
2011-01-01
results in the areas of automatic speech recognition (ASR), speech processing, machine translation (MT), natural language processing ( NLP ), and...Processing ( NLP ), Information Retrieval (IR) 16. SECURITY CLASSIFICATION OF: UNCLASSIFED 17. LIMITATION OF ABSTRACT 18. NUMBER OF PAGES 19a. NAME...Figure 9, the IOC was only expected to provide document submission and search; automatic speech recognition (ASR) for English, Spanish, Arabic , and
Four-Channel Biosignal Analysis and Feature Extraction for Automatic Emotion Recognition
NASA Astrophysics Data System (ADS)
Kim, Jonghwa; André, Elisabeth
This paper investigates the potential of physiological signals as a reliable channel for automatic recognition of user's emotial state. For the emotion recognition, little attention has been paid so far to physiological signals compared to audio-visual emotion channels such as facial expression or speech. All essential stages of automatic recognition system using biosignals are discussed, from recording physiological dataset up to feature-based multiclass classification. Four-channel biosensors are used to measure electromyogram, electrocardiogram, skin conductivity and respiration changes. A wide range of physiological features from various analysis domains, including time/frequency, entropy, geometric analysis, subband spectra, multiscale entropy, etc., is proposed in order to search the best emotion-relevant features and to correlate them with emotional states. The best features extracted are specified in detail and their effectiveness is proven by emotion recognition results.
Cavalli, Fabio; Lusnig, Luca; Trentin, Edmondo
2017-05-01
Sex determination on skeletal remains is one of the most important diagnosis in forensic cases and in demographic studies on ancient populations. Our purpose is to realize an automatic operator-independent method to determine the sex from the bone shape and to test an intelligent, automatic pattern recognition system in an anthropological domain. Our multiple-classifier system is based exclusively on the morphological variants of a curve that represents the sagittal profile of the calvarium, modeled via artificial neural networks, and yields an accuracy higher than 80 %. The application of this system to other bone profiles is expected to further improve the sensibility of the methodology.
Automatic Recognition of Road Signs
NASA Astrophysics Data System (ADS)
Inoue, Yasuo; Kohashi, Yuuichirou; Ishikawa, Naoto; Nakajima, Masato
2002-11-01
The increase in traffic accidents is becoming a serious social problem with the recent rapid traffic increase. In many cases, the driver"s carelessness is the primary factor of traffic accidents, and the driver assistance system is demanded for supporting driver"s safety. In this research, we propose the new method of automatic detection and recognition of road signs by image processing. The purpose of this research is to prevent accidents caused by driver"s carelessness, and call attention to a driver when the driver violates traffic a regulation. In this research, high accuracy and the efficient sign detecting method are realized by removing unnecessary information except for a road sign from an image, and detect a road sign using shape features. At first, the color information that is not used in road signs is removed from an image. Next, edges except for circular and triangle ones are removed to choose sign shape. In the recognition process, normalized cross correlation operation is carried out to the two-dimensional differentiation pattern of a sign, and the accurate and efficient method for detecting the road sign is realized. Moreover, the real-time operation in a software base was realized by holding down calculation cost, maintaining highly precise sign detection and recognition. Specifically, it becomes specifically possible to process by 0.1 sec(s)/frame using a general-purpose PC (CPU: Pentium4 1.7GHz). As a result of in-vehicle experimentation, our system could process on real time and has confirmed that detection and recognition of a sign could be performed correctly.
Infrared target recognition based on improved joint local ternary pattern
NASA Astrophysics Data System (ADS)
Sun, Junding; Wu, Xiaosheng
2016-05-01
This paper presents a simple, efficient, yet robust approach, named joint orthogonal combination of local ternary pattern, for automatic forward-looking infrared target recognition. It gives more advantages to describe the macroscopic textures and microscopic textures by fusing variety of scales than the traditional LBP-based methods. In addition, it can effectively reduce the feature dimensionality. Further, the rotation invariant and uniform scheme, the robust LTP, and soft concave-convex partition are introduced to enhance its discriminative power. Experimental results demonstrate that the proposed method can achieve competitive results compared with the state-of-the-art methods.
Chun, Hong-Woo; Tsuruoka, Yoshimasa; Kim, Jin-Dong; Shiba, Rie; Nagata, Naoki; Hishiki, Teruyoshi; Tsujii, Jun'ichi
2006-01-01
Background Automatic recognition of relations between a specific disease term and its relevant genes or protein terms is an important practice of bioinformatics. Considering the utility of the results of this approach, we identified prostate cancer and gene terms with the ID tags of public biomedical databases. Moreover, considering that genetics experts will use our results, we classified them based on six topics that can be used to analyze the type of prostate cancers, genes, and their relations. Methods We developed a maximum entropy-based named entity recognizer and a relation recognizer and applied them to a corpus-based approach. We collected prostate cancer-related abstracts from MEDLINE, and constructed an annotated corpus of gene and prostate cancer relations based on six topics by biologists. We used it to train the maximum entropy-based named entity recognizer and relation recognizer. Results Topic-classified relation recognition achieved 92.1% precision for the relation (an increase of 11.0% from that obtained in a baseline experiment). For all topics, the precision was between 67.6 and 88.1%. Conclusion A series of experimental results revealed two important findings: a carefully designed relation recognition system using named entity recognition can improve the performance of relation recognition, and topic-classified relation recognition can be effectively addressed through a corpus-based approach using manual annotation and machine learning techniques. PMID:17134477
CNN based approach for activity recognition using a wrist-worn accelerometer.
Panwar, Madhuri; Dyuthi, S Ram; Chandra Prakash, K; Biswas, Dwaipayan; Acharyya, Amit; Maharatna, Koushik; Gautam, Arvind; Naik, Ganesh R
2017-07-01
In recent years, significant advancements have taken place in human activity recognition using various machine learning approaches. However, feature engineering have dominated conventional methods involving the difficult process of optimal feature selection. This problem has been mitigated by using a novel methodology based on deep learning framework which automatically extracts the useful features and reduces the computational cost. As a proof of concept, we have attempted to design a generalized model for recognition of three fundamental movements of the human forearm performed in daily life where data is collected from four different subjects using a single wrist worn accelerometer sensor. The validation of the proposed model is done with different pre-processing and noisy data condition which is evaluated using three possible methods. The results show that our proposed methodology achieves an average recognition rate of 99.8% as opposed to conventional methods based on K-means clustering, linear discriminant analysis and support vector machine.
Automatic recognition of postural allocations.
Sazonov, Edward; Krishnamurthy, Vidya; Makeyev, Oleksandr; Browning, Ray; Schutz, Yves; Hill, James
2007-01-01
A significant part of daily energy expenditure may be attributed to non-exercise activity thermogenesis and exercise activity thermogenesis. Automatic recognition of postural allocations such as standing or sitting can be used in behavioral modification programs aimed at minimizing static postures. In this paper we propose a shoe-based device and related pattern recognition methodology for recognition of postural allocations. Inexpensive technology allows implementation of this methodology as a part of footwear. The experimental results suggest high efficiency and reliability of the proposed approach.
Automated feature detection and identification in digital point-ordered signals
Oppenlander, Jane E.; Loomis, Kent C.; Brudnoy, David M.; Levy, Arthur J.
1998-01-01
A computer-based automated method to detect and identify features in digital point-ordered signals. The method is used for processing of non-destructive test signals, such as eddy current signals obtained from calibration standards. The signals are first automatically processed to remove noise and to determine a baseline. Next, features are detected in the signals using mathematical morphology filters. Finally, verification of the features is made using an expert system of pattern recognition methods and geometric criteria. The method has the advantage that standard features can be, located without prior knowledge of the number or sequence of the features. Further advantages are that standard features can be differentiated from irrelevant signal features such as noise, and detected features are automatically verified by parameters extracted from the signals. The method proceeds fully automatically without initial operator set-up and without subjective operator feature judgement.
NASA Astrophysics Data System (ADS)
Wang, Jinhu; Lindenbergh, Roderik; Menenti, Massimo
2017-06-01
Urban road environments contain a variety of objects including different types of lamp poles and traffic signs. Its monitoring is traditionally conducted by visual inspection, which is time consuming and expensive. Mobile laser scanning (MLS) systems sample the road environment efficiently by acquiring large and accurate point clouds. This work proposes a methodology for urban road object recognition from MLS point clouds. The proposed method uses, for the first time, shape descriptors of complete objects to match repetitive objects in large point clouds. To do so, a novel 3D multi-scale shape descriptor is introduced, that is embedded in a workflow that efficiently and automatically identifies different types of lamp poles and traffic signs. The workflow starts by tiling the raw point clouds along the scanning trajectory and by identifying non-ground points. After voxelization of the non-ground points, connected voxels are clustered to form candidate objects. For automatic recognition of lamp poles and street signs, a 3D significant eigenvector based shape descriptor using voxels (SigVox) is introduced. The 3D SigVox descriptor is constructed by first subdividing the points with an octree into several levels. Next, significant eigenvectors of the points in each voxel are determined by principal component analysis (PCA) and mapped onto the appropriate triangle of a sphere approximating icosahedron. This step is repeated for different scales. By determining the similarity of 3D SigVox descriptors between candidate point clusters and training objects, street furniture is automatically identified. The feasibility and quality of the proposed method is verified on two point clouds obtained in opposite direction of a stretch of road of 4 km. 6 types of lamp pole and 4 types of road sign were selected as objects of interest. Ground truth validation showed that the overall accuracy of the ∼170 automatically recognized objects is approximately 95%. The results demonstrate that the proposed method is able to recognize street furniture in a practical scenario. Remaining difficult cases are touching objects, like a lamp pole close to a tree.
Digital signal processing algorithms for automatic voice recognition
NASA Technical Reports Server (NTRS)
Botros, Nazeih M.
1987-01-01
The current digital signal analysis algorithms are investigated that are implemented in automatic voice recognition algorithms. Automatic voice recognition means, the capability of a computer to recognize and interact with verbal commands. The digital signal is focused on, rather than the linguistic, analysis of speech signal. Several digital signal processing algorithms are available for voice recognition. Some of these algorithms are: Linear Predictive Coding (LPC), Short-time Fourier Analysis, and Cepstrum Analysis. Among these algorithms, the LPC is the most widely used. This algorithm has short execution time and do not require large memory storage. However, it has several limitations due to the assumptions used to develop it. The other 2 algorithms are frequency domain algorithms with not many assumptions, but they are not widely implemented or investigated. However, with the recent advances in the digital technology, namely signal processors, these 2 frequency domain algorithms may be investigated in order to implement them in voice recognition. This research is concerned with real time, microprocessor based recognition algorithms.
Practical vision based degraded text recognition system
NASA Astrophysics Data System (ADS)
Mohammad, Khader; Agaian, Sos; Saleh, Hani
2011-02-01
Rapid growth and progress in the medical, industrial, security and technology fields means more and more consideration for the use of camera based optical character recognition (OCR) Applying OCR to scanned documents is quite mature, and there are many commercial and research products available on this topic. These products achieve acceptable recognition accuracy and reasonable processing times especially with trained software, and constrained text characteristics. Even though the application space for OCR is huge, it is quite challenging to design a single system that is capable of performing automatic OCR for text embedded in an image irrespective of the application. Challenges for OCR systems include; images are taken under natural real world conditions, Surface curvature, text orientation, font, size, lighting conditions, and noise. These and many other conditions make it extremely difficult to achieve reasonable character recognition. Performance for conventional OCR systems drops dramatically as the degradation level of the text image quality increases. In this paper, a new recognition method is proposed to recognize solid or dotted line degraded characters. The degraded text string is localized and segmented using a new algorithm. The new method was implemented and tested using a development framework system that is capable of performing OCR on camera captured images. The framework allows parameter tuning of the image-processing algorithm based on a training set of camera-captured text images. Novel methods were used for enhancement, text localization and the segmentation algorithm which enables building a custom system that is capable of performing automatic OCR which can be used for different applications. The developed framework system includes: new image enhancement, filtering, and segmentation techniques which enabled higher recognition accuracies, faster processing time, and lower energy consumption, compared with the best state of the art published techniques. The system successfully produced impressive OCR accuracies (90% -to- 93%) using customized systems generated by our development framework in two industrial OCR applications: water bottle label text recognition and concrete slab plate text recognition. The system was also trained for the Arabic language alphabet, and demonstrated extremely high recognition accuracy (99%) for Arabic license name plate text recognition with processing times of 10 seconds. The accuracy and run times of the system were compared to conventional and many states of art methods, the proposed system shows excellent results.
Morphological feature extraction for the classification of digital images of cancerous tissues.
Thiran, J P; Macq, B
1996-10-01
This paper presents a new method for automatic recognition of cancerous tissues from an image of a microscopic section. Based on the shape and the size analysis of the observed cells, this method provides the physician with nonsubjective numerical values for four criteria of malignancy. This automatic approach is based on mathematical morphology, and more specifically on the use of Geodesy. This technique is used first to remove the background noise from the image and then to operate a segmentation of the nuclei of the cells and an analysis of their shape, their size, and their texture. From the values of the extracted criteria, an automatic classification of the image (cancerous or not) is finally operated.
NASA Astrophysics Data System (ADS)
Shuxin, Li; Zhilong, Zhang; Biao, Li
2018-01-01
Plane is an important target category in remote sensing targets and it is of great value to detect the plane targets automatically. As remote imaging technology developing continuously, the resolution of the remote sensing image has been very high and we can get more detailed information for detecting the remote sensing targets automatically. Deep learning network technology is the most advanced technology in image target detection and recognition, which provided great performance improvement in the field of target detection and recognition in the everyday scenes. We combined the technology with the application in the remote sensing target detection and proposed an algorithm with end to end deep network, which can learn from the remote sensing images to detect the targets in the new images automatically and robustly. Our experiments shows that the algorithm can capture the feature information of the plane target and has better performance in target detection with the old methods.
Automatic Recognition of Indoor Navigation Elements from Kinect Point Clouds
NASA Astrophysics Data System (ADS)
Zeng, L.; Kang, Z.
2017-09-01
This paper realizes automatically the navigating elements defined by indoorGML data standard - door, stairway and wall. The data used is indoor 3D point cloud collected by Kinect v2 launched in 2011 through the means of ORB-SLAM. By contrast, it is cheaper and more convenient than lidar, but the point clouds also have the problem of noise, registration error and large data volume. Hence, we adopt a shape descriptor - histogram of distances between two randomly chosen points, proposed by Osada and merges with other descriptor - in conjunction with random forest classifier to recognize the navigation elements (door, stairway and wall) from Kinect point clouds. This research acquires navigation elements and their 3-d location information from each single data frame through segmentation of point clouds, boundary extraction, feature calculation and classification. Finally, this paper utilizes the acquired navigation elements and their information to generate the state data of the indoor navigation module automatically. The experimental results demonstrate a high recognition accuracy of the proposed method.
Automated aural classification used for inter-species discrimination of cetaceans.
Binder, Carolyn M; Hines, Paul C
2014-04-01
Passive acoustic methods are in widespread use to detect and classify cetacean species; however, passive acoustic systems often suffer from large false detection rates resulting from numerous transient sources. To reduce the acoustic analyst workload, automatic recognition methods may be implemented in a two-stage process. First, a general automatic detector is implemented that produces many detections to ensure cetacean presence is noted. Then an automatic classifier is used to significantly reduce the number of false detections and classify the cetacean species. This process requires development of a robust classifier capable of performing inter-species classification. Because human analysts can aurally discriminate species, an automated aural classifier that uses perceptual signal features was tested on a cetacean data set. The classifier successfully discriminated between four species of cetaceans-bowhead, humpback, North Atlantic right, and sperm whales-with 85% accuracy. It also performed well (100% accuracy) for discriminating sperm whale clicks from right whale gunshots. An accuracy of 92% and area under the receiver operating characteristic curve of 0.97 were obtained for the relatively challenging bowhead and humpback recognition case. These results demonstrated that the perceptual features employed by the aural classifier provided powerful discrimination cues for inter-species classification of cetaceans.
AN AUTOMATIC DEVICE FOR READING TYPOGRAPHICAL TEXTS,
permissible. The system represents an attempt to apply the methods of machines designed for typescript reading to machines reading printed texts...Some characteristics by which typescript and typographical material differ are presented. The basic aspects of the recognition algorithm are given. A
Cherry recognition in natural environment based on the vision of picking robot
NASA Astrophysics Data System (ADS)
Zhang, Qirong; Chen, Shanxiong; Yu, Tingzhong; Wang, Yan
2017-04-01
In order to realize the automatic recognition of cherry in the natural environment, this paper designed a robot vision system recognition method. The first step of this method is to pre-process the cherry image by median filtering. The second step is to identify the colour of the cherry through the 0.9R-G colour difference formula, and then use the Otsu algorithm for threshold segmentation. The third step is to remove noise by using the area threshold. The fourth step is to remove the holes in the cherry image by morphological closed and open operation. The fifth step is to obtain the centroid and contour of cherry by using the smallest external rectangular and the Hough transform. Through this recognition process, we can successfully identify 96% of the cherry without blocking and adhesion.
Huo, Guanying
2017-01-01
As a typical deep-learning model, Convolutional Neural Networks (CNNs) can be exploited to automatically extract features from images using the hierarchical structure inspired by mammalian visual system. For image classification tasks, traditional CNN models employ the softmax function for classification. However, owing to the limited capacity of the softmax function, there are some shortcomings of traditional CNN models in image classification. To deal with this problem, a new method combining Biomimetic Pattern Recognition (BPR) with CNNs is proposed for image classification. BPR performs class recognition by a union of geometrical cover sets in a high-dimensional feature space and therefore can overcome some disadvantages of traditional pattern recognition. The proposed method is evaluated on three famous image classification benchmarks, that is, MNIST, AR, and CIFAR-10. The classification accuracies of the proposed method for the three datasets are 99.01%, 98.40%, and 87.11%, respectively, which are much higher in comparison with the other four methods in most cases. PMID:28316614
NASA Astrophysics Data System (ADS)
Sheng, Yehua; Zhang, Ka; Ye, Chun; Liang, Cheng; Li, Jian
2008-04-01
Considering the problem of automatic traffic sign detection and recognition in stereo images captured under motion conditions, a new algorithm for traffic sign detection and recognition based on features and probabilistic neural networks (PNN) is proposed in this paper. Firstly, global statistical color features of left image are computed based on statistics theory. Then for red, yellow and blue traffic signs, left image is segmented to three binary images by self-adaptive color segmentation method. Secondly, gray-value projection and shape analysis are used to confirm traffic sign regions in left image. Then stereo image matching is used to locate the homonymy traffic signs in right image. Thirdly, self-adaptive image segmentation is used to extract binary inner core shapes of detected traffic signs. One-dimensional feature vectors of inner core shapes are computed by central projection transformation. Fourthly, these vectors are input to the trained probabilistic neural networks for traffic sign recognition. Lastly, recognition results in left image are compared with recognition results in right image. If results in stereo images are identical, these results are confirmed as final recognition results. The new algorithm is applied to 220 real images of natural scenes taken by the vehicle-borne mobile photogrammetry system in Nanjing at different time. Experimental results show a detection and recognition rate of over 92%. So the algorithm is not only simple, but also reliable and high-speed on real traffic sign detection and recognition. Furthermore, it can obtain geometrical information of traffic signs at the same time of recognizing their types.
a Two-Step Classification Approach to Distinguishing Similar Objects in Mobile LIDAR Point Clouds
NASA Astrophysics Data System (ADS)
He, H.; Khoshelham, K.; Fraser, C.
2017-09-01
Nowadays, lidar is widely used in cultural heritage documentation, urban modeling, and driverless car technology for its fast and accurate 3D scanning ability. However, full exploitation of the potential of point cloud data for efficient and automatic object recognition remains elusive. Recently, feature-based methods have become very popular in object recognition on account of their good performance in capturing object details. Compared with global features describing the whole shape of the object, local features recording the fractional details are more discriminative and are applicable for object classes with considerable similarity. In this paper, we propose a two-step classification approach based on point feature histograms and the bag-of-features method for automatic recognition of similar objects in mobile lidar point clouds. Lamp post, street light and traffic sign are grouped as one category in the first-step classification for their inter similarity compared with tree and vehicle. A finer classification of the lamp post, street light and traffic sign based on the result of the first-step classification is implemented in the second step. The proposed two-step classification approach is shown to yield a considerable improvement over the conventional one-step classification approach.
Shape and texture fused recognition of flying targets
NASA Astrophysics Data System (ADS)
Kovács, Levente; Utasi, Ákos; Kovács, Andrea; Szirányi, Tamás
2011-06-01
This paper presents visual detection and recognition of flying targets (e.g. planes, missiles) based on automatically extracted shape and object texture information, for application areas like alerting, recognition and tracking. Targets are extracted based on robust background modeling and a novel contour extraction approach, and object recognition is done by comparisons to shape and texture based query results on a previously gathered real life object dataset. Application areas involve passive defense scenarios, including automatic object detection and tracking with cheap commodity hardware components (CPU, camera and GPS).
Ghose, Soumya; Mitra, Jhimli; Karunanithi, Mohan; Dowling, Jason
2015-01-01
Home monitoring of chronically ill or elderly patient can reduce frequent hospitalisations and hence provide improved quality of care at a reduced cost to the community, therefore reducing the burden on the healthcare system. Activity recognition of such patients is of high importance in such a design. In this work, a system for automatic human physical activity recognition from smart-phone inertial sensors data is proposed. An ensemble of decision trees framework is adopted to train and predict the multi-class human activity system. A comparison of our proposed method with a multi-class traditional support vector machine shows significant improvement in activity recognition accuracies.
Robot Command Interface Using an Audio-Visual Speech Recognition System
NASA Astrophysics Data System (ADS)
Ceballos, Alexánder; Gómez, Juan; Prieto, Flavio; Redarce, Tanneguy
In recent years audio-visual speech recognition has emerged as an active field of research thanks to advances in pattern recognition, signal processing and machine vision. Its ultimate goal is to allow human-computer communication using voice, taking into account the visual information contained in the audio-visual speech signal. This document presents a command's automatic recognition system using audio-visual information. The system is expected to control the laparoscopic robot da Vinci. The audio signal is treated using the Mel Frequency Cepstral Coefficients parametrization method. Besides, features based on the points that define the mouth's outer contour according to the MPEG-4 standard are used in order to extract the visual speech information.
Automatic classification of bottles in crates
NASA Astrophysics Data System (ADS)
Aas, Kjersti; Eikvil, Line; Bremnes, Dag; Norbryhn, Andreas
1995-03-01
This paper presents a statistical method for classification of bottles in crates for use in automatic return bottle machines. For the automatons to reimburse the correct deposit, a reliable recognition is important. The images are acquired by a laser range scanner coregistering the distance to the object and the strength of the reflected signal. The objective is to identify the crate and the bottles from a library with a number of legal types. The bottles with significantly different size are separated using quite simple methods, while a more sophisticated recognizer is required to distinguish the more similar bottle types. Good results have been obtained when testing the method developed on bottle types which are difficult to distinguish using simple methods.
Support vector machine-based facial-expression recognition method combining shape and appearance
NASA Astrophysics Data System (ADS)
Han, Eun Jung; Kang, Byung Jun; Park, Kang Ryoung; Lee, Sangyoun
2010-11-01
Facial expression recognition can be widely used for various applications, such as emotion-based human-machine interaction, intelligent robot interfaces, face recognition robust to expression variation, etc. Previous studies have been classified as either shape- or appearance-based recognition. The shape-based method has the disadvantage that the individual variance of facial feature points exists irrespective of similar expressions, which can cause a reduction of the recognition accuracy. The appearance-based method has a limitation in that the textural information of the face is very sensitive to variations in illumination. To overcome these problems, a new facial-expression recognition method is proposed, which combines both shape and appearance information, based on the support vector machine (SVM). This research is novel in the following three ways as compared to previous works. First, the facial feature points are automatically detected by using an active appearance model. From these, the shape-based recognition is performed by using the ratios between the facial feature points based on the facial-action coding system. Second, the SVM, which is trained to recognize the same and different expression classes, is proposed to combine two matching scores obtained from the shape- and appearance-based recognitions. Finally, a single SVM is trained to discriminate four different expressions, such as neutral, a smile, anger, and a scream. By determining the expression of the input facial image whose SVM output is at a minimum, the accuracy of the expression recognition is much enhanced. The experimental results showed that the recognition accuracy of the proposed method was better than previous researches and other fusion methods.
Automatic textual annotation of video news based on semantic visual object extraction
NASA Astrophysics Data System (ADS)
Boujemaa, Nozha; Fleuret, Francois; Gouet, Valerie; Sahbi, Hichem
2003-12-01
In this paper, we present our work for automatic generation of textual metadata based on visual content analysis of video news. We present two methods for semantic object detection and recognition from a cross modal image-text thesaurus. These thesaurus represent a supervised association between models and semantic labels. This paper is concerned with two semantic objects: faces and Tv logos. In the first part, we present our work for efficient face detection and recogniton with automatic name generation. This method allows us also to suggest the textual annotation of shots close-up estimation. On the other hand, we were interested to automatically detect and recognize different Tv logos present on incoming different news from different Tv Channels. This work was done jointly with the French Tv Channel TF1 within the "MediaWorks" project that consists on an hybrid text-image indexing and retrieval plateform for video news.
Liakata, Maria; Saha, Shyamasree; Dobnik, Simon; Batchelor, Colin; Rebholz-Schuhmann, Dietrich
2012-04-01
Scholarly biomedical publications report on the findings of a research investigation. Scientists use a well-established discourse structure to relate their work to the state of the art, express their own motivation and hypotheses and report on their methods, results and conclusions. In previous work, we have proposed ways to explicitly annotate the structure of scientific investigations in scholarly publications. Here we present the means to facilitate automatic access to the scientific discourse of articles by automating the recognition of 11 categories at the sentence level, which we call Core Scientific Concepts (CoreSCs). These include: Hypothesis, Motivation, Goal, Object, Background, Method, Experiment, Model, Observation, Result and Conclusion. CoreSCs provide the structure and context to all statements and relations within an article and their automatic recognition can greatly facilitate biomedical information extraction by characterizing the different types of facts, hypotheses and evidence available in a scientific publication. We have trained and compared machine learning classifiers (support vector machines and conditional random fields) on a corpus of 265 full articles in biochemistry and chemistry to automatically recognize CoreSCs. We have evaluated our automatic classifications against a manually annotated gold standard, and have achieved promising accuracies with 'Experiment', 'Background' and 'Model' being the categories with the highest F1-scores (76%, 62% and 53%, respectively). We have analysed the task of CoreSC annotation both from a sentence classification as well as sequence labelling perspective and we present a detailed feature evaluation. The most discriminative features are local sentence features such as unigrams, bigrams and grammatical dependencies while features encoding the document structure, such as section headings, also play an important role for some of the categories. We discuss the usefulness of automatically generated CoreSCs in two biomedical applications as well as work in progress. A web-based tool for the automatic annotation of articles with CoreSCs and corresponding documentation is available online at http://www.sapientaproject.com/software http://www.sapientaproject.com also contains detailed information pertaining to CoreSC annotation and links to annotation guidelines as well as a corpus of manually annotated articles, which served as our training data. liakata@ebi.ac.uk Supplementary data are available at Bioinformatics online.
Stochastic Modeling as a Means of Automatic Speech Recognition
1975-04-01
companng ihc features of different speech recognition systems, attention is often focused on thc control structures and the methods o’ communication...with no need to use secondary storage . Note that we go from a group of separate knowledge sources to an integrated network representation in...exhaust the available lime or storage . - - - . . 1- .-.-.. mmm^~ i — ■ ■ ’ ■ C haplcr I - IN I ROÜliCl ION Page 13 On the other hand
NASA Astrophysics Data System (ADS)
Li, Y. H.; Shinohara, T.; Satoh, T.; Tachibana, K.
2016-06-01
High-definition and highly accurate road maps are necessary for the realization of automated driving, and road signs are among the most important element in the road map. Therefore, a technique is necessary which can acquire information about all kinds of road signs automatically and efficiently. Due to the continuous technical advancement of Mobile Mapping System (MMS), it has become possible to acquire large number of images and 3d point cloud efficiently with highly precise position information. In this paper, we present an automatic road sign detection and recognition approach utilizing both images and 3D point cloud acquired by MMS. The proposed approach consists of three stages: 1) detection of road signs from images based on their color and shape features using object based image analysis method, 2) filtering out of over detected candidates utilizing size and position information estimated from 3D point cloud, region of candidates and camera information, and 3) road sign recognition using template matching method after shape normalization. The effectiveness of proposed approach was evaluated by testing dataset, acquired from more than 180 km of different types of roads in Japan. The results show a very high success in detection and recognition of road signs, even under the challenging conditions such as discoloration, deformation and in spite of partial occlusions.
Jonnagaddala, Jitendra; Jue, Toni Rose; Chang, Nai-Wen; Dai, Hong-Jie
2016-01-01
The rapidly increasing biomedical literature calls for the need of an automatic approach in the recognition and normalization of disease mentions in order to increase the precision and effectivity of disease based information retrieval. A variety of methods have been proposed to deal with the problem of disease named entity recognition and normalization. Among all the proposed methods, conditional random fields (CRFs) and dictionary lookup method are widely used for named entity recognition and normalization respectively. We herein developed a CRF-based model to allow automated recognition of disease mentions, and studied the effect of various techniques in improving the normalization results based on the dictionary lookup approach. The dataset from the BioCreative V CDR track was used to report the performance of the developed normalization methods and compare with other existing dictionary lookup based normalization methods. The best configuration achieved an F-measure of 0.77 for the disease normalization, which outperformed the best dictionary lookup based baseline method studied in this work by an F-measure of 0.13. Database URL: https://github.com/TCRNBioinformatics/DiseaseExtract PMID:27504009
Concept Recognition in an Automatic Text-Processing System for the Life Sciences.
ERIC Educational Resources Information Center
Vleduts-Stokolov, Natasha
1987-01-01
Describes a system developed for the automatic recognition of biological concepts in titles of scientific articles; reports results of several pilot experiments which tested the system's performance; analyzes typical ambiguity problems encountered by the system; describes a disambiguation technique that was developed; and discusses future plans…
Recognition of upper airway and surrounding structures at MRI in pediatric PCOS and OSAS
NASA Astrophysics Data System (ADS)
Tong, Yubing; Udupa, J. K.; Odhner, D.; Sin, Sanghun; Arens, Raanan
2013-03-01
Obstructive Sleep Apnea Syndrome (OSAS) is common in obese children with risk being 4.5 fold compared to normal control subjects. Polycystic Ovary Syndrome (PCOS) has recently been shown to be associated with OSAS that may further lead to significant cardiovascular and neuro-cognitive deficits. We are investigating image-based biomarkers to understand the architectural and dynamic changes in the upper airway and the surrounding hard and soft tissue structures via MRI in obese teenage children to study OSAS. At the previous SPIE conferences, we presented methods underlying Fuzzy Object Models (FOMs) for Automatic Anatomy Recognition (AAR) based on CT images of the thorax and the abdomen. The purpose of this paper is to demonstrate that the AAR approach is applicable to a different body region and image modality combination, namely in the study of upper airway structures via MRI. FOMs were built hierarchically, the smaller sub-objects forming the offspring of larger parent objects. FOMs encode the uncertainty and variability present in the form and relationships among the objects over a study population. Totally 11 basic objects (17 including composite) were modeled. Automatic recognition for the best pose of FOMs in a given image was implemented by using four methods - a one-shot method that does not require search, another three searching methods that include Fisher Linear Discriminate (FLD), a b-scale energy optimization strategy, and optimum threshold recognition method. In all, 30 multi-fold cross validation experiments based on 15 patient MRI data sets were carried out to assess the accuracy of recognition. The results indicate that the objects can be recognized with an average location error of less than 5 mm or 2-3 voxels. Then the iterative relative fuzzy connectedness (IRFC) algorithm was adopted for delineation of the target organs based on the recognized results. The delineation results showed an overall FP and TP volume fraction of 0.02 and 0.93.
Text recognition and correction for automated data collection by mobile devices
NASA Astrophysics Data System (ADS)
Ozarslan, Suleyman; Eren, P. Erhan
2014-03-01
Participatory sensing is an approach which allows mobile devices such as mobile phones to be used for data collection, analysis and sharing processes by individuals. Data collection is the first and most important part of a participatory sensing system, but it is time consuming for the participants. In this paper, we discuss automatic data collection approaches for reducing the time required for collection, and increasing the amount of collected data. In this context, we explore automated text recognition on images of store receipts which are captured by mobile phone cameras, and the correction of the recognized text. Accordingly, our first goal is to evaluate the performance of the Optical Character Recognition (OCR) method with respect to data collection from store receipt images. Images captured by mobile phones exhibit some typical problems, and common image processing methods cannot handle some of them. Consequently, the second goal is to address these types of problems through our proposed Knowledge Based Correction (KBC) method used in support of the OCR, and also to evaluate the KBC method with respect to the improvement on the accurate recognition rate. Results of the experiments show that the KBC method improves the accurate data recognition rate noticeably.
Localized contourlet features in vehicle make and model recognition
NASA Astrophysics Data System (ADS)
Zafar, I.; Edirisinghe, E. A.; Acar, B. S.
2009-02-01
Automatic vehicle Make and Model Recognition (MMR) systems provide useful performance enhancements to vehicle recognitions systems that are solely based on Automatic Number Plate Recognition (ANPR) systems. Several vehicle MMR systems have been proposed in literature. In parallel to this, the usefulness of multi-resolution based feature analysis techniques leading to efficient object classification algorithms have received close attention from the research community. To this effect, Contourlet transforms that can provide an efficient directional multi-resolution image representation has recently been introduced. Already an attempt has been made in literature to use Curvelet/Contourlet transforms in vehicle MMR. In this paper we propose a novel localized feature detection method in Contourlet transform domain that is capable of increasing the classification rates up to 4%, as compared to the previously proposed Contourlet based vehicle MMR approach in which the features are non-localized and thus results in sub-optimal classification. Further we show that the proposed algorithm can achieve the increased classification accuracy of 96% at significantly lower computational complexity due to the use of Two Dimensional Linear Discriminant Analysis (2DLDA) for dimensionality reduction by preserving the features with high between-class variance and low inter-class variance.
Automatic identification of species with neural networks.
Hernández-Serna, Andrés; Jiménez-Segura, Luz Fernanda
2014-01-01
A new automatic identification system using photographic images has been designed to recognize fish, plant, and butterfly species from Europe and South America. The automatic classification system integrates multiple image processing tools to extract the geometry, morphology, and texture of the images. Artificial neural networks (ANNs) were used as the pattern recognition method. We tested a data set that included 740 species and 11,198 individuals. Our results show that the system performed with high accuracy, reaching 91.65% of true positive fish identifications, 92.87% of plants and 93.25% of butterflies. Our results highlight how the neural networks are complementary to species identification.
Autoregressive statistical pattern recognition algorithms for damage detection in civil structures
NASA Astrophysics Data System (ADS)
Yao, Ruigen; Pakzad, Shamim N.
2012-08-01
Statistical pattern recognition has recently emerged as a promising set of complementary methods to system identification for automatic structural damage assessment. Its essence is to use well-known concepts in statistics for boundary definition of different pattern classes, such as those for damaged and undamaged structures. In this paper, several statistical pattern recognition algorithms using autoregressive models, including statistical control charts and hypothesis testing, are reviewed as potentially competitive damage detection techniques. To enhance the performance of statistical methods, new feature extraction techniques using model spectra and residual autocorrelation, together with resampling-based threshold construction methods, are proposed. Subsequently, simulated acceleration data from a multi degree-of-freedom system is generated to test and compare the efficiency of the existing and proposed algorithms. Data from laboratory experiments conducted on a truss and a large-scale bridge slab model are then used to further validate the damage detection methods and demonstrate the superior performance of proposed algorithms.
Fast title extraction method for business documents
NASA Astrophysics Data System (ADS)
Katsuyama, Yutaka; Naoi, Satoshi
1997-04-01
Conventional electronic document filing systems are inconvenient because the user must specify the keywords in each document for later searches. To solve this problem, automatic keyword extraction methods using natural language processing and character recognition have been developed. However, these methods are slow, especially for japanese documents. To develop a practical electronic document filing system, we focused on the extraction of keyword areas from a document by image processing. Our fast title extraction method can automatically extract titles as keywords from business documents. All character strings are evaluated for similarity by rating points associated with title similarity. We classified these points as four items: character sitting size, position of character strings, relative position among character strings, and string attribution. Finally, the character string that has the highest rating is selected as the title area. The character recognition process is carried out on the selected area. It is fast because this process must recognize a small number of patterns in the restricted area only, and not throughout the entire document. The mean performance of this method is an accuracy of about 91 percent and a 1.8 sec. processing time for an examination of 100 Japanese business documents.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, X; Yang, D
Purpose: To investigate the method to automatically recognize the treatment site in the X-Ray portal images. It could be useful to detect potential treatment errors, and to provide guidance to sequential tasks, e.g. automatically verify the patient daily setup. Methods: The portal images were exported from MOSAIQ as DICOM files, and were 1) processed with a threshold based intensity transformation algorithm to enhance contrast, and 2) where then down-sampled (from 1024×768 to 128×96) by using bi-cubic interpolation algorithm. An appearance-based vector space model (VSM) was used to rearrange the images into vectors. A principal component analysis (PCA) method was usedmore » to reduce the vector dimensions. A multi-class support vector machine (SVM), with radial basis function kernel, was used to build the treatment site recognition models. These models were then used to recognize the treatment sites in the portal image. Portal images of 120 patients were included in the study. The images were selected to cover six treatment sites: brain, head and neck, breast, lung, abdomen and pelvis. Each site had images of the twenty patients. Cross-validation experiments were performed to evaluate the performance. Results: MATLAB image processing Toolbox and scikit-learn (a machine learning library in python) were used to implement the proposed method. The average accuracies using the AP and RT images separately were 95% and 94% respectively. The average accuracy using AP and RT images together was 98%. Computation time was ∼0.16 seconds per patient with AP or RT image, ∼0.33 seconds per patient with both of AP and RT images. Conclusion: The proposed method of treatment site recognition is efficient and accurate. It is not sensitive to the differences of image intensity, size and positions of patients in the portal images. It could be useful for the patient safety assurance. The work was partially supported by a research grant from Varian Medical System.« less
Luo, Jiebo; Boutell, Matthew
2005-05-01
Automatic image orientation detection for natural images is a useful, yet challenging research topic. Humans use scene context and semantic object recognition to identify the correct image orientation. However, it is difficult for a computer to perform the task in the same way because current object recognition algorithms are extremely limited in their scope and robustness. As a result, existing orientation detection methods were built upon low-level vision features such as spatial distributions of color and texture. Discrepant detection rates have been reported for these methods in the literature. We have developed a probabilistic approach to image orientation detection via confidence-based integration of low-level and semantic cues within a Bayesian framework. Our current accuracy is 90 percent for unconstrained consumer photos, impressive given the findings of a psychophysical study conducted recently. The proposed framework is an attempt to bridge the gap between computer and human vision systems and is applicable to other problems involving semantic scene content understanding.
Semi-automatic recognition of marine debris on beaches
NASA Astrophysics Data System (ADS)
Ge, Zhenpeng; Shi, Huahong; Mei, Xuefei; Dai, Zhijun; Li, Daoji
2016-05-01
An increasing amount of anthropogenic marine debris is pervading the earth’s environmental systems, resulting in an enormous threat to living organisms. Additionally, the large amount of marine debris around the world has been investigated mostly through tedious manual methods. Therefore, we propose the use of a new technique, light detection and ranging (LIDAR), for the semi-automatic recognition of marine debris on a beach because of its substantially more efficient role in comparison with other more laborious methods. Our results revealed that LIDAR should be used for the classification of marine debris into plastic, paper, cloth and metal. Additionally, we reconstructed a 3-dimensional model of different types of debris on a beach with a high validity of debris revivification using LIDAR-based individual separation. These findings demonstrate that the availability of this new technique enables detailed observations to be made of debris on a large beach that was previously not possible. It is strongly suggested that LIDAR could be implemented as an appropriate monitoring tool for marine debris by global researchers and governments.
Fuzzy support vector machines for adaptive Morse code recognition.
Yang, Cheng-Hong; Jin, Li-Cheng; Chuang, Li-Yeh
2006-11-01
Morse code is now being harnessed for use in rehabilitation applications of augmentative-alternative communication and assistive technology, facilitating mobility, environmental control and adapted worksite access. In this paper, Morse code is selected as a communication adaptive device for persons who suffer from muscle atrophy, cerebral palsy or other severe handicaps. A stable typing rate is strictly required for Morse code to be effective as a communication tool. Therefore, an adaptive automatic recognition method with a high recognition rate is needed. The proposed system uses both fuzzy support vector machines and the variable-degree variable-step-size least-mean-square algorithm to achieve these objectives. We apply fuzzy memberships to each point, and provide different contributions to the decision learning function for support vector machines. Statistical analyses demonstrated that the proposed method elicited a higher recognition rate than other algorithms in the literature.
Fuzzy Logic-Based Audio Pattern Recognition
NASA Astrophysics Data System (ADS)
Malcangi, M.
2008-11-01
Audio and audio-pattern recognition is becoming one of the most important technologies to automatically control embedded systems. Fuzzy logic may be the most important enabling methodology due to its ability to rapidly and economically model such application. An audio and audio-pattern recognition engine based on fuzzy logic has been developed for use in very low-cost and deeply embedded systems to automate human-to-machine and machine-to-machine interaction. This engine consists of simple digital signal-processing algorithms for feature extraction and normalization, and a set of pattern-recognition rules manually tuned or automatically tuned by a self-learning process.
Automatic violence detection in digital movies
NASA Astrophysics Data System (ADS)
Fischer, Stephan
1996-11-01
Research on computer-based recognition of violence is scant. We are working on the automatic recognition of violence in digital movies, a first step towards the goal of a computer- assisted system capable of protecting children against TV programs containing a great deal of violence. In the video domain a collision detection and a model-mapping to locate human figures are run, while the creation and comparison of fingerprints to find certain events are run int he audio domain. This article centers on the recognition of fist- fights in the video domain and on the recognition of shots, explosions and cries in the audio domain.
ERIC Educational Resources Information Center
Chen, Howard Hao-Jan
2011-01-01
Oral communication ability has become increasingly important to many EFL students. Several commercial software programs based on automatic speech recognition (ASR) technologies are available but their prices are not affordable for many students. This paper will demonstrate how the Microsoft Speech Application Software Development Kit (SASDK), a…
Automatic speech recognition in air traffic control
NASA Technical Reports Server (NTRS)
Karlsson, Joakim
1990-01-01
Automatic Speech Recognition (ASR) technology and its application to the Air Traffic Control system are described. The advantages of applying ASR to Air Traffic Control, as well as criteria for choosing a suitable ASR system are presented. Results from previous research and directions for future work at the Flight Transportation Laboratory are outlined.
Automatic Speech Recognition: Reliability and Pedagogical Implications for Teaching Pronunciation
ERIC Educational Resources Information Center
Kim, In-Seok
2006-01-01
This study examines the reliability of automatic speech recognition (ASR) software used to teach English pronunciation, focusing on one particular piece of software, "FluSpeak, as a typical example." Thirty-six Korean English as a Foreign Language (EFL) college students participated in an experiment in which they listened to 15 sentences…
Automatic Speech Recognition Technology as an Effective Means for Teaching Pronunciation
ERIC Educational Resources Information Center
Elimat, Amal Khalil; AbuSeileek, Ali Farhan
2014-01-01
This study aimed to explore the effect of using automatic speech recognition technology (ASR) on the third grade EFL students' performance in pronunciation, whether teaching pronunciation through ASR is better than regular instruction, and the most effective teaching technique (individual work, pair work, or group work) in teaching pronunciation…
Automatization and Orthographic Development in Second Language Visual Word Recognition
ERIC Educational Resources Information Center
Kida, Shusaku
2016-01-01
The present study investigated second language (L2) learners' acquisition of automatic word recognition and the development of L2 orthographic representation in the mental lexicon. Participants in the study were Japanese university students enrolled in a compulsory course involving a weekly 30-minute sustained silent reading (SSR) activity with…
Evaluating Automatic Speech Recognition-Based Language Learning Systems: A Case Study
ERIC Educational Resources Information Center
van Doremalen, Joost; Boves, Lou; Colpaert, Jozef; Cucchiarini, Catia; Strik, Helmer
2016-01-01
The purpose of this research was to evaluate a prototype of an automatic speech recognition (ASR)-based language learning system that provides feedback on different aspects of speaking performance (pronunciation, morphology and syntax) to students of Dutch as a second language. We carried out usability reviews, expert reviews and user tests to…
Toward End-to-End Face Recognition Through Alignment Learning
NASA Astrophysics Data System (ADS)
Zhong, Yuanyi; Chen, Jiansheng; Huang, Bo
2017-08-01
Plenty of effective methods have been proposed for face recognition during the past decade. Although these methods differ essentially in many aspects, a common practice of them is to specifically align the facial area based on the prior knowledge of human face structure before feature extraction. In most systems, the face alignment module is implemented independently. This has actually caused difficulties in the designing and training of end-to-end face recognition models. In this paper we study the possibility of alignment learning in end-to-end face recognition, in which neither prior knowledge on facial landmarks nor artificially defined geometric transformations are required. Specifically, spatial transformer layers are inserted in front of the feature extraction layers in a Convolutional Neural Network (CNN) for face recognition. Only human identity clues are used for driving the neural network to automatically learn the most suitable geometric transformation and the most appropriate facial area for the recognition task. To ensure reproducibility, our model is trained purely on the publicly available CASIA-WebFace dataset, and is tested on the Labeled Face in the Wild (LFW) dataset. We have achieved a verification accuracy of 99.08\\% which is comparable to state-of-the-art single model based methods.
NASA Astrophysics Data System (ADS)
Hachaj, Tomasz; Ogiela, Marek R.
2014-09-01
Gesture Description Language (GDL) is a classifier that enables syntactic description and real time recognition of full-body gestures and movements. Gestures are described in dedicated computer language named Gesture Description Language script (GDLs). In this paper we will introduce new GDLs formalisms that enable recognition of selected classes of movement trajectories. The second novelty is new unsupervised learning method with which it is possible to automatically generate GDLs descriptions. We have initially evaluated both proposed extensions of GDL and we have obtained very promising results. Both the novel methodology and evaluation results will be described in this paper.
Automated Detection of Stereotypical Motor Movements
ERIC Educational Resources Information Center
Goodwin, Matthew S.; Intille, Stephen S.; Albinali, Fahd; Velicer, Wayne F.
2011-01-01
To overcome problems with traditional methods for measuring stereotypical motor movements in persons with Autism Spectrum Disorders (ASD), we evaluated the use of wireless three-axis accelerometers and pattern recognition algorithms to automatically detect body rocking and hand flapping in children with ASD. Findings revealed that, on average,…
A probabilistic union model with automatic order selection for noisy speech recognition.
Jancovic, P; Ming, J
2001-09-01
A critical issue in exploiting the potential of the sub-band-based approach to robust speech recognition is the method of combining the sub-band observations, for selecting the bands unaffected by noise. A new method for this purpose, i.e., the probabilistic union model, was recently introduced. This model has been shown to be capable of dealing with band-limited corruption, requiring no knowledge about the band position and statistical distribution of the noise. A parameter within the model, which we call its order, gives the best results when it equals the number of noisy bands. Since this information may not be available in practice, in this paper we introduce an automatic algorithm for selecting the order, based on the state duration pattern generated by the hidden Markov model (HMM). The algorithm has been tested on the TIDIGITS database corrupted by various types of additive band-limited noise with unknown noisy bands. The results have shown that the union model equipped with the new algorithm can achieve a recognition performance similar to that achieved when the number of noisy bands is known. The results show a very significant improvement over the traditional full-band model, without requiring prior information on either the position or the number of noisy bands. The principle of the algorithm for selecting the order based on state duration may also be applied to other sub-band combination methods.
Deep Learning Methods for Underwater Target Feature Extraction and Recognition
Peng, Yuan; Qiu, Mengran; Shi, Jianfei; Liu, Liangliang
2018-01-01
The classification and recognition technology of underwater acoustic signal were always an important research content in the field of underwater acoustic signal processing. Currently, wavelet transform, Hilbert-Huang transform, and Mel frequency cepstral coefficients are used as a method of underwater acoustic signal feature extraction. In this paper, a method for feature extraction and identification of underwater noise data based on CNN and ELM is proposed. An automatic feature extraction method of underwater acoustic signals is proposed using depth convolution network. An underwater target recognition classifier is based on extreme learning machine. Although convolution neural networks can execute both feature extraction and classification, their function mainly relies on a full connection layer, which is trained by gradient descent-based; the generalization ability is limited and suboptimal, so an extreme learning machine (ELM) was used in classification stage. Firstly, CNN learns deep and robust features, followed by the removing of the fully connected layers. Then ELM fed with the CNN features is used as the classifier to conduct an excellent classification. Experiments on the actual data set of civil ships obtained 93.04% recognition rate; compared to the traditional Mel frequency cepstral coefficients and Hilbert-Huang feature, recognition rate greatly improved. PMID:29780407
Transfer learning for bimodal biometrics recognition
NASA Astrophysics Data System (ADS)
Dan, Zhiping; Sun, Shuifa; Chen, Yanfei; Gan, Haitao
2013-10-01
Biometrics recognition aims to identify and predict new personal identities based on their existing knowledge. As the use of multiple biometric traits of the individual may enables more information to be used for recognition, it has been proved that multi-biometrics can produce higher accuracy than single biometrics. However, a common problem with traditional machine learning is that the training and test data should be in the same feature space, and have the same underlying distribution. If the distributions and features are different between training and future data, the model performance often drops. In this paper, we propose a transfer learning method for face recognition on bimodal biometrics. The training and test samples of bimodal biometric images are composed of the visible light face images and the infrared face images. Our algorithm transfers the knowledge across feature spaces, relaxing the assumption of same feature space as well as same underlying distribution by automatically learning a mapping between two different but somewhat similar face images. According to the experiments in the face images, the results show that the accuracy of face recognition has been greatly improved by the proposed method compared with the other previous methods. It demonstrates the effectiveness and robustness of our method.
3D automatic anatomy segmentation based on iterative graph-cut-ASM.
Chen, Xinjian; Bagci, Ulas
2011-08-01
This paper studies the feasibility of developing an automatic anatomy segmentation (AAS) system in clinical radiology and demonstrates its operation on clinical 3D images. The AAS system, the authors are developing consists of two main parts: object recognition and object delineation. As for recognition, a hierarchical 3D scale-based multiobject method is used for the multiobject recognition task, which incorporates intensity weighted ball-scale (b-scale) information into the active shape model (ASM). For object delineation, an iterative graph-cut-ASM (IGCASM) algorithm is proposed, which effectively combines the rich statistical shape information embodied in ASM with the globally optimal delineation capability of the GC method. The presented IGCASM algorithm is a 3D generalization of the 2D GC-ASM method that they proposed previously in Chen et al. [Proc. SPIE, 7259, 72590C1-72590C-8 (2009)]. The proposed methods are tested on two datasets comprised of images obtained from 20 patients (10 male and 10 female) of clinical abdominal CT scans, and 11 foot magnetic resonance imaging (MRI) scans. The test is for four organs (liver, left and right kidneys, and spleen) segmentation, five foot bones (calcaneus, tibia, cuboid, talus, and navicular). The recognition and delineation accuracies were evaluated separately. The recognition accuracy was evaluated in terms of translation, rotation, and scale (size) error. The delineation accuracy was evaluated in terms of true and false positive volume fractions (TPVF, FPVF). The efficiency of the delineation method was also evaluated on an Intel Pentium IV PC with a 3.4 GHZ CPU machine. The recognition accuracies in terms of translation, rotation, and scale error over all organs are about 8 mm, 10 degrees and 0.03, and over all foot bones are about 3.5709 mm, 0.35 degrees and 0.025, respectively. The accuracy of delineation over all organs for all subjects as expressed in TPVF and FPVF is 93.01% and 0.22%, and all foot bones for all subjects are 93.75% and 0.28%, respectively. While the delineations for the four organs can be accomplished quite rapidly with average of 78 s, the delineations for the five foot bones can be accomplished with average of 70 s. The experimental results showed the feasibility and efficacy of the proposed automatic anatomy segmentation system: (a) the incorporation of shape priors into the GC framework is feasible in 3D as demonstrated previously for 2D images; (b) our results in 3D confirm the accuracy behavior observed in 2D. The hybrid strategy IGCASM seems to be more robust and accurate than ASM and GC individually; and (c) delineations within body regions and foot bones of clinical importance can be accomplished quite rapidly within 1.5 min.
Cheng, Yezeng; Larin, Kirill V
2006-12-20
Fingerprint recognition is one of the most widely used methods of biometrics. This method relies on the surface topography of a finger and, thus, is potentially vulnerable for spoofing by artificial dummies with embedded fingerprints. In this study, we applied the optical coherence tomography (OCT) technique to distinguish artificial materials commonly used for spoofing fingerprint scanning systems from the real skin. Several artificial fingerprint dummies made from household cement and liquid silicone rubber were prepared and tested using a commercial fingerprint reader and an OCT system. While the artificial fingerprints easily spoofed the commercial fingerprint reader, OCT images revealed the presence of them at all times. We also demonstrated that an autocorrelation analysis of the OCT images could be potentially used in automatic recognition systems.
NASA Astrophysics Data System (ADS)
Cheng, Yezeng; Larin, Kirill V.
2006-12-01
Fingerprint recognition is one of the most widely used methods of biometrics. This method relies on the surface topography of a finger and, thus, is potentially vulnerable for spoofing by artificial dummies with embedded fingerprints. In this study, we applied the optical coherence tomography (OCT) technique to distinguish artificial materials commonly used for spoofing fingerprint scanning systems from the real skin. Several artificial fingerprint dummies made from household cement and liquid silicone rubber were prepared and tested using a commercial fingerprint reader and an OCT system. While the artificial fingerprints easily spoofed the commercial fingerprint reader, OCT images revealed the presence of them at all times. We also demonstrated that an autocorrelation analysis of the OCT images could be potentially used in automatic recognition systems.
Research on application of LADAR in ground vehicle recognition
NASA Astrophysics Data System (ADS)
Lan, Jinhui; Shen, Zhuoxun
2009-11-01
For the requirement of many practical applications in the field of military, the research of 3D target recognition is active. The representation that captures the salient attributes of a 3D target independent of the viewing angle will be especially useful to the automatic 3D target recognition system. This paper presents a new approach of image generation based on Laser Detection and Ranging (LADAR) data. Range image of target is obtained by transformation of point cloud. In order to extract features of different ground vehicle targets and to recognize targets, zernike moment properties of typical ground vehicle targets are researched in this paper. A technique of support vector machine is applied to the classification and recognition of target. The new method of image generation and feature representation has been applied to the outdoor experiments. Through outdoor experiments, it can be proven that the method of image generation is stability, the moments are effective to be used as features for recognition, and the LADAR can be applied to the field of 3D target recognition.
A fast automatic recognition and location algorithm for fetal genital organs in ultrasound images.
Tang, Sheng; Chen, Si-ping
2009-09-01
Severe sex ratio imbalance at birth is now becoming an important issue in several Asian countries. Its leading immediate cause is prenatal sex-selective abortion following illegal sex identification by ultrasound scanning. In this paper, a fast automatic recognition and location algorithm for fetal genital organs is proposed as an effective method to help prevent ultrasound technicians from unethically and illegally identifying the sex of the fetus. This automatic recognition algorithm can be divided into two stages. In the 'rough' stage, a few pixels in the image, which are likely to represent the genital organs, are automatically chosen as points of interest (POIs) according to certain salient characteristics of fetal genital organs. In the 'fine' stage, a specifically supervised learning framework, which fuses an effective feature data preprocessing mechanism into the multiple classifier architecture, is applied to every POI. The basic classifiers in the framework are selected from three widely used classifiers: radial basis function network, backpropagation network, and support vector machine. The classification results of all the POIs are then synthesized to determine whether the fetal genital organ is present in the image, and to locate the genital organ within the positive image. Experiments were designed and carried out based on an image dataset comprising 658 positive images (images with fetal genital organs) and 500 negative images (images without fetal genital organs). The experimental results showed true positive (TP) and true negative (TN) results from 80.5% (265 from 329) and 83.0% (415 from 500) of samples, respectively. The average computation time was 453 ms per image.
Management of natural resources through automatic cartographic inventory. [France
NASA Technical Reports Server (NTRS)
Rey, P.; Gourinard, Y.; Cambou, F. (Principal Investigator)
1974-01-01
The author has identified the following significant results. (1) Accurate recognition of previously known ground features from ERTS-1 imagery has been confirmed and a probable detection range for the major signatures can be given. (2) Unidentified elements, however, must be decoded by means of the equal densitometric value zone method. (3) Determination of these zonings involves an analogical treatment of images using the color equidensity methods (pseudo-color), color composites and especially temporal color composite (repetitive superposition). (4) After this analogical preparation, the digital equidensities can be processed by computer in the four MSS bands, according to a series of transfer operations from imagery and automatic cartography.
Tóth, László; Hoffmann, Ildikó; Gosztolya, Gábor; Vincze, Veronika; Szatlóczki, Gréta; Bánréti, Zoltán; Pákáski, Magdolna; Kálmán, János
2018-01-01
Background: Even today the reliable diagnosis of the prodromal stages of Alzheimer’s disease (AD) remains a great challenge. Our research focuses on the earliest detectable indicators of cognitive de-cline in mild cognitive impairment (MCI). Since the presence of language impairment has been reported even in the mild stage of AD, the aim of this study is to develop a sensitive neuropsychological screening method which is based on the analysis of spontaneous speech production during performing a memory task. In the future, this can form the basis of an Internet-based interactive screening software for the recognition of MCI. Methods: Participants were 38 healthy controls and 48 clinically diagnosed MCI patients. The provoked spontaneous speech by asking the patients to recall the content of 2 short black and white films (one direct, one delayed), and by answering one question. Acoustic parameters (hesitation ratio, speech tempo, length and number of silent and filled pauses, length of utterance) were extracted from the recorded speech sig-nals, first manually (using the Praat software), and then automatically, with an automatic speech recogni-tion (ASR) based tool. First, the extracted parameters were statistically analyzed. Then we applied machine learning algorithms to see whether the MCI and the control group can be discriminated automatically based on the acoustic features. Results: The statistical analysis showed significant differences for most of the acoustic parameters (speech tempo, articulation rate, silent pause, hesitation ratio, length of utterance, pause-per-utterance ratio). The most significant differences between the two groups were found in the speech tempo in the delayed recall task, and in the number of pauses for the question-answering task. The fully automated version of the analysis process – that is, using the ASR-based features in combination with machine learning - was able to separate the two classes with an F1-score of 78.8%. Conclusion: The temporal analysis of spontaneous speech can be exploited in implementing a new, auto-matic detection-based tool for screening MCI for the community. PMID:29165085
ERIC Educational Resources Information Center
Sidgi, Lina Fathi Sidig; Shaari, Ahmad Jelani
2017-01-01
The use of technology, such as computer-assisted language learning (CALL), is used in teaching and learning in the foreign language classrooms where it is most needed. One promising emerging technology that supports language learning is automatic speech recognition (ASR). Integrating such technology, especially in the instruction of pronunciation…
ERIC Educational Resources Information Center
Mayer, Andreas; Motsch, Hans-Joachim
2015-01-01
This study analysed the effects of a classroom intervention focusing on phonological awareness and/or automatized word recognition in children with a deficit in the domains of phonological awareness and rapid automatized naming ("double deficit"). According to the double-deficit hypothesis (Wolf & Bowers, 1999), these children belong…
Using Automatic Speech Recognition Technology with Elicited Oral Response Testing
ERIC Educational Resources Information Center
Cox, Troy L.; Davies, Randall S.
2012-01-01
This study examined the use of automatic speech recognition (ASR) scored elicited oral response (EOR) tests to assess the speaking ability of English language learners. It also examined the relationship between ASR-scored EOR and other language proficiency measures and the ability of the ASR to rate speakers without bias to gender or native…
Open Dataset for the Automatic Recognition of Sedentary Behaviors.
Possos, William; Cruz, Robinson; Cerón, Jesús D; López, Diego M; Sierra-Torres, Carlos H
2017-01-01
Sedentarism is associated with the development of noncommunicable diseases (NCD) such as cardiovascular diseases (CVD), type 2 diabetes, and cancer. Therefore, the identification of specific sedentary behaviors (TV viewing, sitting at work, driving, relaxing, etc.) is especially relevant for planning personalized prevention programs. To build and evaluate a public a dataset for the automatic recognition (classification) of sedentary behaviors. The dataset included data from 30 subjects, who performed 23 sedentary behaviors while wearing a commercial wearable on the wrist, a smartphone on the hip and another in the thigh. Bluetooth Low Energy (BLE) beacons were used in order to improve the automatic classification of different sedentary behaviors. The study also compared six well know data mining classification techniques in order to identify the more precise method of solving the classification problem of the 23 defined behaviors. A better classification accuracy was obtained using the Random Forest algorithm and when data were collected from the phone on the hip. Furthermore, the use of beacons as a reference for obtaining the symbolic location of the individual improved the precision of the classification.
Retina vascular network recognition
NASA Astrophysics Data System (ADS)
Tascini, Guido; Passerini, Giorgio; Puliti, Paolo; Zingaretti, Primo
1993-09-01
The analysis of morphological and structural modifications of the retina vascular network is an interesting investigation method in the study of diabetes and hypertension. Normally this analysis is carried out by qualitative evaluations, according to standardized criteria, though medical research attaches great importance to quantitative analysis of vessel color, shape and dimensions. The paper describes a system which automatically segments and recognizes the ocular fundus circulation and micro circulation network, and extracts a set of features related to morphometric aspects of vessels. For this class of images the classical segmentation methods seem weak. We propose a computer vision system in which segmentation and recognition phases are strictly connected. The system is hierarchically organized in four modules. Firstly the Image Enhancement Module (IEM) operates a set of custom image enhancements to remove blur and to prepare data for subsequent segmentation and recognition processes. Secondly the Papilla Border Analysis Module (PBAM) automatically recognizes number, position and local diameter of blood vessels departing from optical papilla. Then the Vessel Tracking Module (VTM) analyses vessels comparing the results of body and edge tracking and detects branches and crossings. Finally the Feature Extraction Module evaluates PBAM and VTM output data and extracts some numerical indexes. Used algorithms appear to be robust and have been successfully tested on various ocular fundus images.
Face Recognition From One Example View.
1995-09-01
Proceedings, International Workshop on Automatic Face- and Gesture-Recognition, pages 248{253, Zurich, 1995. [32] Yael Moses, Shimon Ullman, and Shimon...recognition. Journal of Cognitive Neuroscience, 3(1):71{86, 1991. [49] Shimon Ullman and Ronen Basri. Recognition by linear combinations of models
Hierarchical Recognition Scheme for Human Facial Expression Recognition Systems
Siddiqi, Muhammad Hameed; Lee, Sungyoung; Lee, Young-Koo; Khan, Adil Mehmood; Truc, Phan Tran Ho
2013-01-01
Over the last decade, human facial expressions recognition (FER) has emerged as an important research area. Several factors make FER a challenging research problem. These include varying light conditions in training and test images; need for automatic and accurate face detection before feature extraction; and high similarity among different expressions that makes it difficult to distinguish these expressions with a high accuracy. This work implements a hierarchical linear discriminant analysis-based facial expressions recognition (HL-FER) system to tackle these problems. Unlike the previous systems, the HL-FER uses a pre-processing step to eliminate light effects, incorporates a new automatic face detection scheme, employs methods to extract both global and local features, and utilizes a HL-FER to overcome the problem of high similarity among different expressions. Unlike most of the previous works that were evaluated using a single dataset, the performance of the HL-FER is assessed using three publicly available datasets under three different experimental settings: n-fold cross validation based on subjects for each dataset separately; n-fold cross validation rule based on datasets; and, finally, a last set of experiments to assess the effectiveness of each module of the HL-FER separately. Weighted average recognition accuracy of 98.7% across three different datasets, using three classifiers, indicates the success of employing the HL-FER for human FER. PMID:24316568
Automatic Target Recognition Based on Cross-Plot
Wong, Kelvin Kian Loong; Abbott, Derek
2011-01-01
Automatic target recognition that relies on rapid feature extraction of real-time target from photo-realistic imaging will enable efficient identification of target patterns. To achieve this objective, Cross-plots of binary patterns are explored as potential signatures for the observed target by high-speed capture of the crucial spatial features using minimal computational resources. Target recognition was implemented based on the proposed pattern recognition concept and tested rigorously for its precision and recall performance. We conclude that Cross-plotting is able to produce a digital fingerprint of a target that correlates efficiently and effectively to signatures of patterns having its identity in a target repository. PMID:21980508
Presentation video retrieval using automatically recovered slide and spoken text
NASA Astrophysics Data System (ADS)
Cooper, Matthew
2013-03-01
Video is becoming a prevalent medium for e-learning. Lecture videos contain text information in both the presentation slides and lecturer's speech. This paper examines the relative utility of automatically recovered text from these sources for lecture video retrieval. To extract the visual information, we automatically detect slides within the videos and apply optical character recognition to obtain their text. Automatic speech recognition is used similarly to extract spoken text from the recorded audio. We perform controlled experiments with manually created ground truth for both the slide and spoken text from more than 60 hours of lecture video. We compare the automatically extracted slide and spoken text in terms of accuracy relative to ground truth, overlap with one another, and utility for video retrieval. Results reveal that automatically recovered slide text and spoken text contain different content with varying error profiles. Experiments demonstrate that automatically extracted slide text enables higher precision video retrieval than automatically recovered spoken text.
AstroCV: Astronomy computer vision library
NASA Astrophysics Data System (ADS)
González, Roberto E.; Muñoz, Roberto P.; Hernández, Cristian A.
2018-04-01
AstroCV processes and analyzes big astronomical datasets, and is intended to provide a community repository of high performance Python and C++ algorithms used for image processing and computer vision. The library offers methods for object recognition, segmentation and classification, with emphasis in the automatic detection and classification of galaxies.
Bayesian Methods and Confidence Intervals for Automatic Target Recognition of SAR Canonical Shapes
2014-03-27
and DirectX [22]. The CUDA platform was developed by the NVIDIA Corporation to allow programmers access to the computational capabilities of the...were used for the intense repetitive computations. Developing CUDA software requires writing code for specialized compilers provided by NVIDIA and
Image-based automatic recognition of larvae
NASA Astrophysics Data System (ADS)
Sang, Ru; Yu, Guiying; Fan, Weijun; Guo, Tiantai
2010-08-01
As the main objects, imagoes have been researched in quarantine pest recognition in these days. However, pests in their larval stage are latent, and the larvae spread abroad much easily with the circulation of agricultural and forest products. It is presented in this paper that, as the new research objects, larvae are recognized by means of machine vision, image processing and pattern recognition. More visional information is reserved and the recognition rate is improved as color image segmentation is applied to images of larvae. Along with the characteristics of affine invariance, perspective invariance and brightness invariance, scale invariant feature transform (SIFT) is adopted for the feature extraction. The neural network algorithm is utilized for pattern recognition, and the automatic identification of larvae images is successfully achieved with satisfactory results.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, H; Tan, J; Kavanaugh, J
Purpose: Radiotherapy (RT) contours delineated either manually or semiautomatically require verification before clinical usage. Manual evaluation is very time consuming. A new integrated software tool using supervised pattern contour recognition was thus developed to facilitate this process. Methods: The contouring tool was developed using an object-oriented programming language C# and application programming interfaces, e.g. visualization toolkit (VTK). The C# language served as the tool design basis. The Accord.Net scientific computing libraries were utilized for the required statistical data processing and pattern recognition, while the VTK was used to build and render 3-D mesh models from critical RT structures in real-timemore » and 360° visualization. Principal component analysis (PCA) was used for system self-updating geometry variations of normal structures based on physician-approved RT contours as a training dataset. The inhouse design of supervised PCA-based contour recognition method was used for automatically evaluating contour normality/abnormality. The function for reporting the contour evaluation results was implemented by using C# and Windows Form Designer. Results: The software input was RT simulation images and RT structures from commercial clinical treatment planning systems. Several abilities were demonstrated: automatic assessment of RT contours, file loading/saving of various modality medical images and RT contours, and generation/visualization of 3-D images and anatomical models. Moreover, it supported the 360° rendering of the RT structures in a multi-slice view, which allows physicians to visually check and edit abnormally contoured structures. Conclusion: This new software integrates the supervised learning framework with image processing and graphical visualization modules for RT contour verification. This tool has great potential for facilitating treatment planning with the assistance of an automatic contour evaluation module in avoiding unnecessary manual verification for physicians/dosimetrists. In addition, its nature as a compact and stand-alone tool allows for future extensibility to include additional functions for physicians’ clinical needs.« less
Jonnagaddala, Jitendra; Jue, Toni Rose; Chang, Nai-Wen; Dai, Hong-Jie
2016-01-01
The rapidly increasing biomedical literature calls for the need of an automatic approach in the recognition and normalization of disease mentions in order to increase the precision and effectivity of disease based information retrieval. A variety of methods have been proposed to deal with the problem of disease named entity recognition and normalization. Among all the proposed methods, conditional random fields (CRFs) and dictionary lookup method are widely used for named entity recognition and normalization respectively. We herein developed a CRF-based model to allow automated recognition of disease mentions, and studied the effect of various techniques in improving the normalization results based on the dictionary lookup approach. The dataset from the BioCreative V CDR track was used to report the performance of the developed normalization methods and compare with other existing dictionary lookup based normalization methods. The best configuration achieved an F-measure of 0.77 for the disease normalization, which outperformed the best dictionary lookup based baseline method studied in this work by an F-measure of 0.13.Database URL: https://github.com/TCRNBioinformatics/DiseaseExtract. © The Author(s) 2016. Published by Oxford University Press.
Zhang, Qiuwen; Zhang, Yan; Yang, Xiaohong; Su, Bin
2014-01-01
In recent years, earthquakes have frequently occurred all over the world, which caused huge casualties and economic losses. It is very necessary and urgent to obtain the seismic intensity map timely so as to master the distribution of the disaster and provide supports for quick earthquake relief. Compared with traditional methods of drawing seismic intensity map, which require many investigations in the field of earthquake area or are too dependent on the empirical formulas, spatial information technologies such as Remote Sensing (RS) and Geographical Information System (GIS) can provide fast and economical way to automatically recognize the seismic intensity. With the integrated application of RS and GIS, this paper proposes a RS/GIS-based approach for automatic recognition of seismic intensity, in which RS is used to retrieve and extract the information on damages caused by earthquake, and GIS is applied to manage and display the data of seismic intensity. The case study in Wenchuan Ms8.0 earthquake in China shows that the information on seismic intensity can be automatically extracted from remotely sensed images as quickly as possible after earthquake occurrence, and the Digital Intensity Model (DIM) can be used to visually query and display the distribution of seismic intensity.
Automatic cloud tracking applied to GOES and Meteosat observations
NASA Technical Reports Server (NTRS)
Endlich, R. M.; Wolf, D. E.
1981-01-01
An improved automatic processing method for the tracking of cloud motions as revealed by satellite imagery is presented and applications of the method to GOES observations of Hurricane Eloise and Meteosat water vapor and infrared data are presented. The method is shown to involve steps of picture smoothing, target selection and the calculation of cloud motion vectors by the matching of a group at a given time with its best likeness at a later time, or by a cross-correlation computation. Cloud motion computations can be made in as many as four separate layers simultaneously. For data of 4 and 8 km resolution in the eye of Hurricane Eloise, the automatic system is found to provide results comparable in accuracy and coverage to those obtained by NASA analysts using the Atmospheric and Oceanographic Information Processing System, with results obtained by the pattern recognition and cross correlation computations differing by only fractions of a pixel. For Meteosat water vapor data from the tropics and midlatitudes, the automatic motion computations are found to be reliable only in areas where the water vapor fields contained small-scale structure, although excellent results are obtained using Meteosat IR data in the same regions. The automatic method thus appears to be competitive in accuracy and coverage with motion determination by human analysts.
Automatic forensic face recognition from digital images.
Peacock, C; Goode, A; Brett, A
2004-01-01
Digital image evidence is now widely available from criminal investigations and surveillance operations, often captured by security and surveillance CCTV. This has resulted in a growing demand from law enforcement agencies for automatic person-recognition based on image data. In forensic science, a fundamental requirement for such automatic face recognition is to evaluate the weight that can justifiably be attached to this recognition evidence in a scientific framework. This paper describes a pilot study carried out by the Forensic Science Service (UK) which explores the use of digital facial images in forensic investigation. For the purpose of the experiment a specific software package was chosen (Image Metrics Optasia). The paper does not describe the techniques used by the software to reach its decision of probabilistic matches to facial images, but accepts the output of the software as though it were a 'black box'. In this way, the paper lays a foundation for how face recognition systems can be compared in a forensic framework. The aim of the paper is to explore how reliably and under what conditions digital facial images can be presented in evidence.
ERIC Educational Resources Information Center
Sauval, Karinne; Perre, Laetitia; Casalis, Séverine
2017-01-01
The present study aimed to investigate the development of automatic phonological processes involved in visual word recognition during reading acquisition in French. A visual masked priming lexical decision experiment was carried out with third, fifth graders and adult skilled readers. Three different types of partial overlap between the prime and…
ERIC Educational Resources Information Center
Fontan, Lionel; Ferrané, Isabelle; Farinas, Jérôme; Pinquier, Julien; Tardieu, Julien; Magnen, Cynthia; Gaillard, Pascal; Aumont, Xavier; Füllgrabe, Christian
2017-01-01
Purpose: The purpose of this article is to assess speech processing for listeners with simulated age-related hearing loss (ARHL) and to investigate whether the observed performance can be replicated using an automatic speech recognition (ASR) system. The long-term goal of this research is to develop a system that will assist…
ERIC Educational Resources Information Center
Saadatzi, Mohammad Nasser; Pennington, Robert C.; Welch, Karla C.; Graham, James H.; Scott, Renee E.
2017-01-01
In the current study, we examined the effects of an instructional package comprised of an autonomous pedagogical agent, automatic speech recognition, and constant time delay during the instruction of reading sight words aloud to young adults with autism spectrum disorder. We used a concurrent multiple baseline across participants design to…
ERIC Educational Resources Information Center
Wald, Mike
2006-01-01
The potential use of Automatic Speech Recognition to assist receptive communication is explored. The opportunities and challenges that this technology presents students and staff to provide captioning of speech online or in classrooms for deaf or hard of hearing students and assist blind, visually impaired or dyslexic learners to read and search…
[Extraction and recognition of attractors in three-dimensional Lorenz plot].
Hu, Min; Jang, Chengfan; Wang, Suxia
2018-02-01
Lorenz plot (LP) method which gives a global view of long-time electrocardiogram signals, is an efficient simple visualization tool to analyze cardiac arrhythmias, and the morphologies and positions of the extracted attractors may reveal the underlying mechanisms of the onset and termination of arrhythmias. But automatic diagnosis is still impossible because it is lack of the method of extracting attractors by now. We presented here a methodology of attractor extraction and recognition based upon homogeneously statistical properties of the location parameters of scatter points in three dimensional LP (3DLP), which was constructed by three successive RR intervals as X , Y and Z axis in Cartesian coordinate system. Validation experiments were tested in a group of RR-interval time series and tags data with frequent unifocal premature complexes exported from a 24-hour Holter system. The results showed that this method had excellent effective not only on extraction of attractors, but also on automatic recognition of attractors by the location parameters such as the azimuth of the points peak frequency ( A PF ) of eccentric attractors once stereographic projection of 3DLP along the space diagonal. Besides, A PF was still a powerful index of differential diagnosis of atrial and ventricular extrasystole. Additional experiments proved that this method was also available on several other arrhythmias. Moreover, there were extremely relevant relationships between 3DLP and two dimensional LPs which indicate any conventional achievement of LPs could be implanted into 3DLP. It would have a broad application prospect to integrate this method into conventional long-time electrocardiogram monitoring and analysis system.
Face Recognition in Humans and Machines
NASA Astrophysics Data System (ADS)
O'Toole, Alice; Tistarelli, Massimo
The study of human face recognition by psychologists and neuroscientists has run parallel to the development of automatic face recognition technologies by computer scientists and engineers. In both cases, there are analogous steps of data acquisition, image processing, and the formation of representations that can support the complex and diverse tasks we accomplish with faces. These processes can be understood and compared in the context of their neural and computational implementations. In this chapter, we present the essential elements of face recognition by humans and machines, taking a perspective that spans psychological, neural, and computational approaches. From the human side, we overview the methods and techniques used in the neurobiology of face recognition, the underlying neural architecture of the system, the role of visual attention, and the nature of the representations that emerges. From the computational side, we discuss face recognition technologies and the strategies they use to overcome challenges to robust operation over viewing parameters. Finally, we conclude the chapter with a look at some recent studies that compare human and machine performances at face recognition.
3D automatic anatomy segmentation based on iterative graph-cut-ASM
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Xinjian; Bagci, Ulas; Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, Building 10 Room 1C515, Bethesda, Maryland 20892-1182
2011-08-15
Purpose: This paper studies the feasibility of developing an automatic anatomy segmentation (AAS) system in clinical radiology and demonstrates its operation on clinical 3D images. Methods: The AAS system, the authors are developing consists of two main parts: object recognition and object delineation. As for recognition, a hierarchical 3D scale-based multiobject method is used for the multiobject recognition task, which incorporates intensity weighted ball-scale (b-scale) information into the active shape model (ASM). For object delineation, an iterative graph-cut-ASM (IGCASM) algorithm is proposed, which effectively combines the rich statistical shape information embodied in ASM with the globally optimal delineation capability ofmore » the GC method. The presented IGCASM algorithm is a 3D generalization of the 2D GC-ASM method that they proposed previously in Chen et al.[Proc. SPIE, 7259, 72590C1-72590C-8 (2009)]. The proposed methods are tested on two datasets comprised of images obtained from 20 patients (10 male and 10 female) of clinical abdominal CT scans, and 11 foot magnetic resonance imaging (MRI) scans. The test is for four organs (liver, left and right kidneys, and spleen) segmentation, five foot bones (calcaneus, tibia, cuboid, talus, and navicular). The recognition and delineation accuracies were evaluated separately. The recognition accuracy was evaluated in terms of translation, rotation, and scale (size) error. The delineation accuracy was evaluated in terms of true and false positive volume fractions (TPVF, FPVF). The efficiency of the delineation method was also evaluated on an Intel Pentium IV PC with a 3.4 GHZ CPU machine. Results: The recognition accuracies in terms of translation, rotation, and scale error over all organs are about 8 mm, 10 deg. and 0.03, and over all foot bones are about 3.5709 mm, 0.35 deg. and 0.025, respectively. The accuracy of delineation over all organs for all subjects as expressed in TPVF and FPVF is 93.01% and 0.22%, and all foot bones for all subjects are 93.75% and 0.28%, respectively. While the delineations for the four organs can be accomplished quite rapidly with average of 78 s, the delineations for the five foot bones can be accomplished with average of 70 s. Conclusions: The experimental results showed the feasibility and efficacy of the proposed automatic anatomy segmentation system: (a) the incorporation of shape priors into the GC framework is feasible in 3D as demonstrated previously for 2D images; (b) our results in 3D confirm the accuracy behavior observed in 2D. The hybrid strategy IGCASM seems to be more robust and accurate than ASM and GC individually; and (c) delineations within body regions and foot bones of clinical importance can be accomplished quite rapidly within 1.5 min.« less
3D automatic anatomy segmentation based on iterative graph-cut-ASM
Chen, Xinjian; Bagci, Ulas
2011-01-01
Purpose: This paper studies the feasibility of developing an automatic anatomy segmentation (AAS) system in clinical radiology and demonstrates its operation on clinical 3D images.Methods: The AAS system, the authors are developing consists of two main parts: object recognition and object delineation. As for recognition, a hierarchical 3D scale-based multiobject method is used for the multiobject recognition task, which incorporates intensity weighted ball-scale (b-scale) information into the active shape model (ASM). For object delineation, an iterative graph-cut-ASM (IGCASM) algorithm is proposed, which effectively combines the rich statistical shape information embodied in ASM with the globally optimal delineation capability of the GC method. The presented IGCASM algorithm is a 3D generalization of the 2D GC-ASM method that they proposed previously in Chen et al. [Proc. SPIE, 7259, 72590C1–72590C-8 (2009)]. The proposed methods are tested on two datasets comprised of images obtained from 20 patients (10 male and 10 female) of clinical abdominal CT scans, and 11 foot magnetic resonance imaging (MRI) scans. The test is for four organs (liver, left and right kidneys, and spleen) segmentation, five foot bones (calcaneus, tibia, cuboid, talus, and navicular). The recognition and delineation accuracies were evaluated separately. The recognition accuracy was evaluated in terms of translation, rotation, and scale (size) error. The delineation accuracy was evaluated in terms of true and false positive volume fractions (TPVF, FPVF). The efficiency of the delineation method was also evaluated on an Intel Pentium IV PC with a 3.4 GHZ CPU machine.Results: The recognition accuracies in terms of translation, rotation, and scale error over all organs are about 8 mm, 10° and 0.03, and over all foot bones are about 3.5709 mm, 0.35° and 0.025, respectively. The accuracy of delineation over all organs for all subjects as expressed in TPVF and FPVF is 93.01% and 0.22%, and all foot bones for all subjects are 93.75% and 0.28%, respectively. While the delineations for the four organs can be accomplished quite rapidly with average of 78 s, the delineations for the five foot bones can be accomplished with average of 70 s.Conclusions: The experimental results showed the feasibility and efficacy of the proposed automatic anatomy segmentation system: (a) the incorporation of shape priors into the GC framework is feasible in 3D as demonstrated previously for 2D images; (b) our results in 3D confirm the accuracy behavior observed in 2D. The hybrid strategy IGCASM seems to be more robust and accurate than ASM and GC individually; and (c) delineations within body regions and foot bones of clinical importance can be accomplished quite rapidly within 1.5 min. PMID:21928634
NASA Astrophysics Data System (ADS)
Ye, L.; Xu, X.; Luan, D.; Jiang, W.; Kang, Z.
2017-07-01
Crater-detection approaches can be divided into four categories: manual recognition, shape-profile fitting algorithms, machine-learning methods and geological information-based analysis using terrain and spectral data. The mainstream method is Shape-profile fitting algorithms. Many scholars throughout the world use the illumination gradient information to fit standard circles by least square method. Although this method has achieved good results, it is difficult to identify the craters with poor "visibility", complex structure and composition. Moreover, the accuracy of recognition is difficult to be improved due to the multiple solutions and noise interference. Aiming at the problem, we propose a method for the automatic extraction of impact craters based on spectral characteristics of the moon rocks and minerals: 1) Under the condition of sunlight, the impact craters are extracted from MI by condition matching and the positions as well as diameters of the craters are obtained. 2) Regolith is spilled while lunar is impacted and one of the elements of lunar regolith is iron. Therefore, incorrectly extracted impact craters can be removed by judging whether the crater contains "non iron" element. 3) Craters which are extracted correctly, are divided into two types: simple type and complex type according to their diameters. 4) Get the information of titanium and match the titanium distribution of the complex craters with normal distribution curve, then calculate the goodness of fit and set the threshold. The complex craters can be divided into two types: normal distribution curve type of titanium and non normal distribution curve type of titanium. We validated our proposed method with MI acquired by SELENE. Experimental results demonstrate that the proposed method has good performance in the test area.
Hierarchically Structured Non-Intrusive Sign Language Recognition. Chapter 2
NASA Technical Reports Server (NTRS)
Zieren, Jorg; Zieren, Jorg; Kraiss, Karl-Friedrich
2007-01-01
This work presents a hierarchically structured approach at the nonintrusive recognition of sign language from a monocular frontal view. Robustness is achieved through sophisticated localization and tracking methods, including a combined EM/CAMSHIFT overlap resolution procedure and the parallel pursuit of multiple hypotheses about hands position and movement. This allows handling of ambiguities and automatically corrects tracking errors. A biomechanical skeleton model and dynamic motion prediction using Kalman filters represents high level knowledge. Classification is performed by Hidden Markov Models. 152 signs from German sign language were recognized with an accuracy of 97.6%.
A new accurate pill recognition system using imprint information
NASA Astrophysics Data System (ADS)
Chen, Zhiyuan; Kamata, Sei-ichiro
2013-12-01
Great achievements in modern medicine benefit human beings. Also, it has brought about an explosive growth of pharmaceuticals that current in the market. In daily life, pharmaceuticals sometimes confuse people when they are found unlabeled. In this paper, we propose an automatic pill recognition technique to solve this problem. It functions mainly based on the imprint feature of the pills, which is extracted by proposed MSWT (modified stroke width transform) and described by WSC (weighted shape context). Experiments show that our proposed pill recognition method can reach an accurate rate up to 92.03% within top 5 ranks when trying to classify more than 10 thousand query pill images into around 2000 categories.
Research on gait-based human identification
NASA Astrophysics Data System (ADS)
Li, Youguo
Gait recognition refers to automatic identification of individual based on his/her style of walking. This paper proposes a gait recognition method based on Continuous Hidden Markov Model with Mixture of Gaussians(G-CHMM). First, we initialize a Gaussian mix model for training image sequence with K-means algorithm, then train the HMM parameters using a Baum-Welch algorithm. These gait feature sequences can be trained and obtain a Continuous HMM for every person, therefore, the 7 key frames and the obtained HMM can represent each person's gait sequence. Finally, the recognition is achieved by Front algorithm. The experiments made on CASIA gait databases obtain comparatively high correction identification ratio and comparatively strong robustness for variety of bodily angle.
Facial expression recognition based on weber local descriptor and sparse representation
NASA Astrophysics Data System (ADS)
Ouyang, Yan
2018-03-01
Automatic facial expression recognition has been one of the research hotspots in the area of computer vision for nearly ten years. During the decade, many state-of-the-art methods have been proposed which perform very high accurate rate based on the face images without any interference. Nowadays, many researchers begin to challenge the task of classifying the facial expression images with corruptions and occlusions and the Sparse Representation based Classification framework has been wildly used because it can robust to the corruptions and occlusions. Therefore, this paper proposed a novel facial expression recognition method based on Weber local descriptor (WLD) and Sparse representation. The method includes three parts: firstly the face images are divided into many local patches, and then the WLD histograms of each patch are extracted, finally all the WLD histograms features are composed into a vector and combined with SRC to classify the facial expressions. The experiment results on the Cohn-Kanade database show that the proposed method is robust to occlusions and corruptions.
Learning Enterprise Malware Triage from Automatic Dynamic Analysis
2013-03-01
Kolter and Maloof n-gram method, Dube’s malware target recognition (MaTR) static method performs significantly more accurately at the 95% confidence...from the static method as in Kolter and Maloof. The MIST approach with behavior sequences 9 allows researchers to tailor the level of analysis to the...citations, none publish work that implements it. Only Kolter and Maloof use nearly as long gram structures, although that research uses static grams rather
The fast iris image clarity evaluation based on Tenengrad and ROI selection
NASA Astrophysics Data System (ADS)
Gao, Shuqin; Han, Min; Cheng, Xu
2018-04-01
In iris recognition system, the clarity of iris image is an important factor that influences recognition effect. In the process of recognition, the blurred image may possibly be rejected by the automatic iris recognition system, which will lead to the failure of identification. Therefore it is necessary to evaluate the iris image definition before recognition. Considered the existing evaluation methods on iris image definition, we proposed a fast algorithm to evaluate the definition of iris image in this paper. In our algorithm, firstly ROI (Region of Interest) is extracted based on the reference point which is determined by using the feature of the light spots within the pupil, then Tenengrad operator is used to evaluate the iris image's definition. Experiment results show that, the iris image definition algorithm proposed in this paper could accurately distinguish the iris images of different clarity, and the algorithm has the merit of low computational complexity and more effectiveness.
Computational Modeling of Emotions and Affect in Social-Cultural Interaction
2013-10-02
acoustic and textual information sources. Second, a cross-lingual study was performed that shed light on how human perception and automatic recognition...speech is produced, a speaker’s pitch and intonational pattern, and word usage. Better feature representation and advanced approaches were used to...recognition performance, and improved our understanding of language/cultural impact on human perception of emotion and automatic classification. • Units
A distinguishing method of printed and handwritten legal amount on Chinese bank check
NASA Astrophysics Data System (ADS)
Zhu, Ningbo; Lou, Zhen; Yang, Jingyu
2003-09-01
While carrying out Optical Chinese Character Recognition, distinguishing the font between printed and handwritten characters at the early phase is necessary, because there is so much difference between the methods on recognizing these two types of characters. In this paper, we proposed a good method on how to banish seals and its relative standards that can judge whether they should be banished. Meanwhile, an approach on clearing up scattered noise shivers after image segmentation is presented. Four sets of classifying features that show discrimination between printed and handwritten characters are well adopted. The proposed approach was applied to an automatic check processing system and tested on about 9031 checks. The recognition rate is more than 99.5%.
NASA Astrophysics Data System (ADS)
Xu, Jiayuan; Yu, Chengtao; Bo, Bin; Xue, Yu; Xu, Changfu; Chaminda, P. R. Dushantha; Hu, Chengbo; Peng, Kai
2018-03-01
The automatic recognition of the high voltage isolation switch by remote video monitoring is an effective means to ensure the safety of the personnel and the equipment. The existing methods mainly include two ways: improving monitoring accuracy and adopting target detection technology through equipment transformation. Such a method is often applied to specific scenarios, with limited application scope and high cost. To solve this problem, a high voltage isolation switch state recognition method based on background difference and iterative search is proposed in this paper. The initial position of the switch is detected in real time through the background difference method. When the switch starts to open and close, the target tracking algorithm is used to track the motion trajectory of the switch. The opening and closing state of the switch is determined according to the angle variation of the switch tracking point and the center line. The effectiveness of the method is verified by experiments on different switched video frames of switching states. Compared with the traditional methods, this method is more robust and effective.
NASA Astrophysics Data System (ADS)
Cannata, A.; Montalto, P.; Aliotta, M.; Cassisi, C.; Pulvirenti, A.; Privitera, E.; Patanè, D.
2011-04-01
Active volcanoes generate sonic and infrasonic signals, whose investigation provides useful information for both monitoring purposes and the study of the dynamics of explosive phenomena. At Mt. Etna volcano (Italy), a pattern recognition system based on infrasonic waveform features has been developed. First, by a parametric power spectrum method, the features describing and characterizing the infrasound events were extracted: peak frequency and quality factor. Then, together with the peak-to-peak amplitude, these features constituted a 3-D ‘feature space’; by Density-Based Spatial Clustering of Applications with Noise algorithm (DBSCAN) three clusters were recognized inside it. After the clustering process, by using a common location method (semblance method) and additional volcanological information concerning the intensity of the explosive activity, we were able to associate each cluster to a particular source vent and/or a kind of volcanic activity. Finally, for automatic event location, clusters were used to train a model based on Support Vector Machine, calculating optimal hyperplanes able to maximize the margins of separation among the clusters. After the training phase this system automatically allows recognizing the active vent with no location algorithm and by using only a single station.
Dismount Threat Recognition through Automatic Pose Identification
2012-03-01
10 2.2.2 Enabling Technologies . . . . . . . . . . . . . . 11 2.2.3 Associative Memory Neural Networks . . . . . . 12 III. Methodology...20 3.2.3 Creating Separability . . . . . . . . . . . . . . . 23 3.3 Training the Associative Memory Neural Network... Effects of Parameter and Method Choices . . . . . . . . 30 4.3.1 Decimel versus Bipolar . . . . . . . . . . . . . . 30 4.3.2 Bipolar and Binary Values
Clustering-Based Ensemble Learning for Activity Recognition in Smart Homes
Jurek, Anna; Nugent, Chris; Bi, Yaxin; Wu, Shengli
2014-01-01
Application of sensor-based technology within activity monitoring systems is becoming a popular technique within the smart environment paradigm. Nevertheless, the use of such an approach generates complex constructs of data, which subsequently requires the use of intricate activity recognition techniques to automatically infer the underlying activity. This paper explores a cluster-based ensemble method as a new solution for the purposes of activity recognition within smart environments. With this approach activities are modelled as collections of clusters built on different subsets of features. A classification process is performed by assigning a new instance to its closest cluster from each collection. Two different sensor data representations have been investigated, namely numeric and binary. Following the evaluation of the proposed methodology it has been demonstrated that the cluster-based ensemble method can be successfully applied as a viable option for activity recognition. Results following exposure to data collected from a range of activities indicated that the ensemble method had the ability to perform with accuracies of 94.2% and 97.5% for numeric and binary data, respectively. These results outperformed a range of single classifiers considered as benchmarks. PMID:25014095
Clustering-based ensemble learning for activity recognition in smart homes.
Jurek, Anna; Nugent, Chris; Bi, Yaxin; Wu, Shengli
2014-07-10
Application of sensor-based technology within activity monitoring systems is becoming a popular technique within the smart environment paradigm. Nevertheless, the use of such an approach generates complex constructs of data, which subsequently requires the use of intricate activity recognition techniques to automatically infer the underlying activity. This paper explores a cluster-based ensemble method as a new solution for the purposes of activity recognition within smart environments. With this approach activities are modelled as collections of clusters built on different subsets of features. A classification process is performed by assigning a new instance to its closest cluster from each collection. Two different sensor data representations have been investigated, namely numeric and binary. Following the evaluation of the proposed methodology it has been demonstrated that the cluster-based ensemble method can be successfully applied as a viable option for activity recognition. Results following exposure to data collected from a range of activities indicated that the ensemble method had the ability to perform with accuracies of 94.2% and 97.5% for numeric and binary data, respectively. These results outperformed a range of single classifiers considered as benchmarks.
Sparse representation based SAR vehicle recognition along with aspect angle.
Xing, Xiangwei; Ji, Kefeng; Zou, Huanxin; Sun, Jixiang
2014-01-01
As a method of representing the test sample with few training samples from an overcomplete dictionary, sparse representation classification (SRC) has attracted much attention in synthetic aperture radar (SAR) automatic target recognition (ATR) recently. In this paper, we develop a novel SAR vehicle recognition method based on sparse representation classification along with aspect information (SRCA), in which the correlation between the vehicle's aspect angle and the sparse representation vector is exploited. The detailed procedure presented in this paper can be summarized as follows. Initially, the sparse representation vector of a test sample is solved by sparse representation algorithm with a principle component analysis (PCA) feature-based dictionary. Then, the coefficient vector is projected onto a sparser one within a certain range of the vehicle's aspect angle. Finally, the vehicle is classified into a certain category that minimizes the reconstruction error with the novel sparse representation vector. Extensive experiments are conducted on the moving and stationary target acquisition and recognition (MSTAR) dataset and the results demonstrate that the proposed method performs robustly under the variations of depression angle and target configurations, as well as incomplete observation.
Semi-automatic recognition of marine debris on beaches
Ge, Zhenpeng; Shi, Huahong; Mei, Xuefei; Dai, Zhijun; Li, Daoji
2016-01-01
An increasing amount of anthropogenic marine debris is pervading the earth’s environmental systems, resulting in an enormous threat to living organisms. Additionally, the large amount of marine debris around the world has been investigated mostly through tedious manual methods. Therefore, we propose the use of a new technique, light detection and ranging (LIDAR), for the semi-automatic recognition of marine debris on a beach because of its substantially more efficient role in comparison with other more laborious methods. Our results revealed that LIDAR should be used for the classification of marine debris into plastic, paper, cloth and metal. Additionally, we reconstructed a 3-dimensional model of different types of debris on a beach with a high validity of debris revivification using LIDAR-based individual separation. These findings demonstrate that the availability of this new technique enables detailed observations to be made of debris on a large beach that was previously not possible. It is strongly suggested that LIDAR could be implemented as an appropriate monitoring tool for marine debris by global researchers and governments. PMID:27156433
NASA Astrophysics Data System (ADS)
Xiong, Yan; Reichenbach, Stephen E.
1999-01-01
Understanding of hand-written Chinese characters is at such a primitive stage that models include some assumptions about hand-written Chinese characters that are simply false. So Maximum Likelihood Estimation (MLE) may not be an optimal method for hand-written Chinese characters recognition. This concern motivates the research effort to consider alternative criteria. Maximum Mutual Information Estimation (MMIE) is an alternative method for parameter estimation that does not derive its rationale from presumed model correctness, but instead examines the pattern-modeling problem in automatic recognition system from an information- theoretic point of view. The objective of MMIE is to find a set of parameters in such that the resultant model allows the system to derive from the observed data as much information as possible about the class. We consider MMIE for recognition of hand-written Chinese characters using on a simplified hidden Markov Random Field. MMIE provides improved performance improvement over MLE in this application.
Surgical gesture segmentation and recognition.
Tao, Lingling; Zappella, Luca; Hager, Gregory D; Vidal, René
2013-01-01
Automatic surgical gesture segmentation and recognition can provide useful feedback for surgical training in robotic surgery. Most prior work in this field relies on the robot's kinematic data. Although recent work [1,2] shows that the robot's video data can be equally effective for surgical gesture recognition, the segmentation of the video into gestures is assumed to be known. In this paper, we propose a framework for joint segmentation and recognition of surgical gestures from kinematic and video data. Unlike prior work that relies on either frame-level kinematic cues, or segment-level kinematic or video cues, our approach exploits both cues by using a combined Markov/semi-Markov conditional random field (MsM-CRF) model. Our experiments show that the proposed model improves over a Markov or semi-Markov CRF when using video data alone, gives results that are comparable to state-of-the-art methods on kinematic data alone, and improves over state-of-the-art methods when combining kinematic and video data.
Aishima, Jun; Russel, Daniel S; Guibas, Leonidas J; Adams, Paul D; Brunger, Axel T
2005-10-01
Automatic fitting methods that build molecules into electron-density maps usually fail below 3.5 A resolution. As a first step towards addressing this problem, an algorithm has been developed using an approximation of the medial axis to simplify an electron-density isosurface. This approximation captures the central axis of the isosurface with a graph which is then matched against a graph of the molecular model. One of the first applications of the medial axis to X-ray crystallography is presented here. When applied to ligand fitting, the method performs at least as well as methods based on selecting peaks in electron-density maps. Generalization of the method to recognition of common features across multiple contour levels could lead to powerful automatic fitting methods that perform well even at low resolution.
Zhao, Yu; Ge, Fangfei; Liu, Tianming
2018-07-01
fMRI data decomposition techniques have advanced significantly from shallow models such as Independent Component Analysis (ICA) and Sparse Coding and Dictionary Learning (SCDL) to deep learning models such Deep Belief Networks (DBN) and Convolutional Autoencoder (DCAE). However, interpretations of those decomposed networks are still open questions due to the lack of functional brain atlases, no correspondence across decomposed or reconstructed networks across different subjects, and significant individual variabilities. Recent studies showed that deep learning, especially deep convolutional neural networks (CNN), has extraordinary ability of accommodating spatial object patterns, e.g., our recent works using 3D CNN for fMRI-derived network classifications achieved high accuracy with a remarkable tolerance for mistakenly labelled training brain networks. However, the training data preparation is one of the biggest obstacles in these supervised deep learning models for functional brain network map recognitions, since manual labelling requires tedious and time-consuming labours which will sometimes even introduce label mistakes. Especially for mapping functional networks in large scale datasets such as hundreds of thousands of brain networks used in this paper, the manual labelling method will become almost infeasible. In response, in this work, we tackled both the network recognition and training data labelling tasks by proposing a new iteratively optimized deep learning CNN (IO-CNN) framework with an automatic weak label initialization, which enables the functional brain networks recognition task to a fully automatic large-scale classification procedure. Our extensive experiments based on ABIDE-II 1099 brains' fMRI data showed the great promise of our IO-CNN framework. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Sun, Kaioqiong; Udupa, Jayaram K.; Odhner, Dewey; Tong, Yubing; Torigian, Drew A.
2014-03-01
This paper proposes a thoracic anatomy segmentation method based on hierarchical recognition and delineation guided by a built fuzzy model. Labeled binary samples for each organ are registered and aligned into a 3D fuzzy set representing the fuzzy shape model for the organ. The gray intensity distributions of the corresponding regions of the organ in the original image are recorded in the model. The hierarchical relation and mean location relation between different organs are also captured in the model. Following the hierarchical structure and location relation, the fuzzy shape model of different organs is registered to the given target image to achieve object recognition. A fuzzy connected delineation method is then used to obtain the final segmentation result of organs with seed points provided by recognition. The hierarchical structure and location relation integrated in the model provide the initial parameters for registration and make the recognition efficient and robust. The 3D fuzzy model combined with hierarchical affine registration ensures that accurate recognition can be obtained for both non-sparse and sparse organs. The results on real images are presented and shown to be better than a recently reported fuzzy model-based anatomy recognition strategy.
Automatic three-dimensional measurement of large-scale structure based on vision metrology.
Zhu, Zhaokun; Guan, Banglei; Zhang, Xiaohu; Li, Daokui; Yu, Qifeng
2014-01-01
All relevant key techniques involved in photogrammetric vision metrology for fully automatic 3D measurement of large-scale structure are studied. A new kind of coded target consisting of circular retroreflective discs is designed, and corresponding detection and recognition algorithms based on blob detection and clustering are presented. Then a three-stage strategy starting with view clustering is proposed to achieve automatic network orientation. As for matching of noncoded targets, the concept of matching path is proposed, and matches for each noncoded target are found by determination of the optimal matching path, based on a novel voting strategy, among all possible ones. Experiments on a fixed keel of airship have been conducted to verify the effectiveness and measuring accuracy of the proposed methods.
Computer Recognition of Facial Profiles
1974-08-01
facial recognition 20. ABSTRACT (Continue on reverse side It necessary and Identify by block number) A system for the recognition of human faces from...21 2.6 Classification Algorithms ........... ... 32 III FACIAL RECOGNITION AND AUTOMATIC TRAINING . . . 37 3.1 Facial Profile Recognition...provide a fair test of the classification system. The work of Goldstein, Harmon, and Lesk [81 indicates, however, that for facial recognition , a ten class
Automatic Mexican sign language and digits recognition using normalized central moments
NASA Astrophysics Data System (ADS)
Solís, Francisco; Martínez, David; Espinosa, Oscar; Toxqui, Carina
2016-09-01
This work presents a framework for automatic Mexican sign language and digits recognition based on computer vision system using normalized central moments and artificial neural networks. Images are captured by digital IP camera, four LED reflectors and a green background in order to reduce computational costs and prevent the use of special gloves. 42 normalized central moments are computed per frame and used in a Multi-Layer Perceptron to recognize each database. Four versions per sign and digit were used in training phase. 93% and 95% of recognition rates were achieved for Mexican sign language and digits respectively.
Infrared Cephalic-Vein to Assist Blood Extraction Tasks: Automatic Projection and Recognition
NASA Astrophysics Data System (ADS)
Lagüela, S.; Gesto, M.; Riveiro, B.; González-Aguilera, D.
2017-05-01
Thermal infrared band is not commonly used in photogrammetric and computer vision algorithms, mainly due to the low spatial resolution of this type of imagery. However, this band captures sub-superficial information, increasing the capabilities of visible bands regarding applications. This fact is especially important in biomedicine and biometrics, allowing the geometric characterization of interior organs and pathologies with photogrammetric principles, as well as the automatic identification and labelling using computer vision algorithms. This paper presents advances of close-range photogrammetry and computer vision applied to thermal infrared imagery, with the final application of Augmented Reality in order to widen its application in the biomedical field. In this case, the thermal infrared image of the arm is acquired and simultaneously projected on the arm, together with the identification label of the cephalic-vein. This way, blood analysts are assisted in finding the vein for blood extraction, especially in those cases where the identification by the human eye is a complex task. Vein recognition is performed based on the Gaussian temperature distribution in the area of the vein, while the calibration between projector and thermographic camera is developed through feature extraction and pattern recognition. The method is validated through its application to a set of volunteers, with different ages and genres, in such way that different conditions of body temperature and vein depth are covered for the applicability and reproducibility of the method.
Clustered Multi-Task Learning for Automatic Radar Target Recognition
Li, Cong; Bao, Weimin; Xu, Luping; Zhang, Hua
2017-01-01
Model training is a key technique for radar target recognition. Traditional model training algorithms in the framework of single task leaning ignore the relationships among multiple tasks, which degrades the recognition performance. In this paper, we propose a clustered multi-task learning, which can reveal and share the multi-task relationships for radar target recognition. To further make full use of these relationships, the latent multi-task relationships in the projection space are taken into consideration. Specifically, a constraint term in the projection space is proposed, the main idea of which is that multiple tasks within a close cluster should be close to each other in the projection space. In the proposed method, the cluster structures and multi-task relationships can be autonomously learned and utilized in both of the original and projected space. In view of the nonlinear characteristics of radar targets, the proposed method is extended to a non-linear kernel version and the corresponding non-linear multi-task solving method is proposed. Comprehensive experimental studies on simulated high-resolution range profile dataset and MSTAR SAR public database verify the superiority of the proposed method to some related algorithms. PMID:28953267
A robust recognition and accurate locating method for circular coded diagonal target
NASA Astrophysics Data System (ADS)
Bao, Yunna; Shang, Yang; Sun, Xiaoliang; Zhou, Jiexin
2017-10-01
As a category of special control points which can be automatically identified, artificial coded targets have been widely developed in the field of computer vision, photogrammetry, augmented reality, etc. In this paper, a new circular coded target designed by RockeTech technology Corp. Ltd is analyzed and studied, which is called circular coded diagonal target (CCDT). A novel detection and recognition method with good robustness is proposed in the paper, and implemented on Visual Studio. In this algorithm, firstly, the ellipse features of the center circle are used for rough positioning. Then, according to the characteristics of the center diagonal target, a circular frequency filter is designed to choose the correct center circle and eliminates non-target noise. The precise positioning of the coded target is done by the correlation coefficient fitting extreme value method. Finally, the coded target recognition is achieved by decoding the binary sequence in the outer ring of the extracted target. To test the proposed algorithm, this paper has carried out simulation experiments and real experiments. The results show that the CCDT recognition and accurate locating method proposed in this paper can robustly recognize and accurately locate the targets in complex and noisy background.
Automatic Speech Recognition from Neural Signals: A Focused Review.
Herff, Christian; Schultz, Tanja
2016-01-01
Speech interfaces have become widely accepted and are nowadays integrated in various real-life applications and devices. They have become a part of our daily life. However, speech interfaces presume the ability to produce intelligible speech, which might be impossible due to either loud environments, bothering bystanders or incapabilities to produce speech (i.e., patients suffering from locked-in syndrome). For these reasons it would be highly desirable to not speak but to simply envision oneself to say words or sentences. Interfaces based on imagined speech would enable fast and natural communication without the need for audible speech and would give a voice to otherwise mute people. This focused review analyzes the potential of different brain imaging techniques to recognize speech from neural signals by applying Automatic Speech Recognition technology. We argue that modalities based on metabolic processes, such as functional Near Infrared Spectroscopy and functional Magnetic Resonance Imaging, are less suited for Automatic Speech Recognition from neural signals due to low temporal resolution but are very useful for the investigation of the underlying neural mechanisms involved in speech processes. In contrast, electrophysiologic activity is fast enough to capture speech processes and is therefor better suited for ASR. Our experimental results indicate the potential of these signals for speech recognition from neural data with a focus on invasively measured brain activity (electrocorticography). As a first example of Automatic Speech Recognition techniques used from neural signals, we discuss the Brain-to-text system.
Quest Hierarchy for Hyperspectral Face Recognition
2011-03-01
numerous face recognition algorithms available, several very good literature surveys are available that include Abate [29], Samal [110], Kong [18], Zou...Perception, Japan (January 1994). [110] Samal , Ashok and P. Iyengar, Automatic Recognition and Analysis of Human Faces and Facial Expressions: A Survey
Zeng, Jinle; Chang, Baohua; Du, Dong; Wang, Li; Chang, Shuhe; Peng, Guodong; Wang, Wenzhu
2018-01-05
Multi-layer/multi-pass welding (MLMPW) technology is widely used in the energy industry to join thick components. During automatic welding using robots or other actuators, it is very important to recognize the actual weld pass position using visual methods, which can then be used not only to perform reasonable path planning for actuators, but also to correct any deviations between the welding torch and the weld pass position in real time. However, due to the small geometrical differences between adjacent weld passes, existing weld position recognition technologies such as structured light methods are not suitable for weld position detection in MLMPW. This paper proposes a novel method for weld position detection, which fuses various kinds of information in MLMPW. First, a synchronous acquisition method is developed to obtain various kinds of visual information when directional light and structured light sources are on, respectively. Then, interferences are eliminated by fusing adjacent images. Finally, the information from directional and structured light images is fused to obtain the 3D positions of the weld passes. Experiment results show that each process can be done in 30 ms and the deviation is less than 0.6 mm. The proposed method can be used for automatic path planning and seam tracking in the robotic MLMPW process as well as electron beam freeform fabrication process.
NASA Astrophysics Data System (ADS)
Wang, Hongcui; Kawahara, Tatsuya
CALL (Computer Assisted Language Learning) systems using ASR (Automatic Speech Recognition) for second language learning have received increasing interest recently. However, it still remains a challenge to achieve high speech recognition performance, including accurate detection of erroneous utterances by non-native speakers. Conventionally, possible error patterns, based on linguistic knowledge, are added to the lexicon and language model, or the ASR grammar network. However, this approach easily falls in the trade-off of coverage of errors and the increase of perplexity. To solve the problem, we propose a method based on a decision tree to learn effective prediction of errors made by non-native speakers. An experimental evaluation with a number of foreign students learning Japanese shows that the proposed method can effectively generate an ASR grammar network, given a target sentence, to achieve both better coverage of errors and smaller perplexity, resulting in significant improvement in ASR accuracy.
The Suitability of Cloud-Based Speech Recognition Engines for Language Learning
ERIC Educational Resources Information Center
Daniels, Paul; Iwago, Koji
2017-01-01
As online automatic speech recognition (ASR) engines become more accurate and more widely implemented with call software, it becomes important to evaluate the effectiveness and the accuracy of these recognition engines using authentic speech samples. This study investigates two of the most prominent cloud-based speech recognition engines--Apple's…
NASA Astrophysics Data System (ADS)
Tian, Fuyang; Cao, Dong; Dong, Xiaoning; Zhao, Xinqiang; Li, Fade; Wang, Zhonghua
2017-06-01
Behavioral features recognition was an important effect to detect oestrus and sickness in dairy herds and there is a need for heat detection aid. The detection method was based on the measure of the individual behavioural activity, standing time, and temperature of dairy using vibrational sensor and temperature sensor in this paper. The data of behavioural activity index, standing time, lying time and walking time were sent to computer by lower power consumption wireless communication system. The fast approximate K-means algorithm (FAKM) was proposed to deal the data of the sensor for behavioral features recognition. As a result of technical progress in monitoring cows using computers, automatic oestrus detection has become possible.
Arabic Language Modeling with Stem-Derived Morphemes for Automatic Speech Recognition
ERIC Educational Resources Information Center
Heintz, Ilana
2010-01-01
The goal of this dissertation is to introduce a method for deriving morphemes from Arabic words using stem patterns, a feature of Arabic morphology. The motivations are three-fold: modeling with morphemes rather than words should help address the out-of-vocabulary problem; working with stem patterns should prove to be a cross-dialectally valid…
Random Deep Belief Networks for Recognizing Emotions from Speech Signals.
Wen, Guihua; Li, Huihui; Huang, Jubing; Li, Danyang; Xun, Eryang
2017-01-01
Now the human emotions can be recognized from speech signals using machine learning methods; however, they are challenged by the lower recognition accuracies in real applications due to lack of the rich representation ability. Deep belief networks (DBN) can automatically discover the multiple levels of representations in speech signals. To make full of its advantages, this paper presents an ensemble of random deep belief networks (RDBN) method for speech emotion recognition. It firstly extracts the low level features of the input speech signal and then applies them to construct lots of random subspaces. Each random subspace is then provided for DBN to yield the higher level features as the input of the classifier to output an emotion label. All outputted emotion labels are then fused through the majority voting to decide the final emotion label for the input speech signal. The conducted experimental results on benchmark speech emotion databases show that RDBN has better accuracy than the compared methods for speech emotion recognition.
Random Deep Belief Networks for Recognizing Emotions from Speech Signals
Li, Huihui; Huang, Jubing; Li, Danyang; Xun, Eryang
2017-01-01
Now the human emotions can be recognized from speech signals using machine learning methods; however, they are challenged by the lower recognition accuracies in real applications due to lack of the rich representation ability. Deep belief networks (DBN) can automatically discover the multiple levels of representations in speech signals. To make full of its advantages, this paper presents an ensemble of random deep belief networks (RDBN) method for speech emotion recognition. It firstly extracts the low level features of the input speech signal and then applies them to construct lots of random subspaces. Each random subspace is then provided for DBN to yield the higher level features as the input of the classifier to output an emotion label. All outputted emotion labels are then fused through the majority voting to decide the final emotion label for the input speech signal. The conducted experimental results on benchmark speech emotion databases show that RDBN has better accuracy than the compared methods for speech emotion recognition. PMID:28356908
Estes, Zachary; Adelman, James S
2008-08-01
An automatic vigilance hypothesis states that humans preferentially attend to negative stimuli, and this attention to negative valence disrupts the processing of other stimulus properties. Thus, negative words typically elicit slower color naming, word naming, and lexical decisions than neutral or positive words. Larsen, Mercer, and Balota analyzed the stimuli from 32 published studies, and they found that word valence was confounded with several lexical factors known to affect word recognition. Indeed, with these lexical factors covaried out, Larsen et al. found no evidence of automatic vigilance. The authors report a more sensitive analysis of 1011 words. Results revealed a small but reliable valence effect, such that negative words (e.g., "shark") elicit slower lexical decisions and naming than positive words (e.g., "beach"). Moreover, the relation between valence and recognition was categorical rather than linear; the extremity of a word's valence did not affect its recognition. This valence effect was not attributable to word length, frequency, orthographic neighborhood size, contextual diversity, first phoneme, or arousal. Thus, the present analysis provides the most powerful demonstration of automatic vigilance to date.
Automatic Recognition of Phonemes Using a Syntactic Processor for Error Correction.
1980-12-01
OF PHONEMES USING A SYNTACTIC PROCESSOR FOR ERROR CORRECTION THESIS AFIT/GE/EE/8D-45 Robert B. ’Taylor 2Lt USAF Approved for public release...distribution unlimilted. AbP AFIT/GE/EE/ 80D-45 AUTOMATIC RECOGNITION OF PHONEMES USING A SYNTACTIC PROCESSOR FOR ERROR CORRECTION THESIS Presented to the...Testing ..................... 37 Bayes Decision Rule for Minimum Error ........... 37 Bayes Decision Rule for Minimum Risk ............ 39 Mini Max Test
NASA Astrophysics Data System (ADS)
Fernández Pozo, Rubén; Blanco Murillo, Jose Luis; Hernández Gómez, Luis; López Gonzalo, Eduardo; Alcázar Ramírez, José; Toledano, Doroteo T.
2009-12-01
This study is part of an ongoing collaborative effort between the medical and the signal processing communities to promote research on applying standard Automatic Speech Recognition (ASR) techniques for the automatic diagnosis of patients with severe obstructive sleep apnoea (OSA). Early detection of severe apnoea cases is important so that patients can receive early treatment. Effective ASR-based detection could dramatically cut medical testing time. Working with a carefully designed speech database of healthy and apnoea subjects, we describe an acoustic search for distinctive apnoea voice characteristics. We also study abnormal nasalization in OSA patients by modelling vowels in nasal and nonnasal phonetic contexts using Gaussian Mixture Model (GMM) pattern recognition on speech spectra. Finally, we present experimental findings regarding the discriminative power of GMMs applied to severe apnoea detection. We have achieved an 81% correct classification rate, which is very promising and underpins the interest in this line of inquiry.
Early Detection of Severe Apnoea through Voice Analysis and Automatic Speaker Recognition Techniques
NASA Astrophysics Data System (ADS)
Fernández, Ruben; Blanco, Jose Luis; Díaz, David; Hernández, Luis A.; López, Eduardo; Alcázar, José
This study is part of an on-going collaborative effort between the medical and the signal processing communities to promote research on applying voice analysis and Automatic Speaker Recognition techniques (ASR) for the automatic diagnosis of patients with severe obstructive sleep apnoea (OSA). Early detection of severe apnoea cases is important so that patients can receive early treatment. Effective ASR-based diagnosis could dramatically cut medical testing time. Working with a carefully designed speech database of healthy and apnoea subjects, we present and discuss the possibilities of using generative Gaussian Mixture Models (GMMs), generally used in ASR systems, to model distinctive apnoea voice characteristics (i.e. abnormal nasalization). Finally, we present experimental findings regarding the discriminative power of speaker recognition techniques applied to severe apnoea detection. We have achieved an 81.25 % correct classification rate, which is very promising and underpins the interest in this line of inquiry.
Automatic anatomy recognition in whole-body PET/CT images
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Huiqian; Udupa, Jayaram K., E-mail: jay@mail.med.upenn.edu; Odhner, Dewey
Purpose: Whole-body positron emission tomography/computed tomography (PET/CT) has become a standard method of imaging patients with various disease conditions, especially cancer. Body-wide accurate quantification of disease burden in PET/CT images is important for characterizing lesions, staging disease, prognosticating patient outcome, planning treatment, and evaluating disease response to therapeutic interventions. However, body-wide anatomy recognition in PET/CT is a critical first step for accurately and automatically quantifying disease body-wide, body-region-wise, and organwise. This latter process, however, has remained a challenge due to the lower quality of the anatomic information portrayed in the CT component of this imaging modality and the paucity ofmore » anatomic details in the PET component. In this paper, the authors demonstrate the adaptation of a recently developed automatic anatomy recognition (AAR) methodology [Udupa et al., “Body-wide hierarchical fuzzy modeling, recognition, and delineation of anatomy in medical images,” Med. Image Anal. 18, 752–771 (2014)] to PET/CT images. Their goal was to test what level of object localization accuracy can be achieved on PET/CT compared to that achieved on diagnostic CT images. Methods: The authors advance the AAR approach in this work in three fronts: (i) from body-region-wise treatment in the work of Udupa et al. to whole body; (ii) from the use of image intensity in optimal object recognition in the work of Udupa et al. to intensity plus object-specific texture properties, and (iii) from the intramodality model-building-recognition strategy to the intermodality approach. The whole-body approach allows consideration of relationships among objects in different body regions, which was previously not possible. Consideration of object texture allows generalizing the previous optimal threshold-based fuzzy model recognition method from intensity images to any derived fuzzy membership image, and in the process, to bring performance to the level achieved on diagnostic CT and MR images in body-region-wise approaches. The intermodality approach fosters the use of already existing fuzzy models, previously created from diagnostic CT images, on PET/CT and other derived images, thus truly separating the modality-independent object assembly anatomy from modality-specific tissue property portrayal in the image. Results: Key ways of combining the above three basic ideas lead them to 15 different strategies for recognizing objects in PET/CT images. Utilizing 50 diagnostic CT image data sets from the thoracic and abdominal body regions and 16 whole-body PET/CT image data sets, the authors compare the recognition performance among these 15 strategies on 18 objects from the thorax, abdomen, and pelvis in object localization error and size estimation error. Particularly on texture membership images, object localization is within three voxels on whole-body low-dose CT images and 2 voxels on body-region-wise low-dose images of known true locations. Surprisingly, even on direct body-region-wise PET images, localization error within 3 voxels seems possible. Conclusions: The previous body-region-wise approach can be extended to whole-body torso with similar object localization performance. Combined use of image texture and intensity property yields the best object localization accuracy. In both body-region-wise and whole-body approaches, recognition performance on low-dose CT images reaches levels previously achieved on diagnostic CT images. The best object recognition strategy varies among objects; the proposed framework however allows employing a strategy that is optimal for each object.« less
Histogram equalization with Bayesian estimation for noise robust speech recognition.
Suh, Youngjoo; Kim, Hoirin
2018-02-01
The histogram equalization approach is an efficient feature normalization technique for noise robust automatic speech recognition. However, it suffers from performance degradation when some fundamental conditions are not satisfied in the test environment. To remedy these limitations of the original histogram equalization methods, class-based histogram equalization approach has been proposed. Although this approach showed substantial performance improvement under noise environments, it still suffers from performance degradation due to the overfitting problem when test data are insufficient. To address this issue, the proposed histogram equalization technique employs the Bayesian estimation method in the test cumulative distribution function estimation. It was reported in a previous study conducted on the Aurora-4 task that the proposed approach provided substantial performance gains in speech recognition systems based on the acoustic modeling of the Gaussian mixture model-hidden Markov model. In this work, the proposed approach was examined in speech recognition systems with deep neural network-hidden Markov model (DNN-HMM), the current mainstream speech recognition approach where it also showed meaningful performance improvement over the conventional maximum likelihood estimation-based method. The fusion of the proposed features with the mel-frequency cepstral coefficients provided additional performance gains in DNN-HMM systems, which otherwise suffer from performance degradation in the clean test condition.
Automatic recognition of ship types from infrared images using superstructure moment invariants
NASA Astrophysics Data System (ADS)
Li, Heng; Wang, Xinyu
2007-11-01
Automatic object recognition is an active area of interest for military and commercial applications. In this paper, a system addressing autonomous recognition of ship types in infrared images is proposed. Firstly, an approach of segmentation based on detection of salient features of the target with subsequent shadow removing is proposed, as is the base of the subsequent object recognition. Considering the differences between the shapes of various ships mainly lie in their superstructures, we then use superstructure moment functions invariant to translation, rotation and scale differences in input patterns and develop a robust algorithm of obtaining ship superstructure. Subsequently a back-propagation neural network is used as a classifier in the recognition stage and projection images of simulated three-dimensional ship models are used as the training sets. Our recognition model was implemented and experimentally validated using both simulated three-dimensional ship model images and real images derived from video of an AN/AAS-44V Forward Looking Infrared(FLIR) sensor.
A novel automatic method for monitoring Tourette motor tics through a wearable device.
Bernabei, Michel; Preatoni, Ezio; Mendez, Martin; Piccini, Luca; Porta, Mauro; Andreoni, Giuseppe
2010-09-15
The aim of this study was to propose a novel automatic method for quantifying motor-tics caused by the Tourette Syndrome (TS). In this preliminary report, the feasibility of the monitoring process was tested over a series of standard clinical trials in a population of 12 subjects affected by TS. A wearable instrument with an embedded three-axial accelerometer was used to detect and classify motor tics during standing and walking activities. An algorithm was devised to analyze acceleration data by: eliminating noise; detecting peaks connected to pathological events; and classifying intensity and frequency of motor tics into quantitative scores. These indexes were compared with the video-based ones provided by expert clinicians, which were taken as the gold-standard. Sensitivity, specificity, and accuracy of tic detection were estimated, and an agreement analysis was performed through the least square regression and the Bland-Altman test. The tic recognition algorithm showed sensitivity = 80.8% ± 8.5% (mean ± SD), specificity = 75.8% ± 17.3%, and accuracy = 80.5% ± 12.2%. The agreement study showed that automatic detection tended to overestimate the number of tics occurred. Although, it appeared this may be a systematic error due to the different recognition principles of the wearable and video-based systems. Furthermore, there was substantial concurrency with the gold-standard in estimating the severity indexes. The proposed methodology gave promising performances in terms of automatic motor-tics detection and classification in a standard clinical context. The system may provide physicians with a quantitative aid for TS assessment. Further developments will focus on the extension of its application to everyday long-term monitoring out of clinical environments. © 2010 Movement Disorder Society.
Street curb recognition in 3d point cloud data using morphological operations
NASA Astrophysics Data System (ADS)
Rodríguez-Cuenca, Borja; Concepción Alonso-Rodríguez, María; García-Cortés, Silverio; Ordóñez, Celestino
2015-04-01
Accurate and automatic detection of cartographic-entities saves a great deal of time and money when creating and updating cartographic databases. The current trend in remote sensing feature extraction is to develop methods that are as automatic as possible. The aim is to develop algorithms that can obtain accurate results with the least possible human intervention in the process. Non-manual curb detection is an important issue in road maintenance, 3D urban modeling, and autonomous navigation fields. This paper is focused on the semi-automatic recognition of curbs and street boundaries using a 3D point cloud registered by a mobile laser scanner (MLS) system. This work is divided into four steps. First, a coordinate system transformation is carried out, moving from a global coordinate system to a local one. After that and in order to simplify the calculations involved in the procedure, a rasterization based on the projection of the measured point cloud on the XY plane was carried out, passing from the 3D original data to a 2D image. To determine the location of curbs in the image, different image processing techniques such as thresholding and morphological operations were applied. Finally, the upper and lower edges of curbs are detected by an unsupervised classification algorithm on the curvature and roughness of the points that represent curbs. The proposed method is valid in both straight and curved road sections and applicable both to laser scanner and stereo vision 3D data due to the independence of its scanning geometry. This method has been successfully tested with two datasets measured by different sensors. The first dataset corresponds to a point cloud measured by a TOPCON sensor in the Spanish town of Cudillero. That point cloud comprises more than 6,000,000 points and covers a 400-meter street. The second dataset corresponds to a point cloud measured by a RIEGL sensor in the Austrian town of Horn. That point cloud comprises 8,000,000 points and represents a 160-meter street. The proposed method provides success rates in curb recognition of over 85% in both datasets.
Understanding Cognitive Development: Automaticity and the Early Years Child
ERIC Educational Resources Information Center
Gray, Colette
2004-01-01
In recent years a growing body of evidence has implicated deficits in the automaticity of fundamental facts such as word and number recognition in a range of disorders: including attention deficit hyperactivity disorder, dyslexia, apraxia and autism. Variously described as habits, fluency, chunking and over learning, automatic processes are best…
Automatic Surveying For Hazard Prevention On Glacier De GiÉtro, Switzerland
NASA Astrophysics Data System (ADS)
Bauder, A.; Funk, M.; Bösch, H.
Breaking off of large ice masses from the steep tongue of Glacier de Giétro may endanger a nearby reservoir. Such a falling ice mass could cause an oversplash over the dam at timeof a nearly filled lake. For this reason the glacier has been monitored intensively since the 1960's. An automatic theodolite was installed three years ago. It allows continuous displacement measurements of several targets on the glacier in order to detect short-term acceleration events. The installation includes a telemetric data transmission, which provides for immediate recognition of hazardous situations and early alarming. The obtained data were analysed in terms of precision and performance of the applied method. A high temporal resolution was gained. The comparison with traditional ob- servations shows clearly the potential of modern instruments to improve monitoring schems. We summarize the main results of this study and discuss the applicability of a modern motorized theodolite with target tracking and recognition ability for moni- toring purposes.
Kauppi, Jukka-Pekka; Martikainen, Kalle; Ruotsalainen, Ulla
2010-12-01
The central purpose of passive signal intercept receivers is to perform automatic categorization of unknown radar signals. Currently, there is an urgent need to develop intelligent classification algorithms for these devices due to emerging complexity of radar waveforms. Especially multifunction radars (MFRs) capable of performing several simultaneous tasks by utilizing complex, dynamically varying scheduled waveforms are a major challenge for automatic pattern classification systems. To assist recognition of complex radar emissions in modern intercept receivers, we have developed a novel method to recognize dynamically varying pulse repetition interval (PRI) modulation patterns emitted by MFRs. We use robust feature extraction and classifier design techniques to assist recognition in unpredictable real-world signal environments. We classify received pulse trains hierarchically which allows unambiguous detection of the subpatterns using a sliding window. Accuracy, robustness and reliability of the technique are demonstrated with extensive simulations using both static and dynamically varying PRI modulation patterns. Copyright © 2010 Elsevier Ltd. All rights reserved.
Automatic recognition and analysis of synapses. [in brain tissue
NASA Technical Reports Server (NTRS)
Ungerleider, J. A.; Ledley, R. S.; Bloom, F. E.
1976-01-01
An automatic system for recognizing synaptic junctions would allow analysis of large samples of tissue for the possible classification of specific well-defined sets of synapses based upon structural morphometric indices. In this paper the three steps of our system are described: (1) cytochemical tissue preparation to allow easy recognition of the synaptic junctions; (2) transmitting the tissue information to a computer; and (3) analyzing each field to recognize the synapses and make measurements on them.
Health smart home for elders - a tool for automatic recognition of activities of daily living.
Le, Xuan Hoa Binh; Di Mascolo, Maria; Gouin, Alexia; Noury, Norbert
2008-01-01
Elders live preferently in their own home, but with aging comes the loss of autonomy and associated risks. In order to help them live longer in safe conditions, we need a tool to automatically detect their loss of autonomy by assessing the degree of performance of activities of daily living. This article presents an approach enabling the activities recognition of an elder living alone in a home equipped with noninvasive sensors.
Unification of automatic target tracking and automatic target recognition
NASA Astrophysics Data System (ADS)
Schachter, Bruce J.
2014-06-01
The subject being addressed is how an automatic target tracker (ATT) and an automatic target recognizer (ATR) can be fused together so tightly and so well that their distinctiveness becomes lost in the merger. This has historically not been the case outside of biology and a few academic papers. The biological model of ATT∪ATR arises from dynamic patterns of activity distributed across many neural circuits and structures (including retina). The information that the brain receives from the eyes is "old news" at the time that it receives it. The eyes and brain forecast a tracked object's future position, rather than relying on received retinal position. Anticipation of the next moment - building up a consistent perception - is accomplished under difficult conditions: motion (eyes, head, body, scene background, target) and processing limitations (neural noise, delays, eye jitter, distractions). Not only does the human vision system surmount these problems, but it has innate mechanisms to exploit motion in support of target detection and classification. Biological vision doesn't normally operate on snapshots. Feature extraction, detection and recognition are spatiotemporal. When vision is viewed as a spatiotemporal process, target detection, recognition, tracking, event detection and activity recognition, do not seem as distinct as they are in current ATT and ATR designs. They appear as similar mechanism taking place at varying time scales. A framework is provided for unifying ATT and ATR.
Face averages enhance user recognition for smartphone security.
Robertson, David J; Kramer, Robin S S; Burton, A Mike
2015-01-01
Our recognition of familiar faces is excellent, and generalises across viewing conditions. However, unfamiliar face recognition is much poorer. For this reason, automatic face recognition systems might benefit from incorporating the advantages of familiarity. Here we put this to the test using the face verification system available on a popular smartphone (the Samsung Galaxy). In two experiments we tested the recognition performance of the smartphone when it was encoded with an individual's 'face-average'--a representation derived from theories of human face perception. This technique significantly improved performance for both unconstrained celebrity images (Experiment 1) and for real faces (Experiment 2): users could unlock their phones more reliably when the device stored an average of the user's face than when they stored a single image. This advantage was consistent across a wide variety of everyday viewing conditions. Furthermore, the benefit did not reduce the rejection of imposter faces. This benefit is brought about solely by consideration of suitable representations for automatic face recognition, and we argue that this is just as important as development of matching algorithms themselves. We propose that this representation could significantly improve recognition rates in everyday settings.
Dynamic gesture recognition using neural networks: a fundament for advanced interaction construction
NASA Astrophysics Data System (ADS)
Boehm, Klaus; Broll, Wolfgang; Sokolewicz, Michael A.
1994-04-01
Interaction in virtual reality environments is still a challenging task. Static hand posture recognition is currently the most common and widely used method for interaction using glove input devices. In order to improve the naturalness of interaction, and thereby decrease the user-interface learning time, there is a need to be able to recognize dynamic gestures. In this paper we describe our approach to overcoming the difficulties of dynamic gesture recognition (DGR) using neural networks. Backpropagation neural networks have already proven themselves to be appropriate and efficient for posture recognition. However, the extensive amount of data involved in DGR requires a different approach. Because of features such as topology preservation and automatic-learning, Kohonen Feature Maps are particularly suitable for the reduction of the high dimensional data space that is the result of a dynamic gesture, and are thus implemented for this task.
Gimli: open source and high-performance biomedical name recognition
2013-01-01
Background Automatic recognition of biomedical names is an essential task in biomedical information extraction, presenting several complex and unsolved challenges. In recent years, various solutions have been implemented to tackle this problem. However, limitations regarding system characteristics, customization and usability still hinder their wider application outside text mining research. Results We present Gimli, an open-source, state-of-the-art tool for automatic recognition of biomedical names. Gimli includes an extended set of implemented and user-selectable features, such as orthographic, morphological, linguistic-based, conjunctions and dictionary-based. A simple and fast method to combine different trained models is also provided. Gimli achieves an F-measure of 87.17% on GENETAG and 72.23% on JNLPBA corpus, significantly outperforming existing open-source solutions. Conclusions Gimli is an off-the-shelf, ready to use tool for named-entity recognition, providing trained and optimized models for recognition of biomedical entities from scientific text. It can be used as a command line tool, offering full functionality, including training of new models and customization of the feature set and model parameters through a configuration file. Advanced users can integrate Gimli in their text mining workflows through the provided library, and extend or adapt its functionalities. Based on the underlying system characteristics and functionality, both for final users and developers, and on the reported performance results, we believe that Gimli is a state-of-the-art solution for biomedical NER, contributing to faster and better research in the field. Gimli is freely available at http://bioinformatics.ua.pt/gimli. PMID:23413997
NASA Astrophysics Data System (ADS)
Gloger, Oliver; Tönnies, Klaus; Mensel, Birger; Völzke, Henry
2015-11-01
In epidemiological studies as well as in clinical practice the amount of produced medical image data strongly increased in the last decade. In this context organ segmentation in MR volume data gained increasing attention for medical applications. Especially in large-scale population-based studies organ volumetry is highly relevant requiring exact organ segmentation. Since manual segmentation is time-consuming and prone to reader variability, large-scale studies need automatized methods to perform organ segmentation. Fully automatic organ segmentation in native MR image data has proven to be a very challenging task. Imaging artifacts as well as inter- and intrasubject MR-intensity differences complicate the application of supervised learning strategies. Thus, we propose a modularized framework of a two-stepped probabilistic approach that generates subject-specific probability maps for renal parenchyma tissue, which are refined subsequently by using several, extended segmentation strategies. We present a three class-based support vector machine recognition system that incorporates Fourier descriptors as shape features to recognize and segment characteristic parenchyma parts. Probabilistic methods use the segmented characteristic parenchyma parts to generate high quality subject-specific parenchyma probability maps. Several refinement strategies including a final shape-based 3D level set segmentation technique are used in subsequent processing modules to segment renal parenchyma. Furthermore, our framework recognizes and excludes renal cysts from parenchymal volume, which is important to analyze renal functions. Volume errors and Dice coefficients show that our presented framework outperforms existing approaches.
Gloger, Oliver; Tönnies, Klaus; Mensel, Birger; Völzke, Henry
2015-11-21
In epidemiological studies as well as in clinical practice the amount of produced medical image data strongly increased in the last decade. In this context organ segmentation in MR volume data gained increasing attention for medical applications. Especially in large-scale population-based studies organ volumetry is highly relevant requiring exact organ segmentation. Since manual segmentation is time-consuming and prone to reader variability, large-scale studies need automatized methods to perform organ segmentation. Fully automatic organ segmentation in native MR image data has proven to be a very challenging task. Imaging artifacts as well as inter- and intrasubject MR-intensity differences complicate the application of supervised learning strategies. Thus, we propose a modularized framework of a two-stepped probabilistic approach that generates subject-specific probability maps for renal parenchyma tissue, which are refined subsequently by using several, extended segmentation strategies. We present a three class-based support vector machine recognition system that incorporates Fourier descriptors as shape features to recognize and segment characteristic parenchyma parts. Probabilistic methods use the segmented characteristic parenchyma parts to generate high quality subject-specific parenchyma probability maps. Several refinement strategies including a final shape-based 3D level set segmentation technique are used in subsequent processing modules to segment renal parenchyma. Furthermore, our framework recognizes and excludes renal cysts from parenchymal volume, which is important to analyze renal functions. Volume errors and Dice coefficients show that our presented framework outperforms existing approaches.
Generation, recognition, and consistent fusion of partial boundary representations from range images
NASA Astrophysics Data System (ADS)
Kohlhepp, Peter; Hanczak, Andrzej M.; Li, Gang
1994-10-01
This paper presents SOMBRERO, a new system for recognizing and locating 3D, rigid, non- moving objects from range data. The objects may be polyhedral or curved, partially occluding, touching or lying flush with each other. For data collection, we employ 2D time- of-flight laser scanners mounted to a moving gantry robot. By combining sensor and robot coordinates, we obtain 3D cartesian coordinates. Boundary representations (Brep's) provide view independent geometry models that are both efficiently recognizable and derivable automatically from sensor data. SOMBRERO's methods for generating, matching and fusing Brep's are highly synergetic. A split-and-merge segmentation algorithm with dynamic triangular builds a partial (21/2D) Brep from scattered data. The recognition module matches this scene description with a model database and outputs recognized objects, their positions and orientations, and possibly surfaces corresponding to unknown objects. We present preliminary results in scene segmentation and recognition. Partial Brep's corresponding to different range sensors or viewpoints can be merged into a consistent, complete and irredundant 3D object or scene model. This fusion algorithm itself uses the recognition and segmentation methods.
NASA Astrophysics Data System (ADS)
Yan, Yue
2018-03-01
A synthetic aperture radar (SAR) automatic target recognition (ATR) method based on the convolutional neural networks (CNN) trained by augmented training samples is proposed. To enhance the robustness of CNN to various extended operating conditions (EOCs), the original training images are used to generate the noisy samples at different signal-to-noise ratios (SNRs), multiresolution representations, and partially occluded images. Then, the generated images together with the original ones are used to train a designed CNN for target recognition. The augmented training samples can contrapuntally improve the robustness of the trained CNN to the covered EOCs, i.e., the noise corruption, resolution variance, and partial occlusion. Moreover, the significantly larger training set effectively enhances the representation capability for other conditions, e.g., the standard operating condition (SOC), as well as the stability of the network. Therefore, better performance can be achieved by the proposed method for SAR ATR. For experimental evaluation, extensive experiments are conducted on the Moving and Stationary Target Acquisition and Recognition dataset under SOC and several typical EOCs.
Özdemir, Merve Erkınay; Telatar, Ziya; Eroğul, Osman; Tunca, Yusuf
2018-05-01
Dysmorphic syndromes have different facial malformations. These malformations are significant to an early diagnosis of dysmorphic syndromes and contain distinctive information for face recognition. In this study we define the certain features of each syndrome by considering facial malformations and classify Fragile X, Hurler, Prader Willi, Down, Wolf Hirschhorn syndromes and healthy groups automatically. The reference points are marked on the face images and ratios between the points' distances are taken into consideration as features. We suggest a neural network based hierarchical decision tree structure in order to classify the syndrome types. We also implement k-nearest neighbor (k-NN) and artificial neural network (ANN) classifiers to compare classification accuracy with our hierarchical decision tree. The classification accuracy is 50, 73 and 86.7% with k-NN, ANN and hierarchical decision tree methods, respectively. Then, the same images are shown to a clinical expert who achieve a recognition rate of 46.7%. We develop an efficient system to recognize different syndrome types automatically in a simple, non-invasive imaging data, which is independent from the patient's age, sex and race at high accuracy. The promising results indicate that our method can be used for pre-diagnosis of the dysmorphic syndromes by clinical experts.
ERIC Educational Resources Information Center
Frye, Elizabeth M.; Gosky, Ross
2012-01-01
The present study investigated the relationship between rapid recognition of individual words (Word Recognition Test) and two measures of contextual reading: (1) grade-level Passage Reading Test (IRI passage) and (2) performance on standardized STAR Reading Test. To establish if time of presentation on the word recognition test was a factor in…
New technique for real-time distortion-invariant multiobject recognition and classification
NASA Astrophysics Data System (ADS)
Hong, Rutong; Li, Xiaoshun; Hong, En; Wang, Zuyi; Wei, Hongan
2001-04-01
A real-time hybrid distortion-invariant OPR system was established to make 3D multiobject distortion-invariant automatic pattern recognition. Wavelet transform technique was used to make digital preprocessing of the input scene, to depress the noisy background and enhance the recognized object. A three-layer backpropagation artificial neural network was used in correlation signal post-processing to perform multiobject distortion-invariant recognition and classification. The C-80 and NOA real-time processing ability and the multithread programming technology were used to perform high speed parallel multitask processing and speed up the post processing rate to ROIs. The reference filter library was constructed for the distortion version of 3D object model images based on the distortion parameter tolerance measuring as rotation, azimuth and scale. The real-time optical correlation recognition testing of this OPR system demonstrates that using the preprocessing, post- processing, the nonlinear algorithm os optimum filtering, RFL construction technique and the multithread programming technology, a high possibility of recognition and recognition rate ere obtained for the real-time multiobject distortion-invariant OPR system. The recognition reliability and rate was improved greatly. These techniques are very useful to automatic target recognition.
Voice reaction times with recognition for Commodore computers
NASA Technical Reports Server (NTRS)
Washburn, David A.; Putney, R. Thompson
1990-01-01
Hardware and software modifications are presented that allow for collection and recognition by a Commodore computer of spoken responses. Responses are timed with millisecond accuracy and automatically analyzed and scored. Accuracy data for this device from several experiments are presented. Potential applications and suggestions for improving recognition accuracy are also discussed.
Automatic Intention Recognition in Conversation Processing
ERIC Educational Resources Information Center
Holtgraves, Thomas
2008-01-01
A fundamental assumption of many theories of conversation is that comprehension of a speaker's utterance involves recognition of the speaker's intention in producing that remark. However, the nature of intention recognition is not clear. One approach is to conceptualize a speaker's intention in terms of speech acts [Searle, J. (1969). "Speech…
Seamless Tracing of Human Behavior Using Complementary Wearable and House-Embedded Sensors
Augustyniak, Piotr; Smoleń, Magdalena; Mikrut, Zbigniew; Kańtoch, Eliasz
2014-01-01
This paper presents a multimodal system for seamless surveillance of elderly people in their living environment. The system uses simultaneously a wearable sensor network for each individual and premise-embedded sensors specific for each environment. The paper demonstrates the benefits of using complementary information from two types of mobility sensors: visual flow-based image analysis and an accelerometer-based wearable network. The paper provides results for indoor recognition of several elementary poses and outdoor recognition of complex movements. Instead of complete system description, particular attention was drawn to a polar histogram-based method of visual pose recognition, complementary use and synchronization of the data from wearable and premise-embedded networks and an automatic danger detection algorithm driven by two premise- and subject-related databases. The novelty of our approach also consists in feeding the databases with real-life recordings from the subject, and in using the dynamic time-warping algorithm for measurements of distance between actions represented as elementary poses in behavioral records. The main results of testing our method include: 95.5% accuracy of elementary pose recognition by the video system, 96.7% accuracy of elementary pose recognition by the accelerometer-based system, 98.9% accuracy of elementary pose recognition by the combined accelerometer and video-based system, and 80% accuracy of complex outdoor activity recognition by the accelerometer-based wearable system. PMID:24787640
NASA Astrophysics Data System (ADS)
Kayasith, Prakasith; Theeramunkong, Thanaruk
It is a tedious and subjective task to measure severity of a dysarthria by manually evaluating his/her speech using available standard assessment methods based on human perception. This paper presents an automated approach to assess speech quality of a dysarthric speaker with cerebral palsy. With the consideration of two complementary factors, speech consistency and speech distinction, a speech quality indicator called speech clarity index (Ψ) is proposed as a measure of the speaker's ability to produce consistent speech signal for a certain word and distinguished speech signal for different words. As an application, it can be used to assess speech quality and forecast speech recognition rate of speech made by an individual dysarthric speaker before actual exhaustive implementation of an automatic speech recognition system for the speaker. The effectiveness of Ψ as a speech recognition rate predictor is evaluated by rank-order inconsistency, correlation coefficient, and root-mean-square of difference. The evaluations had been done by comparing its predicted recognition rates with ones predicted by the standard methods called the articulatory and intelligibility tests based on the two recognition systems (HMM and ANN). The results show that Ψ is a promising indicator for predicting recognition rate of dysarthric speech. All experiments had been done on speech corpus composed of speech data from eight normal speakers and eight dysarthric speakers.
Zhang, Yifan; Gao, Xunzhang; Peng, Xuan; Ye, Jiaqi; Li, Xiang
2018-05-16
The High Resolution Range Profile (HRRP) recognition has attracted great concern in the field of Radar Automatic Target Recognition (RATR). However, traditional HRRP recognition methods failed to model high dimensional sequential data efficiently and have a poor anti-noise ability. To deal with these problems, a novel stochastic neural network model named Attention-based Recurrent Temporal Restricted Boltzmann Machine (ARTRBM) is proposed in this paper. RTRBM is utilized to extract discriminative features and the attention mechanism is adopted to select major features. RTRBM is efficient to model high dimensional HRRP sequences because it can extract the information of temporal and spatial correlation between adjacent HRRPs. The attention mechanism is used in sequential data recognition tasks including machine translation and relation classification, which makes the model pay more attention to the major features of recognition. Therefore, the combination of RTRBM and the attention mechanism makes our model effective for extracting more internal related features and choose the important parts of the extracted features. Additionally, the model performs well with the noise corrupted HRRP data. Experimental results on the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset show that our proposed model outperforms other traditional methods, which indicates that ARTRBM extracts, selects, and utilizes the correlation information between adjacent HRRPs effectively and is suitable for high dimensional data or noise corrupted data.
Looking inside the Ocean: Toward an Autonomous Imaging System for Monitoring Gelatinous Zooplankton
Corgnati, Lorenzo; Marini, Simone; Mazzei, Luca; Ottaviani, Ennio; Aliani, Stefano; Conversi, Alessandra; Griffa, Annalisa
2016-01-01
Marine plankton abundance and dynamics in the open and interior ocean is still an unknown field. The knowledge of gelatinous zooplankton distribution is especially challenging, because this type of plankton has a very fragile structure and cannot be directly sampled using traditional net based techniques. To overcome this shortcoming, Computer Vision techniques can be successfully used for the automatic monitoring of this group.This paper presents the GUARD1 imaging system, a low-cost stand-alone instrument for underwater image acquisition and recognition of gelatinous zooplankton, and discusses the performance of three different methodologies, Tikhonov Regularization, Support Vector Machines and Genetic Programming, that have been compared in order to select the one to be run onboard the system for the automatic recognition of gelatinous zooplankton. The performance comparison results highlight the high accuracy of the three methods in gelatinous zooplankton identification, showing their good capability in robustly selecting relevant features. In particular, Genetic Programming technique achieves the same performances of the other two methods by using a smaller set of features, thus being the most efficient in avoiding computationally consuming preprocessing stages, that is a crucial requirement for running on an autonomous imaging system designed for long lasting deployments, like the GUARD1. The Genetic Programming algorithm has been installed onboard the system, that has been operationally tested in a two-months survey in the Ligurian Sea, providing satisfactory results in terms of monitoring and recognition performances. PMID:27983638
Hidden Markov models in automatic speech recognition
NASA Astrophysics Data System (ADS)
Wrzoskowicz, Adam
1993-11-01
This article describes a method for constructing an automatic speech recognition system based on hidden Markov models (HMMs). The author discusses the basic concepts of HMM theory and the application of these models to the analysis and recognition of speech signals. The author provides algorithms which make it possible to train the ASR system and recognize signals on the basis of distinct stochastic models of selected speech sound classes. The author describes the specific components of the system and the procedures used to model and recognize speech. The author discusses problems associated with the choice of optimal signal detection and parameterization characteristics and their effect on the performance of the system. The author presents different options for the choice of speech signal segments and their consequences for the ASR process. The author gives special attention to the use of lexical, syntactic, and semantic information for the purpose of improving the quality and efficiency of the system. The author also describes an ASR system developed by the Speech Acoustics Laboratory of the IBPT PAS. The author discusses the results of experiments on the effect of noise on the performance of the ASR system and describes methods of constructing HMM's designed to operate in a noisy environment. The author also describes a language for human-robot communications which was defined as a complex multilevel network from an HMM model of speech sounds geared towards Polish inflections. The author also added mandatory lexical and syntactic rules to the system for its communications vocabulary.
Scene Analysis: Non-Linear Spatial Filtering for Automatic Target Detection.
1982-12-01
In this thesis, a method for two-dimensional pattern recognition was developed and tested. The method included a global search scheme for candidate...test global switch TYPEO Creating negative video file only.W 11=0 12=256 13=512 14=768 GO 70 2 1 TYPE" Creating negative and horizontally flipped video...purpose was to develop a base of image processing software for the AFIT Digital Signal Processing Laboratory NOVA- ECLIPSE minicomputer system, for
Data handling and analysis for the 1971 corn blight watch experiment.
NASA Technical Reports Server (NTRS)
Anuta, P. E.; Phillips, T. L.; Landgrebe, D. A.
1972-01-01
Review of the data handling and analysis methods used in the near-operational test of remote sensing systems provided by the 1971 corn blight watch experiment. The general data analysis techniques and, particularly, the statistical multispectral pattern recognition methods for automatic computer analysis of aircraft scanner data are described. Some of the results obtained are examined, and the implications of the experiment for future data communication requirements of earth resource survey systems are discussed.
Haderlein, Tino; Döllinger, Michael; Matoušek, Václav; Nöth, Elmar
2016-10-01
Automatic voice assessment is often performed using sustained vowels. In contrast, speech analysis of read-out texts can be applied to voice and speech assessment. Automatic speech recognition and prosodic analysis were used to find regression formulae between automatic and perceptual assessment of four voice and four speech criteria. The regression was trained with 21 men and 62 women (average age 49.2 years) and tested with another set of 24 men and 49 women (48.3 years), all suffering from chronic hoarseness. They read the text 'Der Nordwind und die Sonne' ('The North Wind and the Sun'). Five voice and speech therapists evaluated the data on 5-point Likert scales. Ten prosodic and recognition accuracy measures (features) were identified which describe all the examined criteria. Inter-rater correlation within the expert group was between r = 0.63 for the criterion 'match of breath and sense units' and r = 0.87 for the overall voice quality. Human-machine correlation was between r = 0.40 for the match of breath and sense units and r = 0.82 for intelligibility. The perceptual ratings of different criteria were highly correlated with each other. Likewise, the feature sets modeling the criteria were very similar. The automatic method is suitable for assessing chronic hoarseness in general and for subgroups of functional and organic dysphonia. In its current version, it is almost as reliable as a randomly picked rater from a group of voice and speech therapists.
Recognition of surface lithologic and topographic patterns in southwest Colorado with ADP techniques
NASA Technical Reports Server (NTRS)
Melhorn, W. N.; Sinnock, S.
1973-01-01
Analysis of ERTS-1 multispectral data by automatic pattern recognition procedures is applicable toward grappling with current and future resource stresses by providing a means for refining existing geologic maps. The procedures used in the current analysis already yield encouraging results toward the eventual machine recognition of extensive surface lithologic and topographic patterns. Automatic mapping of a series of hogbacks, strike valleys, and alluvial surfaces along the northwest flank of the San Juan Basin in Colorado can be obtained by minimal man-machine interaction. The determination of causes for separable spectral signatures is dependent upon extensive correlation of micro- and macro field based ground truth observations and aircraft underflight data with the satellite data.
Implicit Shape Models for Object Detection in 3d Point Clouds
NASA Astrophysics Data System (ADS)
Velizhev, A.; Shapovalov, R.; Schindler, K.
2012-07-01
We present a method for automatic object localization and recognition in 3D point clouds representing outdoor urban scenes. The method is based on the implicit shape models (ISM) framework, which recognizes objects by voting for their center locations. It requires only few training examples per class, which is an important property for practical use. We also introduce and evaluate an improved version of the spin image descriptor, more robust to point density variation and uncertainty in normal direction estimation. Our experiments reveal a significant impact of these modifications on the recognition performance. We compare our results against the state-of-the-art method and get significant improvement in both precision and recall on the Ohio dataset, consisting of combined aerial and terrestrial LiDAR scans of 150,000 m2 of urban area in total.
A new license plate extraction framework based on fast mean shift
NASA Astrophysics Data System (ADS)
Pan, Luning; Li, Shuguang
2010-08-01
License plate extraction is considered to be the most crucial step of Automatic license plate recognition (ALPR) system. In this paper, a region-based license plate hybrid detection method is proposed to solve practical problems under complex background in which existing large quantity of disturbing information. In this method, coarse license plate location is carried out firstly to get the head part of a vehicle. Then a new Fast Mean Shift method based on random sampling of Kernel Density Estimate (KDE) is adopted to segment the color vehicle images, in order to get candidate license plate regions. The remarkable speed-up it brings makes Mean Shift segmentation more suitable for this application. Feature extraction and classification is used to accurately separate license plate from other candidate regions. At last, tilted license plate regulation is used for future recognition steps.
Deep learning approach to bacterial colony classification.
Zieliński, Bartosz; Plichta, Anna; Misztal, Krzysztof; Spurek, Przemysław; Brzychczy-Włoch, Monika; Ochońska, Dorota
2017-01-01
In microbiology it is diagnostically useful to recognize various genera and species of bacteria. It can be achieved using computer-aided methods, which make the recognition processes more automatic and thus significantly reduce the time necessary for the classification. Moreover, in case of diagnostic uncertainty (the misleading similarity in shape or structure of bacterial cells), such methods can minimize the risk of incorrect recognition. In this article, we apply the state of the art method for texture analysis to classify genera and species of bacteria. This method uses deep Convolutional Neural Networks to obtain image descriptors, which are then encoded and classified with Support Vector Machine or Random Forest. To evaluate this approach and to make it comparable with other approaches, we provide a new dataset of images. DIBaS dataset (Digital Image of Bacterial Species) contains 660 images with 33 different genera and species of bacteria.
Anatomical entity mention recognition at literature scale
Pyysalo, Sampo; Ananiadou, Sophia
2014-01-01
Motivation: Anatomical entities ranging from subcellular structures to organ systems are central to biomedical science, and mentions of these entities are essential to understanding the scientific literature. Despite extensive efforts to automatically analyze various aspects of biomedical text, there have been only few studies focusing on anatomical entities, and no dedicated methods for learning to automatically recognize anatomical entity mentions in free-form text have been introduced. Results: We present AnatomyTagger, a machine learning-based system for anatomical entity mention recognition. The system incorporates a broad array of approaches proposed to benefit tagging, including the use of Unified Medical Language System (UMLS)- and Open Biomedical Ontologies (OBO)-based lexical resources, word representations induced from unlabeled text, statistical truecasing and non-local features. We train and evaluate the system on a newly introduced corpus that substantially extends on previously available resources, and apply the resulting tagger to automatically annotate the entire open access scientific domain literature. The resulting analyses have been applied to extend services provided by the Europe PubMed Central literature database. Availability and implementation: All tools and resources introduced in this work are available from http://nactem.ac.uk/anatomytagger. Contact: sophia.ananiadou@manchester.ac.uk Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:24162468
Toth, Laszlo; Hoffmann, Ildiko; Gosztolya, Gabor; Vincze, Veronika; Szatloczki, Greta; Banreti, Zoltan; Pakaski, Magdolna; Kalman, Janos
2018-01-01
Even today the reliable diagnosis of the prodromal stages of Alzheimer's disease (AD) remains a great challenge. Our research focuses on the earliest detectable indicators of cognitive decline in mild cognitive impairment (MCI). Since the presence of language impairment has been reported even in the mild stage of AD, the aim of this study is to develop a sensitive neuropsychological screening method which is based on the analysis of spontaneous speech production during performing a memory task. In the future, this can form the basis of an Internet-based interactive screening software for the recognition of MCI. Participants were 38 healthy controls and 48 clinically diagnosed MCI patients. The provoked spontaneous speech by asking the patients to recall the content of 2 short black and white films (one direct, one delayed), and by answering one question. Acoustic parameters (hesitation ratio, speech tempo, length and number of silent and filled pauses, length of utterance) were extracted from the recorded speech signals, first manually (using the Praat software), and then automatically, with an automatic speech recognition (ASR) based tool. First, the extracted parameters were statistically analyzed. Then we applied machine learning algorithms to see whether the MCI and the control group can be discriminated automatically based on the acoustic features. The statistical analysis showed significant differences for most of the acoustic parameters (speech tempo, articulation rate, silent pause, hesitation ratio, length of utterance, pause-per-utterance ratio). The most significant differences between the two groups were found in the speech tempo in the delayed recall task, and in the number of pauses for the question-answering task. The fully automated version of the analysis process - that is, using the ASR-based features in combination with machine learning - was able to separate the two classes with an F1-score of 78.8%. The temporal analysis of spontaneous speech can be exploited in implementing a new, automatic detection-based tool for screening MCI for the community. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Automatic integration of social information in emotion recognition.
Mumenthaler, Christian; Sander, David
2015-04-01
This study investigated the automaticity of the influence of social inference on emotion recognition. Participants were asked to recognize dynamic facial expressions of emotion (fear or anger in Experiment 1 and blends of fear and surprise or of anger and disgust in Experiment 2) in a target face presented at the center of a screen while a subliminal contextual face appearing in the periphery expressed an emotion (fear or anger) or not (neutral) and either looked at the target face or not. Results of Experiment 1 revealed that recognition of the target emotion of fear was improved when a subliminal angry contextual face gazed toward-rather than away from-the fearful face. We replicated this effect in Experiment 2, in which facial expression blends of fear and surprise were more often and more rapidly categorized as expressing fear when the subliminal contextual face expressed anger and gazed toward-rather than away from-the target face. With the contextual face appearing for 30 ms in total, including only 10 ms of emotion expression, and being immediately masked, our data provide the first evidence that social influence on emotion recognition can occur automatically. (c) 2015 APA, all rights reserved).
Botti, F; Alexander, A; Drygajlo, A
2004-12-02
This paper deals with a procedure to compensate for mismatched recording conditions in forensic speaker recognition, using a statistical score normalization. Bayesian interpretation of the evidence in forensic automatic speaker recognition depends on three sets of recordings in order to perform forensic casework: reference (R) and control (C) recordings of the suspect, and a potential population database (P), as well as a questioned recording (QR) . The requirement of similar recording conditions between suspect control database (C) and the questioned recording (QR) is often not satisfied in real forensic cases. The aim of this paper is to investigate a procedure of normalization of scores, which is based on an adaptation of the Test-normalization (T-norm) [2] technique used in the speaker verification domain, to compensate for the mismatch. Polyphone IPSC-02 database and ASPIC (an automatic speaker recognition system developed by EPFL and IPS-UNIL in Lausanne, Switzerland) were used in order to test the normalization procedure. Experimental results for three different recording condition scenarios are presented using Tippett plots and the effect of the compensation on the evaluation of the strength of the evidence is discussed.
Automatic image database generation from CAD for 3D object recognition
NASA Astrophysics Data System (ADS)
Sardana, Harish K.; Daemi, Mohammad F.; Ibrahim, Mohammad K.
1993-06-01
The development and evaluation of Multiple-View 3-D object recognition systems is based on a large set of model images. Due to the various advantages of using CAD, it is becoming more and more practical to use existing CAD data in computer vision systems. Current PC- level CAD systems are capable of providing physical image modelling and rendering involving positional variations in cameras, light sources etc. We have formulated a modular scheme for automatic generation of various aspects (views) of the objects in a model based 3-D object recognition system. These views are generated at desired orientations on the unit Gaussian sphere. With a suitable network file sharing system (NFS), the images can directly be stored on a database located on a file server. This paper presents the image modelling solutions using CAD in relation to multiple-view approach. Our modular scheme for data conversion and automatic image database storage for such a system is discussed. We have used this approach in 3-D polyhedron recognition. An overview of the results, advantages and limitations of using CAD data and conclusions using such as scheme are also presented.
Testing Saliency Parameters for Automatic Target Recognition
NASA Technical Reports Server (NTRS)
Pandya, Sagar
2012-01-01
A bottom-up visual attention model (the saliency model) is tested to enhance the performance of Automated Target Recognition (ATR). JPL has developed an ATR system that identifies regions of interest (ROI) using a trained OT-MACH filter, and then classifies potential targets as true- or false-positives using machine-learning techniques. In this project, saliency is used as a pre-processing step to reduce the space for performing OT-MACH filtering. Saliency parameters, such as output level and orientation weight, are tuned to detect known target features. Preliminary results are promising and future work entails a rigrous and parameter-based search to gain maximum insight about this method.
Speaker emotion recognition: from classical classifiers to deep neural networks
NASA Astrophysics Data System (ADS)
Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri
2018-04-01
Speaker emotion recognition is considered among the most challenging tasks in recent years. In fact, automatic systems for security, medicine or education can be improved when considering the speech affective state. In this paper, a twofold approach for speech emotion classification is proposed. At the first side, a relevant set of features is adopted, and then at the second one, numerous supervised training techniques, involving classic methods as well as deep learning, are experimented. Experimental results indicate that deep architecture can improve classification performance on two affective databases, the Berlin Dataset of Emotional Speech and the SAVEE Dataset Surrey Audio-Visual Expressed Emotion.
Hantke, Simone; Weninger, Felix; Kurle, Richard; Ringeval, Fabien; Batliner, Anton; Mousa, Amr El-Desoky; Schuller, Björn
2016-01-01
We propose a new recognition task in the area of computational paralinguistics: automatic recognition of eating conditions in speech, i. e., whether people are eating while speaking, and what they are eating. To this end, we introduce the audio-visual iHEARu-EAT database featuring 1.6 k utterances of 30 subjects (mean age: 26.1 years, standard deviation: 2.66 years, gender balanced, German speakers), six types of food (Apple, Nectarine, Banana, Haribo Smurfs, Biscuit, and Crisps), and read as well as spontaneous speech, which is made publicly available for research purposes. We start with demonstrating that for automatic speech recognition (ASR), it pays off to know whether speakers are eating or not. We also propose automatic classification both by brute-forcing of low-level acoustic features as well as higher-level features related to intelligibility, obtained from an Automatic Speech Recogniser. Prediction of the eating condition was performed with a Support Vector Machine (SVM) classifier employed in a leave-one-speaker-out evaluation framework. Results show that the binary prediction of eating condition (i. e., eating or not eating) can be easily solved independently of the speaking condition; the obtained average recalls are all above 90%. Low-level acoustic features provide the best performance on spontaneous speech, which reaches up to 62.3% average recall for multi-way classification of the eating condition, i. e., discriminating the six types of food, as well as not eating. The early fusion of features related to intelligibility with the brute-forced acoustic feature set improves the performance on read speech, reaching a 66.4% average recall for the multi-way classification task. Analysing features and classifier errors leads to a suitable ordinal scale for eating conditions, on which automatic regression can be performed with up to 56.2% determination coefficient. PMID:27176486
Vignally, P; Fondi, G; Taggi, F; Pitidis, A
2011-03-31
In Italy the European Union Injury Database reports the involvement of chemical products in 0.9% of home and leisure accidents. The Emergency Department registry on domestic accidents in Italy and the Poison Control Centres record that 90% of cases of exposure to toxic substances occur in the home. It is not rare for the effects of chemical agents to be observed in hospitals, with a high potential risk of damage - the rate of this cause of hospital admission is double the domestic injury average. The aim of this study was to monitor the effects of injuries caused by caustic agents in Italy using automatic free-text recognition in Emergency Department medical databases. We created a Stata software program to automatically identify caustic or corrosive injury cases using an agent-specific list of keywords. We focused attention on the procedure's sensitivity and specificity. Ten hospitals in six regions of Italy participated in the study. The program identified 112 cases of injury by caustic or corrosive agents. Checking the cases by quality controls (based on manual reading of ED reports), we assessed 99 cases as true positive, i.e. 88.4% of the patients were automatically recognized by the software as being affected by caustic substances (99% CI: 80.6%- 96.2%), that is to say 0.59% (99% CI: 0.45%-0.76%) of the whole sample of home injuries, a value almost three times as high as that expected (p < 0.0001) from European codified information. False positives were 11.6% of the recognized cases (99% CI: 5.1%- 21.5%). Our automatic procedure for caustic agent identification proved to have excellent product recognition capacity with an acceptable level of excess sensitivity. Contrary to our a priori hypothesis, the automatic recognition system provided a level of identification of agents possessing caustic effects that was significantly much greater than was predictable on the basis of the values from current codifications reported in the European Database.
Key features for ATA / ATR database design in missile systems
NASA Astrophysics Data System (ADS)
Özertem, Kemal Arda
2017-05-01
Automatic target acquisition (ATA) and automatic target recognition (ATR) are two vital tasks for missile systems, and having a robust detection and recognition algorithm is crucial for overall system performance. In order to have a robust target detection and recognition algorithm, an extensive image database is required. Automatic target recognition algorithms use the database of images in training and testing steps of algorithm. This directly affects the recognition performance, since the training accuracy is driven by the quality of the image database. In addition, the performance of an automatic target detection algorithm can be measured effectively by using an image database. There are two main ways for designing an ATA / ATR database. The first and easy way is by using a scene generator. A scene generator can model the objects by considering its material information, the atmospheric conditions, detector type and the territory. Designing image database by using a scene generator is inexpensive and it allows creating many different scenarios quickly and easily. However the major drawback of using a scene generator is its low fidelity, since the images are created virtually. The second and difficult way is designing it using real-world images. Designing image database with real-world images is a lot more costly and time consuming; however it offers high fidelity, which is critical for missile algorithms. In this paper, critical concepts in ATA / ATR database design with real-world images are discussed. Each concept is discussed in the perspective of ATA and ATR separately. For the implementation stage, some possible solutions and trade-offs for creating the database are proposed, and all proposed approaches are compared to each other with regards to their pros and cons.
NASA Astrophysics Data System (ADS)
Xu, Guoping; Udupa, Jayaram K.; Tong, Yubing; Cao, Hanqiang; Odhner, Dewey; Torigian, Drew A.; Wu, Xingyu
2018-03-01
Currently, there are many papers that have been published on the detection and segmentation of lymph nodes from medical images. However, it is still a challenging problem owing to low contrast with surrounding soft tissues and the variations of lymph node size and shape on computed tomography (CT) images. This is particularly very difficult on low-dose CT of PET/CT acquisitions. In this study, we utilize our previous automatic anatomy recognition (AAR) framework to recognize the thoracic-lymph node stations defined by the International Association for the Study of Lung Cancer (IASLC) lymph node map. The lymph node stations themselves are viewed as anatomic objects and are localized by using a one-shot method in the AAR framework. Two strategies have been taken in this paper for integration into AAR framework. The first is to combine some lymph node stations into composite lymph node stations according to their geometrical nearness. The other is to find the optimal parent (organ or union of organs) as an anchor for each lymph node station based on the recognition error and thereby find an overall optimal hierarchy to arrange anchor organs and lymph node stations. Based on 28 contrast-enhanced thoracic CT image data sets for model building, 12 independent data sets for testing, our results show that thoracic lymph node stations can be localized within 2-3 voxels compared to the ground truth.
Cost/benefit analysis of electronic license plates
DOT National Transportation Integrated Search
2008-06-01
The objective of this report is to determine whether electronic vehicle recognition systems (EVR) or automatic license plate recognition systems (ALPR) would be beneficial to the Arizona Department of Transportation (AZDOT). EVR uses radio frequency ...
A Horizontal Tilt Correction Method for Ship License Numbers Recognition
NASA Astrophysics Data System (ADS)
Liu, Baolong; Zhang, Sanyuan; Hong, Zhenjie; Ye, Xiuzi
2018-02-01
An automatic ship license numbers (SLNs) recognition system plays a significant role in intelligent waterway transportation systems since it can be used to identify ships by recognizing the characters in SLNs. Tilt occurs frequently in many SLNs because the monitors and the ships usually have great vertical or horizontal angles, which decreases the accuracy and robustness of a SLNs recognition system significantly. In this paper, we present a horizontal tilt correction method for SLNs. For an input tilt SLN image, the proposed method accomplishes the correction task through three main steps. First, a MSER-based characters’ center-points computation algorithm is designed to compute the accurate center-points of the characters contained in the input SLN image. Second, a L 1- L 2 distance-based straight line is fitted to the computed center-points using M-estimator algorithm. The tilt angle is estimated at this stage. Finally, based on the computed tilt angle, an affine transformation rotation is conducted to rotate and to correct the input SLN horizontally. At last, the proposed method is tested on 200 tilt SLN images, the proposed method is proved to be effective with a tilt correction rate of 80.5%.
NASA Astrophysics Data System (ADS)
Pérez Rosas, Osvaldo G.; Rivera Martínez, José L.; Maldonado Cano, Luis A.; López Rodríguez, Mario; Amaya Reyes, Laura M.; Cano Martínez, Elizabeth; García Vázquez, Mireya S.; Ramírez Acosta, Alejandro A.
2017-09-01
The automatic identification and classification of musical genres based on the sound similarities to form musical textures, it is a very active investigation area. In this context it has been created recognition systems of musical genres, formed by time-frequency characteristics extraction methods and by classification methods. The selection of this methods are important for a good development in the recognition systems. In this article they are proposed the Mel-Frequency Cepstral Coefficients (MFCC) methods as a characteristic extractor and Support Vector Machines (SVM) as a classifier for our system. The stablished parameters of the MFCC method in the system by our time-frequency analysis, represents the gamma of Mexican culture musical genres in this article. For the precision of a classification system of musical genres it is necessary that the descriptors represent the correct spectrum of each gender; to achieve this we must realize a correct parametrization of the MFCC like the one we present in this article. With the system developed we get satisfactory detection results, where the least identification percentage of musical genres was 66.67% and the one with the most precision was 100%.
Face recognition in the thermal infrared domain
NASA Astrophysics Data System (ADS)
Kowalski, M.; Grudzień, A.; Palka, N.; Szustakowski, M.
2017-10-01
Biometrics refers to unique human characteristics. Each unique characteristic may be used to label and describe individuals and for automatic recognition of a person based on physiological or behavioural properties. One of the most natural and the most popular biometric trait is a face. The most common research methods on face recognition are based on visible light. State-of-the-art face recognition systems operating in the visible light spectrum achieve very high level of recognition accuracy under controlled environmental conditions. Thermal infrared imagery seems to be a promising alternative or complement to visible range imaging due to its relatively high resistance to illumination changes. A thermal infrared image of the human face presents its unique heat-signature and can be used for recognition. The characteristics of thermal images maintain advantages over visible light images, and can be used to improve algorithms of human face recognition in several aspects. Mid-wavelength or far-wavelength infrared also referred to as thermal infrared seems to be promising alternatives. We present the study on 1:1 recognition in thermal infrared domain. The two approaches we are considering are stand-off face verification of non-moving person as well as stop-less face verification on-the-move. The paper presents methodology of our studies and challenges for face recognition systems in the thermal infrared domain.
An adaptive Hidden Markov Model for activity recognition based on a wearable multi-sensor device
USDA-ARS?s Scientific Manuscript database
Human activity recognition is important in the study of personal health, wellness and lifestyle. In order to acquire human activity information from the personal space, many wearable multi-sensor devices have been developed. In this paper, a novel technique for automatic activity recognition based o...
Effects and modeling of phonetic and acoustic confusions in accented speech.
Fung, Pascale; Liu, Yi
2005-11-01
Accented speech recognition is more challenging than standard speech recognition due to the effects of phonetic and acoustic confusions. Phonetic confusion in accented speech occurs when an expected phone is pronounced as a different one, which leads to erroneous recognition. Acoustic confusion occurs when the pronounced phone is found to lie acoustically between two baseform models and can be equally recognized as either one. We propose that it is necessary to analyze and model these confusions separately in order to improve accented speech recognition without degrading standard speech recognition. Since low phonetic confusion units in accented speech do not give rise to automatic speech recognition errors, we focus on analyzing and reducing phonetic and acoustic confusability under high phonetic confusion conditions. We propose using likelihood ratio test to measure phonetic confusion, and asymmetric acoustic distance to measure acoustic confusion. Only accent-specific phonetic units with low acoustic confusion are used in an augmented pronunciation dictionary, while phonetic units with high acoustic confusion are reconstructed using decision tree merging. Experimental results show that our approach is effective and superior to methods modeling phonetic confusion or acoustic confusion alone in accented speech, with a significant 5.7% absolute WER reduction, without degrading standard speech recognition.
Artificial intelligence in sports on the example of weight training.
Novatchkov, Hristo; Baca, Arnold
2013-01-01
The overall goal of the present study was to illustrate the potential of artificial intelligence (AI) techniques in sports on the example of weight training. The research focused in particular on the implementation of pattern recognition methods for the evaluation of performed exercises on training machines. The data acquisition was carried out using way and cable force sensors attached to various weight machines, thereby enabling the measurement of essential displacement and force determinants during training. On the basis of the gathered data, it was consequently possible to deduce other significant characteristics like time periods or movement velocities. These parameters were applied for the development of intelligent methods adapted from conventional machine learning concepts, allowing an automatic assessment of the exercise technique and providing individuals with appropriate feedback. In practice, the implementation of such techniques could be crucial for the investigation of the quality of the execution, the assistance of athletes but also coaches, the training optimization and for prevention purposes. For the current study, the data was based on measurements from 15 rather inexperienced participants, performing 3-5 sets of 10-12 repetitions on a leg press machine. The initially preprocessed data was used for the extraction of significant features, on which supervised modeling methods were applied. Professional trainers were involved in the assessment and classification processes by analyzing the video recorded executions. The so far obtained modeling results showed good performance and prediction outcomes, indicating the feasibility and potency of AI techniques in assessing performances on weight training equipment automatically and providing sportsmen with prompt advice. Key pointsArtificial intelligence is a promising field for sport-related analysis.Implementations integrating pattern recognition techniques enable the automatic evaluation of data measurements.Artificial neural networks applied for the analysis of weight training data show good performance and high classification rates.
Artificial Intelligence in Sports on the Example of Weight Training
Novatchkov, Hristo; Baca, Arnold
2013-01-01
The overall goal of the present study was to illustrate the potential of artificial intelligence (AI) techniques in sports on the example of weight training. The research focused in particular on the implementation of pattern recognition methods for the evaluation of performed exercises on training machines. The data acquisition was carried out using way and cable force sensors attached to various weight machines, thereby enabling the measurement of essential displacement and force determinants during training. On the basis of the gathered data, it was consequently possible to deduce other significant characteristics like time periods or movement velocities. These parameters were applied for the development of intelligent methods adapted from conventional machine learning concepts, allowing an automatic assessment of the exercise technique and providing individuals with appropriate feedback. In practice, the implementation of such techniques could be crucial for the investigation of the quality of the execution, the assistance of athletes but also coaches, the training optimization and for prevention purposes. For the current study, the data was based on measurements from 15 rather inexperienced participants, performing 3-5 sets of 10-12 repetitions on a leg press machine. The initially preprocessed data was used for the extraction of significant features, on which supervised modeling methods were applied. Professional trainers were involved in the assessment and classification processes by analyzing the video recorded executions. The so far obtained modeling results showed good performance and prediction outcomes, indicating the feasibility and potency of AI techniques in assessing performances on weight training equipment automatically and providing sportsmen with prompt advice. Key points Artificial intelligence is a promising field for sport-related analysis. Implementations integrating pattern recognition techniques enable the automatic evaluation of data measurements. Artificial neural networks applied for the analysis of weight training data show good performance and high classification rates. PMID:24149722
Prosody's Contribution to Fluency: An Examination of the Theory of Automatic Information Processing
ERIC Educational Resources Information Center
Schrauben, Julie E.
2010-01-01
LaBerge and Samuels' (1974) theory of automatic information processing in reading offers a model that explains how and where the processing of information occurs and the degree to which processing of information occurs. These processes are dependent upon two criteria: accurate word decoding and automatic word recognition. However, LaBerge and…
Apply lightweight recognition algorithms in optical music recognition
NASA Astrophysics Data System (ADS)
Pham, Viet-Khoi; Nguyen, Hai-Dang; Nguyen-Khac, Tung-Anh; Tran, Minh-Triet
2015-02-01
The problems of digitalization and transformation of musical scores into machine-readable format are necessary to be solved since they help people to enjoy music, to learn music, to conserve music sheets, and even to assist music composers. However, the results of existing methods still require improvements for higher accuracy. Therefore, the authors propose lightweight algorithms for Optical Music Recognition to help people to recognize and automatically play musical scores. In our proposal, after removing staff lines and extracting symbols, each music symbol is represented as a grid of identical M ∗ N cells, and the features are extracted and classified with multiple lightweight SVM classifiers. Through experiments, the authors find that the size of 10 ∗ 12 cells yields the highest precision value. Experimental results on the dataset consisting of 4929 music symbols taken from 18 modern music sheets in the Synthetic Score Database show that our proposed method is able to classify printed musical scores with accuracy up to 99.56%.
Incremental concept learning with few training examples and hierarchical classification
NASA Astrophysics Data System (ADS)
Bouma, Henri; Eendebak, Pieter T.; Schutte, Klamer; Azzopardi, George; Burghouts, Gertjan J.
2015-10-01
Object recognition and localization are important to automatically interpret video and allow better querying on its content. We propose a method for object localization that learns incrementally and addresses four key aspects. Firstly, we show that for certain applications, recognition is feasible with only a few training samples. Secondly, we show that novel objects can be added incrementally without retraining existing objects, which is important for fast interaction. Thirdly, we show that an unbalanced number of positive training samples leads to biased classifier scores that can be corrected by modifying weights. Fourthly, we show that the detector performance can deteriorate due to hard-negative mining for similar or closely related classes (e.g., for Barbie and dress, because the doll is wearing a dress). This can be solved by our hierarchical classification. We introduce a new dataset, which we call TOSO, and use it to demonstrate the effectiveness of the proposed method for the localization and recognition of multiple objects in images.
NASA Astrophysics Data System (ADS)
Silva, Ricardo Petri; Naozuka, Gustavo Taiji; Mastelini, Saulo Martiello; Felinto, Alan Salvany
2018-01-01
The incidence of luminous reflections (LR) in captured images can interfere with the color of the affected regions. These regions tend to oversaturate, becoming whitish and, consequently, losing the original color information of the scene. Decision processes that employ images acquired from digital cameras can be impaired by the LR incidence. Such applications include real-time video surgeries, facial, and ocular recognition. This work proposes an algorithm called contrast enhancement of potential LR regions, which is a preprocessing to increase the contrast of potential LR regions, in order to improve the performance of automatic LR detectors. In addition, three automatic detectors were compared with and without the employment of our preprocessing method. The first one is a technique already consolidated in the literature called the Chang-Tseng threshold. We propose two automatic detectors called adapted histogram peak and global threshold. We employed four performance metrics to evaluate the detectors, namely, accuracy, precision, exactitude, and root mean square error. The exactitude metric is developed by this work. Thus, a manually defined reference model was created. The global threshold detector combined with our preprocessing method presented the best results, with an average exactitude rate of 82.47%.
Semi-automated contour recognition using DICOMautomaton
NASA Astrophysics Data System (ADS)
Clark, H.; Wu, J.; Moiseenko, V.; Lee, R.; Gill, B.; Duzenli, C.; Thomas, S.
2014-03-01
Purpose: A system has been developed which recognizes and classifies Digital Imaging and Communication in Medicine contour data with minimal human intervention. It allows researchers to overcome obstacles which tax analysis and mining systems, including inconsistent naming conventions and differences in data age or resolution. Methods: Lexicographic and geometric analysis is used for recognition. Well-known lexicographic methods implemented include Levenshtein-Damerau, bag-of-characters, Double Metaphone, Soundex, and (word and character)-N-grams. Geometrical implementations include 3D Fourier Descriptors, probability spheres, boolean overlap, simple feature comparison (e.g. eccentricity, volume) and rule-based techniques. Both analyses implement custom, domain-specific modules (e.g. emphasis differentiating left/right organ variants). Contour labels from 60 head and neck patients are used for cross-validation. Results: Mixed-lexicographical methods show an effective improvement in more than 10% of recognition attempts compared with a pure Levenshtein-Damerau approach when withholding 70% of the lexicon. Domain-specific and geometrical techniques further boost performance. Conclusions: DICOMautomaton allows users to recognize contours semi-automatically. As usage increases and the lexicon is filled with additional structures, performance improves, increasing the overall utility of the system.
NASA Astrophysics Data System (ADS)
Lv, Zheng; Sui, Haigang; Zhang, Xilin; Huang, Xianfeng
2007-11-01
As one of the most important geo-spatial objects and military establishment, airport is always a key target in fields of transportation and military affairs. Therefore, automatic recognition and extraction of airport from remote sensing images is very important and urgent for updating of civil aviation and military application. In this paper, a new multi-source data fusion approach on automatic airport information extraction, updating and 3D modeling is addressed. Corresponding key technologies including feature extraction of airport information based on a modified Ostu algorithm, automatic change detection based on new parallel lines-based buffer detection algorithm, 3D modeling based on gradual elimination of non-building points algorithm, 3D change detecting between old airport model and LIDAR data, typical CAD models imported and so on are discussed in detail. At last, based on these technologies, we develop a prototype system and the results show our method can achieve good effects.
Face Averages Enhance User Recognition for Smartphone Security
Robertson, David J.; Kramer, Robin S. S.; Burton, A. Mike
2015-01-01
Our recognition of familiar faces is excellent, and generalises across viewing conditions. However, unfamiliar face recognition is much poorer. For this reason, automatic face recognition systems might benefit from incorporating the advantages of familiarity. Here we put this to the test using the face verification system available on a popular smartphone (the Samsung Galaxy). In two experiments we tested the recognition performance of the smartphone when it was encoded with an individual’s ‘face-average’ – a representation derived from theories of human face perception. This technique significantly improved performance for both unconstrained celebrity images (Experiment 1) and for real faces (Experiment 2): users could unlock their phones more reliably when the device stored an average of the user’s face than when they stored a single image. This advantage was consistent across a wide variety of everyday viewing conditions. Furthermore, the benefit did not reduce the rejection of imposter faces. This benefit is brought about solely by consideration of suitable representations for automatic face recognition, and we argue that this is just as important as development of matching algorithms themselves. We propose that this representation could significantly improve recognition rates in everyday settings. PMID:25807251
Jung, Jaehoon; Yoon, Inhye; Paik, Joonki
2016-01-01
This paper presents an object occlusion detection algorithm using object depth information that is estimated by automatic camera calibration. The object occlusion problem is a major factor to degrade the performance of object tracking and recognition. To detect an object occlusion, the proposed algorithm consists of three steps: (i) automatic camera calibration using both moving objects and a background structure; (ii) object depth estimation; and (iii) detection of occluded regions. The proposed algorithm estimates the depth of the object without extra sensors but with a generic red, green and blue (RGB) camera. As a result, the proposed algorithm can be applied to improve the performance of object tracking and object recognition algorithms for video surveillance systems. PMID:27347978
Automatic identification and normalization of dosage forms in drug monographs
2012-01-01
Background Each day, millions of health consumers seek drug-related information on the Web. Despite some efforts in linking related resources, drug information is largely scattered in a wide variety of websites of different quality and credibility. Methods As a step toward providing users with integrated access to multiple trustworthy drug resources, we aim to develop a method capable of identifying drug's dosage form information in addition to drug name recognition. We developed rules and patterns for identifying dosage forms from different sections of full-text drug monographs, and subsequently normalized them to standardized RxNorm dosage forms. Results Our method represents a significant improvement compared with a baseline lookup approach, achieving overall macro-averaged Precision of 80%, Recall of 98%, and F-Measure of 85%. Conclusions We successfully developed an automatic approach for drug dosage form identification, which is critical for building links between different drug-related resources. PMID:22336431
Remote logo detection using angle-distance histograms
NASA Astrophysics Data System (ADS)
Youn, Sungwook; Ok, Jiheon; Baek, Sangwook; Woo, Seongyoun; Lee, Chulhee
2016-05-01
Among all the various computer vision applications, automatic logo recognition has drawn great interest from industry as well as various academic institutions. In this paper, we propose an angle-distance map, which we used to develop a robust logo detection algorithm. The proposed angle-distance histogram is invariant against scale and rotation. The proposed method first used shape information and color characteristics to find the candidate regions and then applied the angle-distance histogram. Experiments show that the proposed method detected logos of various sizes and orientations.
Space infrared telescope pointing control system. Automated star pattern recognition
NASA Technical Reports Server (NTRS)
Powell, J. D.; Vanbezooijen, R. W. H.
1985-01-01
The Space Infrared Telescope Facility (SIRTF) is a free flying spacecraft carrying a 1 meter class cryogenically cooled infrared telescope nearly three oders of magnitude most sensitive than the current generation of infrared telescopes. Three automatic target acquisition methods will be presented that are based on the use of an imaging star tracker. The methods are distinguished by the number of guidestars that are required per target, the amount of computational capability necessary, and the time required for the complete acquisition process. Each method is described in detail.
Photonic correlator pattern recognition: Application to autonomous docking
NASA Technical Reports Server (NTRS)
Sjolander, Gary W.
1991-01-01
Optical correlators for real-time automatic pattern recognition applications have recently become feasible due to advances in high speed devices and filter formulation concepts. The devices are discussed in the context of their use in autonomous docking.
Measurement Marker Recognition In A Time Sequence Of Infrared Images For Biomedical Applications
NASA Astrophysics Data System (ADS)
Fiorini, A. R.; Fumero, R.; Marchesi, R.
1986-03-01
In thermographic measurements, quantitative surface temperature evaluation is often uncertain. The main reason is in the lack of available reference points in transient conditions. Reflective markers were used for automatic marker recognition and pixel coordinate computations. An algorithm selects marker icons to match marker references where particular luminance conditions are satisfied. Automatic marker recognition allows luminance compensation and temperature calibration of recorded infrared images. A biomedical application is presented: the dynamic behaviour of the surface temperature distributions is investigated in order to study the performance of two different pumping systems for extracorporeal circulation. Sequences of images are compared and results are discussed. Finally, the algorithm allows to monitor the experimental environment and to alert for the presence of unusual experimental conditions.
Towards automatic musical instrument timbre recognition
NASA Astrophysics Data System (ADS)
Park, Tae Hong
This dissertation is comprised of two parts---focus on issues concerning research and development of an artificial system for automatic musical instrument timbre recognition and musical compositions. The technical part of the essay includes a detailed record of developed and implemented algorithms for feature extraction and pattern recognition. A review of existing literature introducing historical aspects surrounding timbre research, problems associated with a number of timbre definitions, and highlights of selected research activities that have had significant impact in this field are also included. The developed timbre recognition system follows a bottom-up, data-driven model that includes a pre-processing module, feature extraction module, and a RBF/EBF (Radial/Elliptical Basis Function) neural network-based pattern recognition module. 829 monophonic samples from 12 instruments have been chosen from the Peter Siedlaczek library (Best Service) and other samples from the Internet and personal collections. Significant emphasis has been put on feature extraction development and testing to achieve robust and consistent feature vectors that are eventually passed to the neural network module. In order to avoid a garbage-in-garbage-out (GIGO) trap and improve generality, extra care was taken in designing and testing the developed algorithms using various dynamics, different playing techniques, and a variety of pitches for each instrument with inclusion of attack and steady-state portions of a signal. Most of the research and development was conducted in Matlab. The compositional part of the essay includes brief introductions to "A d'Ess Are ," "Aboji," "48 13 N, 16 20 O," and "pH-SQ." A general outline pertaining to the ideas and concepts behind the architectural designs of the pieces including formal structures, time structures, orchestration methods, and pitch structures are also presented.
Audio-visual affective expression recognition
NASA Astrophysics Data System (ADS)
Huang, Thomas S.; Zeng, Zhihong
2007-11-01
Automatic affective expression recognition has attracted more and more attention of researchers from different disciplines, which will significantly contribute to a new paradigm for human computer interaction (affect-sensitive interfaces, socially intelligent environments) and advance the research in the affect-related fields including psychology, psychiatry, and education. Multimodal information integration is a process that enables human to assess affective states robustly and flexibly. In order to understand the richness and subtleness of human emotion behavior, the computer should be able to integrate information from multiple sensors. We introduce in this paper our efforts toward machine understanding of audio-visual affective behavior, based on both deliberate and spontaneous displays. Some promising methods are presented to integrate information from both audio and visual modalities. Our experiments show the advantage of audio-visual fusion in affective expression recognition over audio-only or visual-only approaches.
Combination of minimum enclosing balls classifier with SVM in coal-rock recognition.
Song, QingJun; Jiang, HaiYan; Song, Qinghui; Zhao, XieGuang; Wu, Xiaoxuan
2017-01-01
Top-coal caving technology is a productive and efficient method in modern mechanized coal mining, the study of coal-rock recognition is key to realizing automation in comprehensive mechanized coal mining. In this paper we propose a new discriminant analysis framework for coal-rock recognition. In the framework, a data acquisition model with vibration and acoustic signals is designed and the caving dataset with 10 feature variables and three classes is got. And the perfect combination of feature variables can be automatically decided by using the multi-class F-score (MF-Score) feature selection. In terms of nonlinear mapping in real-world optimization problem, an effective minimum enclosing ball (MEB) algorithm plus Support vector machine (SVM) is proposed for rapid detection of coal-rock in the caving process. In particular, we illustrate how to construct MEB-SVM classifier in coal-rock recognition which exhibit inherently complex distribution data. The proposed method is examined on UCI data sets and the caving dataset, and compared with some new excellent SVM classifiers. We conduct experiments with accuracy and Friedman test for comparison of more classifiers over multiple on the UCI data sets. Experimental results demonstrate that the proposed algorithm has good robustness and generalization ability. The results of experiments on the caving dataset show the better performance which leads to a promising feature selection and multi-class recognition in coal-rock recognition.
Combination of minimum enclosing balls classifier with SVM in coal-rock recognition
Song, QingJun; Jiang, HaiYan; Song, Qinghui; Zhao, XieGuang; Wu, Xiaoxuan
2017-01-01
Top-coal caving technology is a productive and efficient method in modern mechanized coal mining, the study of coal-rock recognition is key to realizing automation in comprehensive mechanized coal mining. In this paper we propose a new discriminant analysis framework for coal-rock recognition. In the framework, a data acquisition model with vibration and acoustic signals is designed and the caving dataset with 10 feature variables and three classes is got. And the perfect combination of feature variables can be automatically decided by using the multi-class F-score (MF-Score) feature selection. In terms of nonlinear mapping in real-world optimization problem, an effective minimum enclosing ball (MEB) algorithm plus Support vector machine (SVM) is proposed for rapid detection of coal-rock in the caving process. In particular, we illustrate how to construct MEB-SVM classifier in coal-rock recognition which exhibit inherently complex distribution data. The proposed method is examined on UCI data sets and the caving dataset, and compared with some new excellent SVM classifiers. We conduct experiments with accuracy and Friedman test for comparison of more classifiers over multiple on the UCI data sets. Experimental results demonstrate that the proposed algorithm has good robustness and generalization ability. The results of experiments on the caving dataset show the better performance which leads to a promising feature selection and multi-class recognition in coal-rock recognition. PMID:28937987
An automatic iris occlusion estimation method based on high-dimensional density estimation.
Li, Yung-Hui; Savvides, Marios
2013-04-01
Iris masks play an important role in iris recognition. They indicate which part of the iris texture map is useful and which part is occluded or contaminated by noisy image artifacts such as eyelashes, eyelids, eyeglasses frames, and specular reflections. The accuracy of the iris mask is extremely important. The performance of the iris recognition system will decrease dramatically when the iris mask is inaccurate, even when the best recognition algorithm is used. Traditionally, people used the rule-based algorithms to estimate iris masks from iris images. However, the accuracy of the iris masks generated this way is questionable. In this work, we propose to use Figueiredo and Jain's Gaussian Mixture Models (FJ-GMMs) to model the underlying probabilistic distributions of both valid and invalid regions on iris images. We also explored possible features and found that Gabor Filter Bank (GFB) provides the most discriminative information for our goal. Finally, we applied Simulated Annealing (SA) technique to optimize the parameters of GFB in order to achieve the best recognition rate. Experimental results show that the masks generated by the proposed algorithm increase the iris recognition rate on both ICE2 and UBIRIS dataset, verifying the effectiveness and importance of our proposed method for iris occlusion estimation.
Automatic speech recognition using a predictive echo state network classifier.
Skowronski, Mark D; Harris, John G
2007-04-01
We have combined an echo state network (ESN) with a competitive state machine framework to create a classification engine called the predictive ESN classifier. We derive the expressions for training the predictive ESN classifier and show that the model was significantly more noise robust compared to a hidden Markov model in noisy speech classification experiments by 8+/-1 dB signal-to-noise ratio. The simple training algorithm and noise robustness of the predictive ESN classifier make it an attractive classification engine for automatic speech recognition.
Flexible methods for segmentation evaluation: results from CT-based luggage screening.
Karimi, Seemeen; Jiang, Xiaoqian; Cosman, Pamela; Martz, Harry
2014-01-01
Imaging systems used in aviation security include segmentation algorithms in an automatic threat recognition pipeline. The segmentation algorithms evolve in response to emerging threats and changing performance requirements. Analysis of segmentation algorithms' behavior, including the nature of errors and feature recovery, facilitates their development. However, evaluation methods from the literature provide limited characterization of the segmentation algorithms. To develop segmentation evaluation methods that measure systematic errors such as oversegmentation and undersegmentation, outliers, and overall errors. The methods must measure feature recovery and allow us to prioritize segments. We developed two complementary evaluation methods using statistical techniques and information theory. We also created a semi-automatic method to define ground truth from 3D images. We applied our methods to evaluate five segmentation algorithms developed for CT luggage screening. We validated our methods with synthetic problems and an observer evaluation. Both methods selected the same best segmentation algorithm. Human evaluation confirmed the findings. The measurement of systematic errors and prioritization helped in understanding the behavior of each segmentation algorithm. Our evaluation methods allow us to measure and explain the accuracy of segmentation algorithms.
Sun, Weifang; Yao, Bin; Zeng, Nianyin; Chen, Binqiang; He, Yuchao; Cao, Xincheng; He, Wangpeng
2017-07-12
As a typical example of large and complex mechanical systems, rotating machinery is prone to diversified sorts of mechanical faults. Among these faults, one of the prominent causes of malfunction is generated in gear transmission chains. Although they can be collected via vibration signals, the fault signatures are always submerged in overwhelming interfering contents. Therefore, identifying the critical fault's characteristic signal is far from an easy task. In order to improve the recognition accuracy of a fault's characteristic signal, a novel intelligent fault diagnosis method is presented. In this method, a dual-tree complex wavelet transform (DTCWT) is employed to acquire the multiscale signal's features. In addition, a convolutional neural network (CNN) approach is utilized to automatically recognise a fault feature from the multiscale signal features. The experiment results of the recognition for gear faults show the feasibility and effectiveness of the proposed method, especially in the gear's weak fault features.
[Terahertz Spectroscopic Identification with Deep Belief Network].
Ma, Shuai; Shen, Tao; Wang, Rui-qi; Lai, Hua; Yu, Zheng-tao
2015-12-01
Feature extraction and classification are the key issues of terahertz spectroscopy identification. Because many materials have no apparent absorption peaks in the terahertz band, it is difficult to extract theirs terahertz spectroscopy feature and identify. To this end, a novel of identify terahertz spectroscopy approach with Deep Belief Network (DBN) was studied in this paper, which combines the advantages of DBN and K-Nearest Neighbors (KNN) classifier. Firstly, cubic spline interpolation and S-G filter were used to normalize the eight kinds of substances (ATP, Acetylcholine Bromide, Bifenthrin, Buprofezin, Carbazole, Bleomycin, Buckminster and Cylotriphosphazene) terahertz transmission spectra in the range of 0.9-6 THz. Secondly, the DBN model was built by two restricted Boltzmann machine (RBM) and then trained layer by layer using unsupervised approach. Instead of using handmade features, the DBN was employed to learn suitable features automatically with raw input data. Finally, a KNN classifier was applied to identify the terahertz spectrum. Experimental results show that using the feature learned by DBN can identify the terahertz spectrum of different substances with the recognition rate of over 90%, which demonstrates that the proposed method can automatically extract the effective features of terahertz spectrum. Furthermore, this KNN classifier was compared with others (BP neural network, SOM neural network and RBF neural network). Comparisons showed that the recognition rate of KNN classifier is better than the other three classifiers. Using the approach that automatic extract terahertz spectrum features by DBN can greatly reduce the workload of feature extraction. This proposed method shows a promising future in the application of identifying the mass terahertz spectroscopy.
NASA Astrophysics Data System (ADS)
Perner, Petra
2017-03-01
Molecular image-based techniques are widely used in medicine to detect specific diseases. Look diagnosis is an important issue but also the analysis of the eye plays an important role in order to detect specific diseases. These topics are important topics in medicine and the standardization of these topics by an automatic system can be a new challenging field for machine vision. Compared to iris recognition has the iris diagnosis much more higher demands for the image acquisition and interpretation of the iris. One understands by iris diagnosis (Iridology) the investigation and analysis of the colored part of the eye, the iris, to discover factors, which play an important role for the prevention and treatment of illnesses, but also for the preservation of an optimum health. An automatic system would pave the way for a much wider use of the iris diagnosis for the diagnosis of illnesses and for the purpose of individual health protection. With this paper, we describe our work towards an automatic iris diagnosis system. We describe the image acquisition and the problems with it. Different ways are explained for image acquisition and image preprocessing. We describe the image analysis method for the detection of the iris. The meta-model for image interpretation is given. Based on this model we show the many tasks for image analysis that range from different image-object feature analysis, spatial image analysis to color image analysis. Our first results for the recognition of the iris are given. We describe how detecting the pupil and not wanted lamp spots. We explain how to recognize orange blue spots in the iris and match them against the topological map of the iris. Finally, we give an outlook for further work.
Zhang, Chen; Sun, Chao; Gao, Liqiang; Zheng, Nenggan; Chen, Weidong; Zheng, Xiaoxiang
2013-01-01
Bio-robots based on brain computer interface (BCI) suffer from the lack of considering the characteristic of the animals in navigation. This paper proposed a new method for bio-robots' automatic navigation combining the reward generating algorithm base on Reinforcement Learning (RL) with the learning intelligence of animals together. Given the graded electrical reward, the animal e.g. the rat, intends to seek the maximum reward while exploring an unknown environment. Since the rat has excellent spatial recognition, the rat-robot and the RL algorithm can convergent to an optimal route by co-learning. This work has significant inspiration for the practical development of bio-robots' navigation with hybrid intelligence.
Automated Meteor Detection by All-Sky Digital Camera Systems
NASA Astrophysics Data System (ADS)
Suk, Tomáš; Šimberová, Stanislava
2017-12-01
We have developed a set of methods to detect meteor light traces captured by all-sky CCD cameras. Operating at small automatic observatories (stations), these cameras create a network spread over a large territory. Image data coming from these stations are merged in one central node. Since a vast amount of data is collected by the stations in a single night, robotic storage and analysis are essential to processing. The proposed methodology is adapted to data from a network of automatic stations equipped with digital fish-eye cameras and includes data capturing, preparation, pre-processing, analysis, and finally recognition of objects in time sequences. In our experiments we utilized real observed data from two stations.
Thai Automatic Speech Recognition
2005-01-01
used in an external DARPA evaluation involving medical scenarios between an American Doctor and a naïve monolingual Thai patient. 2. Thai Language... dictionary generation more challenging, and (3) the lack of word segmentation, which calls for automatic segmentation approaches to make n-gram language...requires a dictionary and provides various segmentation algorithms to automatically select suitable segmentations. Here we used a maximal matching
A multilingual gold-standard corpus for biomedical concept recognition: the Mantra GSC.
Kors, Jan A; Clematide, Simon; Akhondi, Saber A; van Mulligen, Erik M; Rebholz-Schuhmann, Dietrich
2015-09-01
To create a multilingual gold-standard corpus for biomedical concept recognition. We selected text units from different parallel corpora (Medline abstract titles, drug labels, biomedical patent claims) in English, French, German, Spanish, and Dutch. Three annotators per language independently annotated the biomedical concepts, based on a subset of the Unified Medical Language System and covering a wide range of semantic groups. To reduce the annotation workload, automatically generated preannotations were provided. Individual annotations were automatically harmonized and then adjudicated, and cross-language consistency checks were carried out to arrive at the final annotations. The number of final annotations was 5530. Inter-annotator agreement scores indicate good agreement (median F-score 0.79), and are similar to those between individual annotators and the gold standard. The automatically generated harmonized annotation set for each language performed equally well as the best annotator for that language. The use of automatic preannotations, harmonized annotations, and parallel corpora helped to keep the manual annotation efforts manageable. The inter-annotator agreement scores provide a reference standard for gauging the performance of automatic annotation techniques. To our knowledge, this is the first gold-standard corpus for biomedical concept recognition in languages other than English. Other distinguishing features are the wide variety of semantic groups that are being covered, and the diversity of text genres that were annotated. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
[Research on automatic external defibrillator based on DSP].
Jing, Jun; Ding, Jingyan; Zhang, Wei; Hong, Wenxue
2012-10-01
Electrical defibrillation is the most effective way to treat the ventricular tachycardia (VT) and ventricular fibrillation (VF). An automatic external defibrillator based on DSP is introduced in this paper. The whole design consists of the signal collection module, the microprocessor controlingl module, the display module, the defibrillation module and the automatic recognition algorithm for VF and non VF, etc. This automatic external defibrillator has achieved goals such as ECG signal real-time acquisition, ECG wave synchronous display, data delivering to U disk and automatic defibrillate when shockable rhythm appears, etc.
A robust automatic phase correction method for signal dense spectra
NASA Astrophysics Data System (ADS)
Bao, Qingjia; Feng, Jiwen; Chen, Li; Chen, Fang; Liu, Zao; Jiang, Bin; Liu, Chaoyang
2013-09-01
A robust automatic phase correction method for Nuclear Magnetic Resonance (NMR) spectra is presented. In this work, a new strategy combining ‘coarse tuning' with ‘fine tuning' is introduced to correct various spectra accurately. In the ‘coarse tuning' procedure, a new robust baseline recognition method is proposed for determining the positions of the tail ends of the peaks, and then the preliminary phased spectra are obtained by minimizing the objective function based on the height difference of these tail ends. After the ‘coarse tuning', the peaks in the preliminary corrected spectra can be categorized into three classes: positive, negative, and distorted. Based on the classification result, a new custom negative penalty function used in the step of ‘fine tuning' is constructed to avoid the negative peak points in the spectra excluded in the negative peaks and distorted peaks. Finally, the fine phased spectra can be obtained by minimizing the custom negative penalty function. This method is proven to be very robust for it is tolerant to low signal-to-noise ratio, large baseline distortion and independent of the starting search points of phasing parameters. The experimental results on both 1D metabonomics spectra with over-crowded peaks and 2D spectra demonstrate the high efficiency of this automatic method.
Automatic detection of Martian dark slope streaks by machine learning using HiRISE images
NASA Astrophysics Data System (ADS)
Wang, Yexin; Di, Kaichang; Xin, Xin; Wan, Wenhui
2017-07-01
Dark slope streaks (DSSs) on the Martian surface are one of the active geologic features that can be observed on Mars nowadays. The detection of DSS is a prerequisite for studying its appearance, morphology, and distribution to reveal its underlying geological mechanisms. In addition, increasingly massive amounts of Mars high resolution data are now available. Hence, an automatic detection method for locating DSSs is highly desirable. In this research, we present an automatic DSS detection method by combining interest region extraction and machine learning techniques. The interest region extraction combines gradient and regional grayscale information. Moreover, a novel recognition strategy is proposed that takes the normalized minimum bounding rectangles (MBRs) of the extracted regions to calculate the Local Binary Pattern (LBP) feature and train a DSS classifier using the Adaboost machine learning algorithm. Comparative experiments using five different feature descriptors and three different machine learning algorithms show the superiority of the proposed method. Experimental results utilizing 888 extracted region samples from 28 HiRISE images show that the overall detection accuracy of our proposed method is 92.4%, with a true positive rate of 79.1% and false positive rate of 3.7%, which in particular indicates great performance of the method at eliminating non-DSS regions.
[The endpoint detection of cough signal in continuous speech].
Yang, Guoqing; Mo, Hongqiang; Li, Wen; Lian, Lianfang; Zheng, Zeguang
2010-06-01
The endpoint detection of cough signal in continuous speech has been researched in order to improve the efficiency and veracity of manual recognition or computer-based automatic recognition. First, using the short time zero crossing ratio(ZCR) for identifying the suspicious coughs and getting the threshold of short time energy based on acoustic characteristics of cough. Then, the short time energy is combined with short time ZCR in order to implement the endpoint detection of cough in continuous speech. To evaluate the effect of the method, first, the virtual number of coughs in each recording was identified by two experienced doctors using the graphical user interface (GUI). Second, the recordings were analyzed by automatic endpoint detection program under Matlab7.0. Finally, the comparison between these two results showed: The error rate of undetected cough is 2.18%, and 98.13% of noise, silence and speech were removed. The way of setting short time energy threshold is robust. The endpoint detection program can remove most speech and noise, thus maintaining a lower rate of error.
Automatic Speech Acquisition and Recognition for Spacesuit Audio Systems
NASA Technical Reports Server (NTRS)
Ye, Sherry
2015-01-01
NASA has a widely recognized but unmet need for novel human-machine interface technologies that can facilitate communication during astronaut extravehicular activities (EVAs), when loud noises and strong reverberations inside spacesuits make communication challenging. WeVoice, Inc., has developed a multichannel signal-processing method for speech acquisition in noisy and reverberant environments that enables automatic speech recognition (ASR) technology inside spacesuits. The technology reduces noise by exploiting differences between the statistical nature of signals (i.e., speech) and noise that exists in the spatial and temporal domains. As a result, ASR accuracy can be improved to the level at which crewmembers will find the speech interface useful. System components and features include beam forming/multichannel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, and ASR decoding. Arithmetic complexity models were developed and will help designers of real-time ASR systems select proper tasks when confronted with constraints in computational resources. In Phase I of the project, WeVoice validated the technology. The company further refined the technology in Phase II and developed a prototype for testing and use by suited astronauts.
Automatic target recognition apparatus and method
Baumgart, Chris W.; Ciarcia, Christopher A.
2000-01-01
An automatic target recognition apparatus (10) is provided, having a video camera/digitizer (12) for producing a digitized image signal (20) representing an image containing therein objects which objects are to be recognized if they meet predefined criteria. The digitized image signal (20) is processed within a video analysis subroutine (22) residing in a computer (14) in a plurality of parallel analysis chains such that the objects are presumed to be lighter in shading than the background in the image in three of the chains and further such that the objects are presumed to be darker than the background in the other three chains. In two of the chains the objects are defined by surface texture analysis using texture filter operations. In another two of the chains the objects are defined by background subtraction operations. In yet another two of the chains the objects are defined by edge enhancement processes. In each of the analysis chains a calculation operation independently determines an error factor relating to the probability that the objects are of the type which should be recognized, and a probability calculation operation combines the results of the analysis chains.
A novel probabilistic framework for event-based speech recognition
NASA Astrophysics Data System (ADS)
Juneja, Amit; Espy-Wilson, Carol
2003-10-01
One of the reasons for unsatisfactory performance of the state-of-the-art automatic speech recognition (ASR) systems is the inferior acoustic modeling of low-level acoustic-phonetic information in the speech signal. An acoustic-phonetic approach to ASR, on the other hand, explicitly targets linguistic information in the speech signal, but such a system for continuous speech recognition (CSR) is not known to exist. A probabilistic and statistical framework for CSR based on the idea of the representation of speech sounds by bundles of binary valued articulatory phonetic features is proposed. Multiple probabilistic sequences of linguistically motivated landmarks are obtained using binary classifiers of manner phonetic features-syllabic, sonorant and continuant-and the knowledge-based acoustic parameters (APs) that are acoustic correlates of those features. The landmarks are then used for the extraction of knowledge-based APs for source and place phonetic features and their binary classification. Probabilistic landmark sequences are constrained using manner class language models for isolated or connected word recognition. The proposed method could overcome the disadvantages encountered by the early acoustic-phonetic knowledge-based systems that led the ASR community to switch to systems highly dependent on statistical pattern analysis methods and probabilistic language or grammar models.
Learning target masks in infrared linescan imagery
NASA Astrophysics Data System (ADS)
Fechner, Thomas; Rockinger, Oliver; Vogler, Axel; Knappe, Peter
1997-04-01
In this paper we propose a neural network based method for the automatic detection of ground targets in airborne infrared linescan imagery. Instead of using a dedicated feature extraction stage followed by a classification procedure, we propose the following three step scheme: In the first step of the recognition process, the input image is decomposed into its pyramid representation, thus obtaining a multiresolution signal representation. At the lowest three levels of the Laplacian pyramid a neural network filter of moderate size is trained to indicate the target location. The last step consists of a fusion process of the several neural network filters to obtain the final result. To perform this fusion we use a belief network to combine the various filter outputs in a statistical meaningful way. In addition, the belief network allows the integration of further knowledge about the image domain. By applying this multiresolution recognition scheme, we obtain a nearly scale- and rotational invariant target recognition with a significantly decreased false alarm rate compared with a single resolution target recognition scheme.
A comparison of 1D and 2D LSTM architectures for the recognition of handwritten Arabic
NASA Astrophysics Data System (ADS)
Yousefi, Mohammad Reza; Soheili, Mohammad Reza; Breuel, Thomas M.; Stricker, Didier
2015-01-01
In this paper, we present an Arabic handwriting recognition method based on recurrent neural network. We use the Long Short Term Memory (LSTM) architecture, that have proven successful in different printed and handwritten OCR tasks. Applications of LSTM for handwriting recognition employ the two-dimensional architecture to deal with the variations in both vertical and horizontal axis. However, we show that using a simple pre-processing step that normalizes the position and baseline of letters, we can make use of 1D LSTM, which is faster in learning and convergence, and yet achieve superior performance. In a series of experiments on IFN/ENIT database for Arabic handwriting recognition, we demonstrate that our proposed pipeline can outperform 2D LSTM networks. Furthermore, we provide comparisons with 1D LSTM networks trained with manually crafted features to show that the automatically learned features in a globally trained 1D LSTM network with our normalization step can even outperform such systems.
Evaluation of Model Recognition for Grammar-Based Automatic 3d Building Model Reconstruction
NASA Astrophysics Data System (ADS)
Yu, Qian; Helmholz, Petra; Belton, David
2016-06-01
In recent years, 3D city models are in high demand by many public and private organisations, and the steadily growing capacity in both quality and quantity are increasing demand. The quality evaluation of these 3D models is a relevant issue both from the scientific and practical points of view. In this paper, we present a method for the quality evaluation of 3D building models which are reconstructed automatically from terrestrial laser scanning (TLS) data based on an attributed building grammar. The entire evaluation process has been performed in all the three dimensions in terms of completeness and correctness of the reconstruction. Six quality measures are introduced to apply on four datasets of reconstructed building models in order to describe the quality of the automatic reconstruction, and also are assessed on their validity from the evaluation point of view.
A System for Mailpiece ZIP Code Assignment through Contextual Analysis. Phase 2
1991-03-01
Segmentation Address Block Interpretation Automatic Feature Generation Word Recognition Feature Detection Word Verification Optical Character Recognition Directory...in the Phase III effort. 1.1 Motivation The United States Postal Service (USPS) deploys large numbers of optical character recognition (OCR) machines...4):208-218, November 1986. [2] Gronmeyer, L. K., Ruffin, B. W., Lybanon, M. A., Neely, P. L., and Pierce, S. E. An Overview of Optical Character Recognition (OCR
Visual object recognition for automatic micropropagation of plants
NASA Astrophysics Data System (ADS)
Brendel, Thorsten; Schwanke, Joerg; Jensch, Peter F.
1994-11-01
Micropropagation of plants is done by cutting juvenile plants and placing them into special container-boxes with nutrient-solution where the pieces can grow up and be cut again several times. To produce high amounts of biomass it is necessary to do plant micropropagation by a robotic system. In this paper we describe parts of the vision system that recognizes plants and their particular cutting points. Therefore, it is necessary to extract elements of the plants and relations between these elements (for example root, stem, leaf). Different species vary in their morphological appearance, variation is also immanent in plants of the same species. Therefore, we introduce several morphological classes of plants from that we expect same recognition methods.
Automatic recognition of lactating sow behaviors through depth image processing
USDA-ARS?s Scientific Manuscript database
Manual observation and classification of animal behaviors is laborious, time-consuming, and of limited ability to process large amount of data. A computer vision-based system was developed that automatically recognizes sow behaviors (lying, sitting, standing, kneeling, feeding, drinking, and shiftin...
Three-dimensional model-based object recognition and segmentation in cluttered scenes.
Mian, Ajmal S; Bennamoun, Mohammed; Owens, Robyn
2006-10-01
Viewpoint independent recognition of free-form objects and their segmentation in the presence of clutter and occlusions is a challenging task. We present a novel 3D model-based algorithm which performs this task automatically and efficiently. A 3D model of an object is automatically constructed offline from its multiple unordered range images (views). These views are converted into multidimensional table representations (which we refer to as tensors). Correspondences are automatically established between these views by simultaneously matching the tensors of a view with those of the remaining views using a hash table-based voting scheme. This results in a graph of relative transformations used to register the views before they are integrated into a seamless 3D model. These models and their tensor representations constitute the model library. During online recognition, a tensor from the scene is simultaneously matched with those in the library by casting votes. Similarity measures are calculated for the model tensors which receive the most votes. The model with the highest similarity is transformed to the scene and, if it aligns accurately with an object in the scene, that object is declared as recognized and is segmented. This process is repeated until the scene is completely segmented. Experiments were performed on real and synthetic data comprised of 55 models and 610 scenes and an overall recognition rate of 95 percent was achieved. Comparison with the spin images revealed that our algorithm is superior in terms of recognition rate and efficiency.
NASA Astrophysics Data System (ADS)
Kroll, Christine; von der Werth, Monika; Leuck, Holger; Stahl, Christoph; Schertler, Klaus
2017-05-01
For Intelligence, Surveillance, Reconnaissance (ISR) missions of manned and unmanned air systems typical electrooptical payloads provide high-definition video data which has to be exploited with respect to relevant ground targets in real-time by automatic/assisted target recognition software. Airbus Defence and Space is developing required technologies for real-time sensor exploitation since years and has combined the latest advances of Deep Convolutional Neural Networks (CNN) with a proprietary high-speed Support Vector Machine (SVM) learning method into a powerful object recognition system with impressive results on relevant high-definition video scenes compared to conventional target recognition approaches. This paper describes the principal requirements for real-time target recognition in high-definition video for ISR missions and the Airbus approach of combining an invariant feature extraction using pre-trained CNNs and the high-speed training and classification ability of a novel frequency-domain SVM training method. The frequency-domain approach allows for a highly optimized implementation for General Purpose Computation on a Graphics Processing Unit (GPGPU) and also an efficient training of large training samples. The selected CNN which is pre-trained only once on domain-extrinsic data reveals a highly invariant feature extraction. This allows for a significantly reduced adaptation and training of the target recognition method for new target classes and mission scenarios. A comprehensive training and test dataset was defined and prepared using relevant high-definition airborne video sequences. The assessment concept is explained and performance results are given using the established precision-recall diagrams, average precision and runtime figures on representative test data. A comparison to legacy target recognition approaches shows the impressive performance increase by the proposed CNN+SVM machine-learning approach and the capability of real-time high-definition video exploitation.
iFER: facial expression recognition using automatically selected geometric eye and eyebrow features
NASA Astrophysics Data System (ADS)
Oztel, Ismail; Yolcu, Gozde; Oz, Cemil; Kazan, Serap; Bunyak, Filiz
2018-03-01
Facial expressions have an important role in interpersonal communications and estimation of emotional states or intentions. Automatic recognition of facial expressions has led to many practical applications and became one of the important topics in computer vision. We present a facial expression recognition system that relies on geometry-based features extracted from eye and eyebrow regions of the face. The proposed system detects keypoints on frontal face images and forms a feature set using geometric relationships among groups of detected keypoints. Obtained feature set is refined and reduced using the sequential forward selection (SFS) algorithm and fed to a support vector machine classifier to recognize five facial expression classes. The proposed system, iFER (eye-eyebrow only facial expression recognition), is robust to lower face occlusions that may be caused by beards, mustaches, scarves, etc. and lower face motion during speech production. Preliminary experiments on benchmark datasets produced promising results outperforming previous facial expression recognition studies using partial face features, and comparable results to studies using whole face information, only slightly lower by ˜ 2.5 % compared to the best whole face facial recognition system while using only ˜ 1 / 3 of the facial region.
Neural-network classifiers for automatic real-world aerial image recognition
NASA Astrophysics Data System (ADS)
Greenberg, Shlomo; Guterman, Hugo
1996-08-01
We describe the application of the multilayer perceptron (MLP) network and a version of the adaptive resonance theory version 2-A (ART 2-A) network to the problem of automatic aerial image recognition (AAIR). The classification of aerial images, independent of their positions and orientations, is required for automatic tracking and target recognition. Invariance is achieved by the use of different invariant feature spaces in combination with supervised and unsupervised neural networks. The performance of neural-network-based classifiers in conjunction with several types of invariant AAIR global features, such as the Fourier-transform space, Zernike moments, central moments, and polar transforms, are examined. The advantages of this approach are discussed. The performance of the MLP network is compared with that of a classical correlator. The MLP neural-network correlator outperformed the binary phase-only filter (BPOF) correlator. It was found that the ART 2-A distinguished itself with its speed and its low number of required training vectors. However, only the MLP classifier was able to deal with a combination of shift and rotation geometric distortions.
Neural-network classifiers for automatic real-world aerial image recognition.
Greenberg, S; Guterman, H
1996-08-10
We describe the application of the multilayer perceptron (MLP) network and a version of the adaptive resonance theory version 2-A (ART 2-A) network to the problem of automatic aerial image recognition (AAIR). The classification of aerial images, independent of their positions and orientations, is required for automatic tracking and target recognition. Invariance is achieved by the use of different invariant feature spaces in combination with supervised and unsupervised neural networks. The performance of neural-network-based classifiers in conjunction with several types of invariant AAIR global features, such as the Fourier-transform space, Zernike moments, central moments, and polar transforms, are examined. The advantages of this approach are discussed. The performance of the MLP network is compared with that of a classical correlator. The MLP neural-network correlator outperformed the binary phase-only filter (BPOF) correlator. It was found that the ART 2-A distinguished itself with its speed and its low number of required training vectors. However, only the MLP classifier was able to deal with a combination of shift and rotation geometric distortions.
Fashioning the Face: Sensorimotor Simulation Contributes to Facial Expression Recognition.
Wood, Adrienne; Rychlowska, Magdalena; Korb, Sebastian; Niedenthal, Paula
2016-03-01
When we observe a facial expression of emotion, we often mimic it. This automatic mimicry reflects underlying sensorimotor simulation that supports accurate emotion recognition. Why this is so is becoming more obvious: emotions are patterns of expressive, behavioral, physiological, and subjective feeling responses. Activation of one component can therefore automatically activate other components. When people simulate a perceived facial expression, they partially activate the corresponding emotional state in themselves, which provides a basis for inferring the underlying emotion of the expresser. We integrate recent evidence in favor of a role for sensorimotor simulation in emotion recognition. We then connect this account to a domain-general understanding of how sensory information from multiple modalities is integrated to generate perceptual predictions in the brain. Copyright © 2016 Elsevier Ltd. All rights reserved.
Object-oriented recognition of high-resolution remote sensing image
NASA Astrophysics Data System (ADS)
Wang, Yongyan; Li, Haitao; Chen, Hong; Xu, Yuannan
2016-01-01
With the development of remote sensing imaging technology and the improvement of multi-source image's resolution in satellite visible light, multi-spectral and hyper spectral , the high resolution remote sensing image has been widely used in various fields, for example military field, surveying and mapping, geophysical prospecting, environment and so forth. In remote sensing image, the segmentation of ground targets, feature extraction and the technology of automatic recognition are the hotspot and difficulty in the research of modern information technology. This paper also presents an object-oriented remote sensing image scene classification method. The method is consist of vehicles typical objects classification generation, nonparametric density estimation theory, mean shift segmentation theory, multi-scale corner detection algorithm, local shape matching algorithm based on template. Remote sensing vehicles image classification software system is designed and implemented to meet the requirements .
Effect of speech-intrinsic variations on human and automatic recognition of spoken phonemes.
Meyer, Bernd T; Brand, Thomas; Kollmeier, Birger
2011-01-01
The aim of this study is to quantify the gap between the recognition performance of human listeners and an automatic speech recognition (ASR) system with special focus on intrinsic variations of speech, such as speaking rate and effort, altered pitch, and the presence of dialect and accent. Second, it is investigated if the most common ASR features contain all information required to recognize speech in noisy environments by using resynthesized ASR features in listening experiments. For the phoneme recognition task, the ASR system achieved the human performance level only when the signal-to-noise ratio (SNR) was increased by 15 dB, which is an estimate for the human-machine gap in terms of the SNR. The major part of this gap is attributed to the feature extraction stage, since human listeners achieve comparable recognition scores when the SNR difference between unaltered and resynthesized utterances is 10 dB. Intrinsic variabilities result in strong increases of error rates, both in human speech recognition (HSR) and ASR (with a relative increase of up to 120%). An analysis of phoneme duration and recognition rates indicates that human listeners are better able to identify temporal cues than the machine at low SNRs, which suggests incorporating information about the temporal dynamics of speech into ASR systems.
Intelligent Automatic Right-Left Sign Lamp Based on Brain Signal Recognition System
NASA Astrophysics Data System (ADS)
Winda, A.; Sofyan; Sthevany; Vincent, R. S.
2017-12-01
Comfort as a part of the human factor, plays important roles in nowadays advanced automotive technology. Many of the current technologies go in the direction of automotive driver assistance features. However, many of the driver assistance features still require physical movement by human to enable the features. In this work, the proposed method is used in order to make certain feature to be functioning without any physical movement, instead human just need to think about it in their mind. In this work, brain signal is recorded and processed in order to be used as input to the recognition system. Right-Left sign lamp based on the brain signal recognition system can potentially replace the button or switch of the specific device in order to make the lamp work. The system then will decide whether the signal is ‘Right’ or ‘Left’. The decision of the Right-Left side of brain signal recognition will be sent to a processing board in order to activate the automotive relay, which will be used to activate the sign lamp. Furthermore, the intelligent system approach is used to develop authorized model based on the brain signal. Particularly Support Vector Machines (SVMs)-based classification system is used in the proposed system to recognize the Left-Right of the brain signal. Experimental results confirm the effectiveness of the proposed intelligent Automatic brain signal-based Right-Left sign lamp access control system. The signal is processed by Linear Prediction Coefficient (LPC) and Support Vector Machines (SVMs), and the resulting experiment shows the training and testing accuracy of 100% and 80%, respectively.
Hybrid neuro-fuzzy approach for automatic vehicle license plate recognition
NASA Astrophysics Data System (ADS)
Lee, Hsi-Chieh; Jong, Chung-Shi
1998-03-01
Most currently available vehicle identification systems use techniques such as R.F., microwave, or infrared to help identifying the vehicle. Transponders are usually installed in the vehicle in order to transmit the corresponding information to the sensory system. It is considered expensive to install a transponder in each vehicle and the malfunction of the transponder will result in the failure of the vehicle identification system. In this study, novel hybrid approach is proposed for automatic vehicle license plate recognition. A system prototype is built which can be used independently or cooperating with current vehicle identification system in identifying a vehicle. The prototype consists of four major modules including the module for license plate region identification, the module for character extraction from the license plate, the module for character recognition, and the module for the SimNet neuro-fuzzy system. To test the performance of the proposed system, three hundred and eighty vehicle image samples are taken by a digital camera. The license plate recognition success rate of the prototype is approximately 91% while the character recognition success rate of the prototype is approximately 97%.
Sparse and redundant representations for inverse problems and recognition
NASA Astrophysics Data System (ADS)
Patel, Vishal M.
Sparse and redundant representation of data enables the description of signals as linear combinations of a few atoms from a dictionary. In this dissertation, we study applications of sparse and redundant representations in inverse problems and object recognition. Furthermore, we propose two novel imaging modalities based on the recently introduced theory of Compressed Sensing (CS). This dissertation consists of four major parts. In the first part of the dissertation, we study a new type of deconvolution algorithm that is based on estimating the image from a shearlet decomposition. Shearlets provide a multi-directional and multi-scale decomposition that has been mathematically shown to represent distributed discontinuities such as edges better than traditional wavelets. We develop a deconvolution algorithm that allows for the approximation inversion operator to be controlled on a multi-scale and multi-directional basis. Furthermore, we develop a method for the automatic determination of the threshold values for the noise shrinkage for each scale and direction without explicit knowledge of the noise variance using a generalized cross validation method. In the second part of the dissertation, we study a reconstruction method that recovers highly undersampled images assumed to have a sparse representation in a gradient domain by using partial measurement samples that are collected in the Fourier domain. Our method makes use of a robust generalized Poisson solver that greatly aids in achieving a significantly improved performance over similar proposed methods. We will demonstrate by experiments that this new technique is more flexible to work with either random or restricted sampling scenarios better than its competitors. In the third part of the dissertation, we introduce a novel Synthetic Aperture Radar (SAR) imaging modality which can provide a high resolution map of the spatial distribution of targets and terrain using a significantly reduced number of needed transmitted and/or received electromagnetic waveforms. We demonstrate that this new imaging scheme, requires no new hardware components and allows the aperture to be compressed. Also, it presents many new applications and advantages which include strong resistance to countermesasures and interception, imaging much wider swaths and reduced on-board storage requirements. The last part of the dissertation deals with object recognition based on learning dictionaries for simultaneous sparse signal approximations and feature extraction. A dictionary is learned for each object class based on given training examples which minimize the representation error with a sparseness constraint. A novel test image is then projected onto the span of the atoms in each learned dictionary. The residual vectors along with the coefficients are then used for recognition. Applications to illumination robust face recognition and automatic target recognition are presented.
Neural networks: Alternatives to conventional techniques for automatic docking
NASA Technical Reports Server (NTRS)
Vinz, Bradley L.
1994-01-01
Automatic docking of orbiting spacecraft is a crucial operation involving the identification of vehicle orientation as well as complex approach dynamics. The chaser spacecraft must be able to recognize the target spacecraft within a scene and achieve accurate closing maneuvers. In a video-based system, a target scene must be captured and transformed into a pattern of pixels. Successful recognition lies in the interpretation of this pattern. Due to their powerful pattern recognition capabilities, artificial neural networks offer a potential role in interpretation and automatic docking processes. Neural networks can reduce the computational time required by existing image processing and control software. In addition, neural networks are capable of recognizing and adapting to changes in their dynamic environment, enabling enhanced performance, redundancy, and fault tolerance. Most neural networks are robust to failure, capable of continued operation with a slight degradation in performance after minor failures. This paper discusses the particular automatic docking tasks neural networks can perform as viable alternatives to conventional techniques.
A neural approach for improving the measurement capability of an electronic nose
NASA Astrophysics Data System (ADS)
Chimenti, M.; DeRossi, D.; Di Francesco, F.; Domenici, C.; Pieri, G.; Pioggia, G.; Salvetti, O.
2003-06-01
Electronic noses, instruments for automatic recognition of odours, are typically composed of an array of partially selective sensors, a sampling system, a data acquisition device and a data processing system. For the purpose of evaluating the quality of olive oil, an electronic nose based on an array of conducting polymer sensors capable of discriminating olive oil aromas was developed. The selection of suitable pattern recognition techniques for a particular application can enhance the performance of electronic noses. Therefore, an advanced neural recognition algorithm for improving the measurement capability of the device was designed and implemented. This method combines multivariate statistical analysis and a hierarchical neural-network architecture based on self-organizing maps and error back-propagation. The complete system was tested using samples composed of characteristic olive oil aromatic components in refined olive oil. The results obtained have shown that this approach is effective in grouping aromas into different categories representative of their chemical structure.
Nguyen, Dat Tien; Pham, Tuyen Danh; Baek, Na Rae; Park, Kang Ryoung
2018-01-01
Although face recognition systems have wide application, they are vulnerable to presentation attack samples (fake samples). Therefore, a presentation attack detection (PAD) method is required to enhance the security level of face recognition systems. Most of the previously proposed PAD methods for face recognition systems have focused on using handcrafted image features, which are designed by expert knowledge of designers, such as Gabor filter, local binary pattern (LBP), local ternary pattern (LTP), and histogram of oriented gradients (HOG). As a result, the extracted features reflect limited aspects of the problem, yielding a detection accuracy that is low and varies with the characteristics of presentation attack face images. The deep learning method has been developed in the computer vision research community, which is proven to be suitable for automatically training a feature extractor that can be used to enhance the ability of handcrafted features. To overcome the limitations of previously proposed PAD methods, we propose a new PAD method that uses a combination of deep and handcrafted features extracted from the images by visible-light camera sensor. Our proposed method uses the convolutional neural network (CNN) method to extract deep image features and the multi-level local binary pattern (MLBP) method to extract skin detail features from face images to discriminate the real and presentation attack face images. By combining the two types of image features, we form a new type of image features, called hybrid features, which has stronger discrimination ability than single image features. Finally, we use the support vector machine (SVM) method to classify the image features into real or presentation attack class. Our experimental results indicate that our proposed method outperforms previous PAD methods by yielding the smallest error rates on the same image databases. PMID:29495417
Nguyen, Dat Tien; Pham, Tuyen Danh; Baek, Na Rae; Park, Kang Ryoung
2018-02-26
Although face recognition systems have wide application, they are vulnerable to presentation attack samples (fake samples). Therefore, a presentation attack detection (PAD) method is required to enhance the security level of face recognition systems. Most of the previously proposed PAD methods for face recognition systems have focused on using handcrafted image features, which are designed by expert knowledge of designers, such as Gabor filter, local binary pattern (LBP), local ternary pattern (LTP), and histogram of oriented gradients (HOG). As a result, the extracted features reflect limited aspects of the problem, yielding a detection accuracy that is low and varies with the characteristics of presentation attack face images. The deep learning method has been developed in the computer vision research community, which is proven to be suitable for automatically training a feature extractor that can be used to enhance the ability of handcrafted features. To overcome the limitations of previously proposed PAD methods, we propose a new PAD method that uses a combination of deep and handcrafted features extracted from the images by visible-light camera sensor. Our proposed method uses the convolutional neural network (CNN) method to extract deep image features and the multi-level local binary pattern (MLBP) method to extract skin detail features from face images to discriminate the real and presentation attack face images. By combining the two types of image features, we form a new type of image features, called hybrid features, which has stronger discrimination ability than single image features. Finally, we use the support vector machine (SVM) method to classify the image features into real or presentation attack class. Our experimental results indicate that our proposed method outperforms previous PAD methods by yielding the smallest error rates on the same image databases.
Passive Polarimetric Information Processing for Target Classification
NASA Astrophysics Data System (ADS)
Sadjadi, Firooz; Sadjadi, Farzad
Polarimetric sensing is an area of active research in a variety of applications. In particular, the use of polarization diversity has been shown to improve performance in automatic target detection and recognition. Within the diverse scope of polarimetric sensing, the field of passive polarimetric sensing is of particular interest. This chapter presents several new methods for gathering in formation using such passive techniques. One method extracts three-dimensional (3D) information and surface properties using one or more sensors. Another method extracts scene-specific algebraic expressions that remain unchanged under polariza tion transformations (such as along the transmission path to the sensor).
Data handling and analysis for the 1971 corn blight watch experiment
NASA Technical Reports Server (NTRS)
Anuta, P. E.; Phillips, T. L.
1973-01-01
The overall corn blight watch experiment data flow is described and the organization of the LARS/Purdue data center is discussed. Data analysis techniques are discussed in general and the use of statistical multispectral pattern recognition methods for automatic computer analysis of aircraft scanner data is described. Some of the results obtained are discussed and the implications of the experiment on future data communication requirements for earth resource survey systems is discussed.
2013-09-30
method has been successfully implemented to automatically detect and recognize pulse trains from minke whales ( songs ) and sperm whales (Physeter...workshops, conferences and data challenges 2. Enhancements of the ASR algorithm for frequency-modulated sounds: Right Whale Study 3...Enhancements of the ASR algorithm for pulse trains: Minke Whale Study 4. Mining Big Data Sound Archives using High Performance Computing software and hardware
Multilingual Vocabularies in Automatic Speech Recognition
2000-08-01
monolingual (a few thousands) is an obstacle to a full generalization of the inventories, then moved to the multilingual case. In the approach towards the...direction of language independence. In this monolingual experiment, we developed two types of unit sets for paper, we extend the method presented in [3...sound ji is not assimilated 3.2.1 Monolingual experiments to the corresponding sound in Spanish, but it is left apart as a The baseline model for English
Facial Asymmetry-Based Age Group Estimation: Role in Recognizing Age-Separated Face Images.
Sajid, Muhammad; Taj, Imtiaz Ahmad; Bajwa, Usama Ijaz; Ratyal, Naeem Iqbal
2018-04-23
Face recognition aims to establish the identity of a person based on facial characteristics. On the other hand, age group estimation is the automatic calculation of an individual's age range based on facial features. Recognizing age-separated face images is still a challenging research problem due to complex aging processes involving different types of facial tissues, skin, fat, muscles, and bones. Certain holistic and local facial features are used to recognize age-separated face images. However, most of the existing methods recognize face images without incorporating the knowledge learned from age group estimation. In this paper, we propose an age-assisted face recognition approach to handle aging variations. Inspired by the observation that facial asymmetry is an age-dependent intrinsic facial feature, we first use asymmetric facial dimensions to estimate the age group of a given face image. Deeply learned asymmetric facial features are then extracted for face recognition using a deep convolutional neural network (dCNN). Finally, we integrate the knowledge learned from the age group estimation into the face recognition algorithm using the same dCNN. This integration results in a significant improvement in the overall performance compared to using the face recognition algorithm alone. The experimental results on two large facial aging datasets, the MORPH and FERET sets, show that the proposed age group estimation based on the face recognition approach yields superior performance compared to some existing state-of-the-art methods. © 2018 American Academy of Forensic Sciences.
Noise-robust speech recognition through auditory feature detection and spike sequence decoding.
Schafer, Phillip B; Jin, Dezhe Z
2014-03-01
Speech recognition in noisy conditions is a major challenge for computer systems, but the human brain performs it routinely and accurately. Automatic speech recognition (ASR) systems that are inspired by neuroscience can potentially bridge the performance gap between humans and machines. We present a system for noise-robust isolated word recognition that works by decoding sequences of spikes from a population of simulated auditory feature-detecting neurons. Each neuron is trained to respond selectively to a brief spectrotemporal pattern, or feature, drawn from the simulated auditory nerve response to speech. The neural population conveys the time-dependent structure of a sound by its sequence of spikes. We compare two methods for decoding the spike sequences--one using a hidden Markov model-based recognizer, the other using a novel template-based recognition scheme. In the latter case, words are recognized by comparing their spike sequences to template sequences obtained from clean training data, using a similarity measure based on the length of the longest common sub-sequence. Using isolated spoken digits from the AURORA-2 database, we show that our combined system outperforms a state-of-the-art robust speech recognizer at low signal-to-noise ratios. Both the spike-based encoding scheme and the template-based decoding offer gains in noise robustness over traditional speech recognition methods. Our system highlights potential advantages of spike-based acoustic coding and provides a biologically motivated framework for robust ASR development.
ASM Based Synthesis of Handwritten Arabic Text Pages
Al-Hamadi, Ayoub; Elzobi, Moftah; El-etriby, Sherif; Ghoneim, Ahmed
2015-01-01
Document analysis tasks, as text recognition, word spotting, or segmentation, are highly dependent on comprehensive and suitable databases for training and validation. However their generation is expensive in sense of labor and time. As a matter of fact, there is a lack of such databases, which complicates research and development. This is especially true for the case of Arabic handwriting recognition, that involves different preprocessing, segmentation, and recognition methods, which have individual demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents and detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step ASM based representations are composed to words and text pages, smoothed by B-Spline interpolation and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages to train and test document analysis related methods on synthetic samples, whenever no sufficient natural ground truthed data is available. PMID:26295059
Xu, Jing; Wang, Zhongbin; Tan, Chao; Si, Lei; Liu, Xinhua
2015-01-01
In order to guarantee the stable operation of shearers and promote construction of an automatic coal mining working face, an online cutting pattern recognition method with high accuracy and speed based on Improved Ensemble Empirical Mode Decomposition (IEEMD) and Probabilistic Neural Network (PNN) is proposed. An industrial microphone is installed on the shearer and the cutting sound is collected as the recognition criterion to overcome the disadvantages of giant size, contact measurement and low identification rate of traditional detectors. To avoid end-point effects and get rid of undesirable intrinsic mode function (IMF) components in the initial signal, IEEMD is conducted on the sound. The end-point continuation based on the practical storage data is performed first to overcome the end-point effect. Next the average correlation coefficient, which is calculated by the correlation of the first IMF with others, is introduced to select essential IMFs. Then the energy and standard deviation of the reminder IMFs are extracted as features and PNN is applied to classify the cutting patterns. Finally, a simulation example, with an accuracy of 92.67%, and an industrial application prove the efficiency and correctness of the proposed method. PMID:26528985
ASM Based Synthesis of Handwritten Arabic Text Pages.
Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-Etriby, Sherif; Ghoneim, Ahmed
2015-01-01
Document analysis tasks, as text recognition, word spotting, or segmentation, are highly dependent on comprehensive and suitable databases for training and validation. However their generation is expensive in sense of labor and time. As a matter of fact, there is a lack of such databases, which complicates research and development. This is especially true for the case of Arabic handwriting recognition, that involves different preprocessing, segmentation, and recognition methods, which have individual demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents and detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step ASM based representations are composed to words and text pages, smoothed by B-Spline interpolation and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages to train and test document analysis related methods on synthetic samples, whenever no sufficient natural ground truthed data is available.
Puzzle test: A tool for non-analytical clinical reasoning assessment.
Monajemi, Alireza; Yaghmaei, Minoo
2016-01-01
Most contemporary clinical reasoning tests typically assess non-automatic thinking. Therefore, a test is needed to measure automatic reasoning or pattern recognition, which has been largely neglected in clinical reasoning tests. The Puzzle Test (PT) is dedicated to assess automatic clinical reasoning in routine situations. This test has been introduced first in 2009 by Monajemi et al in the Olympiad for Medical Sciences Students.PT is an item format that has gained acceptance in medical education, but no detailed guidelines exist for this test's format, construction and scoring. In this article, a format is described and the steps to prepare and administer valid and reliable PTs are presented. PT examines a specific clinical reasoning task: Pattern recognition. PT does not replace other clinical reasoning assessment tools. However, it complements them in strategies for assessing comprehensive clinical reasoning.
The Fringe Reading Facility at the Max-Planck-Institut fuer Stroemungsforschung
NASA Astrophysics Data System (ADS)
Becker, F.; Meier, G. E. A.; Wegner, H.; Timm, R.; Wenskus, R.
1987-05-01
A Mach-Zehnder interferometer is used for optical flow measurements in a transonic wind tunnel. Holographic interferograms are reconstructed by illumination with a He-Ne-laser and viewed by a video camera through wide angle optics. This setup was used for investigating industrial double exposure holograms of truck tires in order to develop methods of automatic recognition of certain manufacturing faults. Automatic input is achieved by a transient recorder digitizing the output of a TV camera and transferring the digitized data to a PDP11-34. Interest centered around sequences of interferograms showing the interaction of vortices with a profile and subsequent emission of sound generated by this process. The objective is the extraction of quantitative data which relates to the emission of noise.
The Fringe Reading Facility at the Max-Planck-Institut fuer Stroemungsforschung
NASA Technical Reports Server (NTRS)
Becker, F.; Meier, G. E. A.; Wegner, H.; Timm, R.; Wenskus, R.
1987-01-01
A Mach-Zehnder interferometer is used for optical flow measurements in a transonic wind tunnel. Holographic interferograms are reconstructed by illumination with a He-Ne-laser and viewed by a video camera through wide angle optics. This setup was used for investigating industrial double exposure holograms of truck tires in order to develop methods of automatic recognition of certain manufacturing faults. Automatic input is achieved by a transient recorder digitizing the output of a TV camera and transferring the digitized data to a PDP11-34. Interest centered around sequences of interferograms showing the interaction of vortices with a profile and subsequent emission of sound generated by this process. The objective is the extraction of quantitative data which relates to the emission of noise.
Automatic temporal segment detection via bilateral long short-term memory recurrent neural networks
NASA Astrophysics Data System (ADS)
Sun, Bo; Cao, Siming; He, Jun; Yu, Lejun; Li, Liandong
2017-03-01
Constrained by the physiology, the temporal factors associated with human behavior, irrespective of facial movement or body gesture, are described by four phases: neutral, onset, apex, and offset. Although they may benefit related recognition tasks, it is not easy to accurately detect such temporal segments. An automatic temporal segment detection framework using bilateral long short-term memory recurrent neural networks (BLSTM-RNN) to learn high-level temporal-spatial features, which synthesizes the local and global temporal-spatial information more efficiently, is presented. The framework is evaluated in detail over the face and body database (FABO). The comparison shows that the proposed framework outperforms state-of-the-art methods for solving the problem of temporal segment detection.
Combining heterogenous features for 3D hand-held object recognition
NASA Astrophysics Data System (ADS)
Lv, Xiong; Wang, Shuang; Li, Xiangyang; Jiang, Shuqiang
2014-10-01
Object recognition has wide applications in the area of human-machine interaction and multimedia retrieval. However, due to the problem of visual polysemous and concept polymorphism, it is still a great challenge to obtain reliable recognition result for the 2D images. Recently, with the emergence and easy availability of RGB-D equipment such as Kinect, this challenge could be relieved because the depth channel could bring more information. A very special and important case of object recognition is hand-held object recognition, as hand is a straight and natural way for both human-human interaction and human-machine interaction. In this paper, we study the problem of 3D object recognition by combining heterogenous features with different modalities and extraction techniques. For hand-craft feature, although it reserves the low-level information such as shape and color, it has shown weakness in representing hiconvolutionalgh-level semantic information compared with the automatic learned feature, especially deep feature. Deep feature has shown its great advantages in large scale dataset recognition but is not always robust to rotation or scale variance compared with hand-craft feature. In this paper, we propose a method to combine hand-craft point cloud features and deep learned features in RGB and depth channle. First, hand-held object segmentation is implemented by using depth cues and human skeleton information. Second, we combine the extracted hetegerogenous 3D features in different stages using linear concatenation and multiple kernel learning (MKL). Then a training model is used to recognize 3D handheld objects. Experimental results validate the effectiveness and gerneralization ability of the proposed method.
Automatic assessment of voice quality according to the GRBAS scale.
Sáenz-Lechón, Nicolás; Godino-Llorente, Juan I; Osma-Ruiz, Víctor; Blanco-Velasco, Manuel; Cruz-Roldán, Fernando
2006-01-01
Nowadays, the most extended techniques to measure the voice quality are based on perceptual evaluation by well trained professionals. The GRBAS scale is a widely used method for perceptual evaluation of voice quality. The GRBAS scale is widely used in Japan and there is increasing interest in both Europe and the United States. However, this technique needs well-trained experts, and is based on the evaluator's expertise, depending a lot on his own psycho-physical state. Furthermore, a great variability in the assessments performed from one evaluator to another is observed. Therefore, an objective method to provide such measurement of voice quality would be very valuable. In this paper, the automatic assessment of voice quality is addressed by means of short-term Mel cepstral parameters (MFCC), and learning vector quantization (LVQ) in a pattern recognition stage. Results show that this approach provides acceptable results for this purpose, with accuracy around 65% at the best.
A method for real-time implementation of HOG feature extraction
NASA Astrophysics Data System (ADS)
Luo, Hai-bo; Yu, Xin-rong; Liu, Hong-mei; Ding, Qing-hai
2011-08-01
Histogram of oriented gradient (HOG) is an efficient feature extraction scheme, and HOG descriptors are feature descriptors which is widely used in computer vision and image processing for the purpose of biometrics, target tracking, automatic target detection(ATD) and automatic target recognition(ATR) etc. However, computation of HOG feature extraction is unsuitable for hardware implementation since it includes complicated operations. In this paper, the optimal design method and theory frame for real-time HOG feature extraction based on FPGA were proposed. The main principle is as follows: firstly, the parallel gradient computing unit circuit based on parallel pipeline structure was designed. Secondly, the calculation of arctangent and square root operation was simplified. Finally, a histogram generator based on parallel pipeline structure was designed to calculate the histogram of each sub-region. Experimental results showed that the HOG extraction can be implemented in a pixel period by these computing units.
LANDMARK-BASED SPEECH RECOGNITION: REPORT OF THE 2004 JOHNS HOPKINS SUMMER WORKSHOP.
Hasegawa-Johnson, Mark; Baker, James; Borys, Sarah; Chen, Ken; Coogan, Emily; Greenberg, Steven; Juneja, Amit; Kirchhoff, Katrin; Livescu, Karen; Mohan, Srividya; Muller, Jennifer; Sonmez, Kemal; Wang, Tianyu
2005-01-01
Three research prototype speech recognition systems are described, all of which use recently developed methods from artificial intelligence (specifically support vector machines, dynamic Bayesian networks, and maximum entropy classification) in order to implement, in the form of an automatic speech recognizer, current theories of human speech perception and phonology (specifically landmark-based speech perception, nonlinear phonology, and articulatory phonology). All three systems begin with a high-dimensional multiframe acoustic-to-distinctive feature transformation, implemented using support vector machines trained to detect and classify acoustic phonetic landmarks. Distinctive feature probabilities estimated by the support vector machines are then integrated using one of three pronunciation models: a dynamic programming algorithm that assumes canonical pronunciation of each word, a dynamic Bayesian network implementation of articulatory phonology, or a discriminative pronunciation model trained using the methods of maximum entropy classification. Log probability scores computed by these models are then combined, using log-linear combination, with other word scores available in the lattice output of a first-pass recognizer, and the resulting combination score is used to compute a second-pass speech recognition output.
SAR target recognition and posture estimation using spatial pyramid pooling within CNN
NASA Astrophysics Data System (ADS)
Peng, Lijiang; Liu, Xiaohua; Liu, Ming; Dong, Liquan; Hui, Mei; Zhao, Yuejin
2018-01-01
Many convolution neural networks(CNN) architectures have been proposed to strengthen the performance on synthetic aperture radar automatic target recognition (SAR-ATR) and obtained state-of-art results on targets classification on MSTAR database, but few methods concern about the estimation of depression angle and azimuth angle of targets. To get better effect on learning representation of hierarchies of features on both 10-class target classification task and target posture estimation tasks, we propose a new CNN architecture with spatial pyramid pooling(SPP) which can build high hierarchy of features map by dividing the convolved feature maps from finer to coarser levels to aggregate local features of SAR images. Experimental results on MSTAR database show that the proposed architecture can get high recognition accuracy as 99.57% on 10-class target classification task as the most current state-of-art methods, and also get excellent performance on target posture estimation tasks which pays attention to depression angle variety and azimuth angle variety. What's more, the results inspire us the application of deep learning on SAR target posture description.
Restoring the missing features of the corrupted speech using linear interpolation methods
NASA Astrophysics Data System (ADS)
Rassem, Taha H.; Makbol, Nasrin M.; Hasan, Ali Muttaleb; Zaki, Siti Syazni Mohd; Girija, P. N.
2017-10-01
One of the main challenges in the Automatic Speech Recognition (ASR) is the noise. The performance of the ASR system reduces significantly if the speech is corrupted by noise. In spectrogram representation of a speech signal, after deleting low Signal to Noise Ratio (SNR) elements, the incomplete spectrogram is obtained. In this case, the speech recognizer should make modifications to the spectrogram in order to restore the missing elements, which is one direction. In another direction, speech recognizer should be able to restore the missing elements due to deleting low SNR elements before performing the recognition. This is can be done using different spectrogram reconstruction methods. In this paper, the geometrical spectrogram reconstruction methods suggested by some researchers are implemented as a toolbox. In these geometrical reconstruction methods, the linear interpolation along time or frequency methods are used to predict the missing elements between adjacent observed elements in the spectrogram. Moreover, a new linear interpolation method using time and frequency together is presented. The CMU Sphinx III software is used in the experiments to test the performance of the linear interpolation reconstruction method. The experiments are done under different conditions such as different lengths of the window and different lengths of utterances. Speech corpus consists of 20 males and 20 females; each one has two different utterances are used in the experiments. As a result, 80% recognition accuracy is achieved with 25% SNR ratio.
Some effects of stress on users of a voice recognition system: A preliminary inquiry
NASA Astrophysics Data System (ADS)
French, B. A.
1983-03-01
Recent work with Automatic Speech Recognition has focused on applications and productivity considerations in the man-machine interface. This thesis is an attempt to see if placing users of such equipment under time-induced stress has an effect on their percent correct recognition rates. Subjects were given a message-handling task of fixed length and allowed progressively shorter times to attempt to complete it. Questionnaire responses indicate stress levels increased with decreased time-allowance; recognition rates decreased as time was reduced.
2014-03-27
and machine learning for a range of research including such topics as medical imaging [10] and handwriting recognition [11]. The type of feature...1989. [11] C. Bahlmann, B. Haasdonk, and H. Burkhardt, “Online handwriting recognition with support vector machines-a kernel approach,” in Eighth...International Workshop on Frontiers in Handwriting Recognition, pp. 49–54, IEEE, 2002. [12] C. Cortes and V. Vapnik, “Support-vector networks,” Machine
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, X; Mazur, T; Yang, D
Purpose: To investigate an approach of automatically recognizing anatomical sites and imaging views (the orientation of the image acquisition) in 2D X-ray images. Methods: A hierarchical (binary tree) multiclass recognition model was developed to recognize the treatment sites and views in x-ray images. From top to bottom of the tree, the treatment sites are grouped hierarchically from more general to more specific. Each node in the hierarchical model was designed to assign images to one of two categories of anatomical sites. The binary image classification function of each node in the hierarchical model is implemented by using a PCA transformationmore » and a support vector machine (SVM) model. The optimal PCA transformation matrices and SVM models are obtained by learning from a set of sample images. Alternatives of the hierarchical model were developed to support three scenarios of site recognition that may happen in radiotherapy clinics, including two or one X-ray images with or without view information. The performance of the approach was tested with images of 120 patients from six treatment sites – brain, head-neck, breast, lung, abdomen and pelvis – with 20 patients per site and two views (AP and RT) per patient. Results: Given two images in known orthogonal views (AP and RT), the hierarchical model achieved a 99% average F1 score to recognize the six sites. Site specific view recognition models have 100 percent accuracy. The computation time to process a new patient case (preprocessing, site and view recognition) is 0.02 seconds. Conclusion: The proposed hierarchical model of site and view recognition is effective and computationally efficient. It could be useful to automatically and independently confirm the treatment sites and views in daily setup x-ray 2D images. It could also be applied to guide subsequent image processing tasks, e.g. site and view dependent contrast enhancement and image registration. The senior author received research grants from ViewRay Inc. and Varian Medical System.« less
Chen, Xiaoyi; Faviez, Carole; Schuck, Stéphane; Lillo-Le-Louët, Agnès; Texier, Nathalie; Dahamna, Badisse; Huot, Charles; Foulquié, Pierre; Pereira, Suzanne; Leroux, Vincent; Karapetiantz, Pierre; Guenegou-Arnoux, Armelle; Katsahian, Sandrine; Bousquet, Cédric; Burgun, Anita
2018-01-01
Background: The Food and Drug Administration (FDA) in the United States and the European Medicines Agency (EMA) have recognized social media as a new data source to strengthen their activities regarding drug safety. Objective: Our objective in the ADR-PRISM project was to provide text mining and visualization tools to explore a corpus of posts extracted from social media. We evaluated this approach on a corpus of 21 million posts from five patient forums, and conducted a qualitative analysis of the data available on methylphenidate in this corpus. Methods: We applied text mining methods based on named entity recognition and relation extraction in the corpus, followed by signal detection using proportional reporting ratio (PRR). We also used topic modeling based on the Correlated Topic Model to obtain the list of the matics in the corpus and classify the messages based on their topics. Results: We automatically identified 3443 posts about methylphenidate published between 2007 and 2016, among which 61 adverse drug reactions (ADR) were automatically detected. Two pharmacovigilance experts evaluated manually the quality of automatic identification, and a f-measure of 0.57 was reached. Patient's reports were mainly neuro-psychiatric effects. Applying PRR, 67% of the ADRs were signals, including most of the neuro-psychiatric symptoms but also palpitations. Topic modeling showed that the most represented topics were related to Childhood and Treatment initiation , but also Side effects . Cases of misuse were also identified in this corpus, including recreational use and abuse. Conclusion: Named entity recognition combined with signal detection and topic modeling have demonstrated their complementarity in mining social media data. An in-depth analysis focused on methylphenidate showed that this approach was able to detect potential signals and to provide better understanding of patients' behaviors regarding drugs, including misuse.
Assessing the impact of graphical quality on automatic text recognition in digital maps
NASA Astrophysics Data System (ADS)
Chiang, Yao-Yi; Leyk, Stefan; Honarvar Nazari, Narges; Moghaddam, Sima; Tan, Tian Xiang
2016-08-01
Converting geographic features (e.g., place names) in map images into a vector format is the first step for incorporating cartographic information into a geographic information system (GIS). With the advancement in computational power and algorithm design, map processing systems have been considerably improved over the last decade. However, the fundamental map processing techniques such as color image segmentation, (map) layer separation, and object recognition are sensitive to minor variations in graphical properties of the input image (e.g., scanning resolution). As a result, most map processing results would not meet user expectations if the user does not "properly" scan the map of interest, pre-process the map image (e.g., using compression or not), and train the processing system, accordingly. These issues could slow down the further advancement of map processing techniques as such unsuccessful attempts create a discouraged user community, and less sophisticated tools would be perceived as more viable solutions. Thus, it is important to understand what kinds of maps are suitable for automatic map processing and what types of results and process-related errors can be expected. In this paper, we shed light on these questions by using a typical map processing task, text recognition, to discuss a number of map instances that vary in suitability for automatic processing. We also present an extensive experiment on a diverse set of scanned historical maps to provide measures of baseline performance of a standard text recognition tool under varying map conditions (graphical quality) and text representations (that can vary even within the same map sheet). Our experimental results help the user understand what to expect when a fully or semi-automatic map processing system is used to process a scanned map with certain (varying) graphical properties and complexities in map content.
Integrated approach for automatic target recognition using a network of collaborative sensors.
Mahalanobis, Abhijit; Van Nevel, Alan
2006-10-01
We introduce what is believed to be a novel concept by which several sensors with automatic target recognition (ATR) capability collaborate to recognize objects. Such an approach would be suitable for netted systems in which the sensors and platforms can coordinate to optimize end-to-end performance. We use correlation filtering techniques to facilitate the development of the concept, although other ATR algorithms may be easily substituted. Essentially, a self-configuring geometry of netted platforms is proposed that positions the sensors optimally with respect to each other, and takes into account the interactions among the sensor, the recognition algorithms, and the classes of the objects to be recognized. We show how such a paradigm optimizes overall performance, and illustrate the collaborative ATR scheme for recognizing targets in synthetic aperture radar imagery by using viewing position as a sensor parameter.
A novel fully automatic scheme for fiducial marker-based alignment in electron tomography.
Han, Renmin; Wang, Liansan; Liu, Zhiyong; Sun, Fei; Zhang, Fa
2015-12-01
Although the topic of fiducial marker-based alignment in electron tomography (ET) has been widely discussed for decades, alignment without human intervention remains a difficult problem. Specifically, the emergence of subtomogram averaging has increased the demand for batch processing during tomographic reconstruction; fully automatic fiducial marker-based alignment is the main technique in this process. However, the lack of an accurate method for detecting and tracking fiducial markers precludes fully automatic alignment. In this paper, we present a novel, fully automatic alignment scheme for ET. Our scheme has two main contributions: First, we present a series of algorithms to ensure a high recognition rate and precise localization during the detection of fiducial markers. Our proposed solution reduces fiducial marker detection to a sampling and classification problem and further introduces an algorithm to solve the parameter dependence of marker diameter and marker number. Second, we propose a novel algorithm to solve the tracking of fiducial markers by reducing the tracking problem to an incomplete point set registration problem. Because a global optimization of a point set registration occurs, the result of our tracking is independent of the initial image position in the tilt series, allowing for the robust tracking of fiducial markers without pre-alignment. The experimental results indicate that our method can achieve an accurate tracking, almost identical to the current best one in IMOD with half automatic scheme. Furthermore, our scheme is fully automatic, depends on fewer parameters (only requires a gross value of the marker diameter) and does not require any manual interaction, providing the possibility of automatic batch processing of electron tomographic reconstruction. Copyright © 2015 Elsevier Inc. All rights reserved.
A beat-to-beat calculator for the diastolic pressure time index and the tension time index.
Nose, Y; Tajimi, T; Watanabe, Y; Yokota, M; Akazawa, K; Nakamura, M
1987-01-01
We have developed a beat-to-beat calculator which can calculate in real-time the ratio of the diastolic pressure time index (DPTI), and the tension time index (TTI) as an index of the myocardial oxygen supply/demand balance. Physicians set up presumed value for the left ventricular endodiastolic pressure, a search area for the dicrotic notch, a threshold for the onset of the up-slope and the corresponding value of the calibration signal on the digital switches of the calculator. Next, the arterial pressure analog signal is input into the calculator. The calculator searches automatically for both the onset of the up-slope and the dicrotic notch. The arterial pressure curve is displayed beat-to-beat with the recognized onset and the dicrotic notch on the CRT to be confirmed by physicians. When physicians do not agree with the automatic recognition they can fit the automatic recognition to the observation. If the recognition of the onset is inadequate, the threshold can be re-adjusted to trigger the onset. If recognition of the dicrotic notch is inadequate, the physician can adjust the search-area. Therefore, physicians who operate the calculator can rely on the calculated DPTI/TTI. This calculator can continuously monitor the myocardial oxygen supply/demand balance in patients with acute myocardial infarction or just after open-heart surgery.
Flexible methods for segmentation evaluation: Results from CT-based luggage screening
Karimi, Seemeen; Jiang, Xiaoqian; Cosman, Pamela; Martz, Harry
2017-01-01
BACKGROUND Imaging systems used in aviation security include segmentation algorithms in an automatic threat recognition pipeline. The segmentation algorithms evolve in response to emerging threats and changing performance requirements. Analysis of segmentation algorithms’ behavior, including the nature of errors and feature recovery, facilitates their development. However, evaluation methods from the literature provide limited characterization of the segmentation algorithms. OBJECTIVE To develop segmentation evaluation methods that measure systematic errors such as oversegmentation and undersegmentation, outliers, and overall errors. The methods must measure feature recovery and allow us to prioritize segments. METHODS We developed two complementary evaluation methods using statistical techniques and information theory. We also created a semi-automatic method to define ground truth from 3D images. We applied our methods to evaluate five segmentation algorithms developed for CT luggage screening. We validated our methods with synthetic problems and an observer evaluation. RESULTS Both methods selected the same best segmentation algorithm. Human evaluation confirmed the findings. The measurement of systematic errors and prioritization helped in understanding the behavior of each segmentation algorithm. CONCLUSIONS Our evaluation methods allow us to measure and explain the accuracy of segmentation algorithms. PMID:24699346
Learning Spatio-Temporal Representations for Action Recognition: A Genetic Programming Approach.
Liu, Li; Shao, Ling; Li, Xuelong; Lu, Ke
2016-01-01
Extracting discriminative and robust features from video sequences is the first and most critical step in human action recognition. In this paper, instead of using handcrafted features, we automatically learn spatio-temporal motion features for action recognition. This is achieved via an evolutionary method, i.e., genetic programming (GP), which evolves the motion feature descriptor on a population of primitive 3D operators (e.g., 3D-Gabor and wavelet). In this way, the scale and shift invariant features can be effectively extracted from both color and optical flow sequences. We intend to learn data adaptive descriptors for different datasets with multiple layers, which makes fully use of the knowledge to mimic the physical structure of the human visual cortex for action recognition and simultaneously reduce the GP searching space to effectively accelerate the convergence of optimal solutions. In our evolutionary architecture, the average cross-validation classification error, which is calculated by an support-vector-machine classifier on the training set, is adopted as the evaluation criterion for the GP fitness function. After the entire evolution procedure finishes, the best-so-far solution selected by GP is regarded as the (near-)optimal action descriptor obtained. The GP-evolving feature extraction method is evaluated on four popular action datasets, namely KTH, HMDB51, UCF YouTube, and Hollywood2. Experimental results show that our method significantly outperforms other types of features, either hand-designed or machine-learned.
Multi-objects recognition for distributed intelligent sensor networks
NASA Astrophysics Data System (ADS)
He, Haibo; Chen, Sheng; Cao, Yuan; Desai, Sachi; Hohil, Myron E.
2008-04-01
This paper proposes an innovative approach for multi-objects recognition for homeland security and defense based intelligent sensor networks. Unlike the conventional way of information analysis, data mining in such networks is typically characterized with high information ambiguity/uncertainty, data redundancy, high dimensionality and real-time constrains. Furthermore, since a typical military based network normally includes multiple mobile sensor platforms, ground forces, fortified tanks, combat flights, and other resources, it is critical to develop intelligent data mining approaches to fuse different information resources to understand dynamic environments, to support decision making processes, and finally to achieve the goals. This paper aims to address these issues with a focus on multi-objects recognition. Instead of classifying a single object as in the traditional image classification problems, the proposed method can automatically learn multiple objectives simultaneously. Image segmentation techniques are used to identify the interesting regions in the field, which correspond to multiple objects such as soldiers or tanks. Since different objects will come with different feature sizes, we propose a feature scaling method to represent each object in the same number of dimensions. This is achieved by linear/nonlinear scaling and sampling techniques. Finally, support vector machine (SVM) based learning algorithms are developed to learn and build the associations for different objects, and such knowledge will be adaptively accumulated for objects recognition in the testing stage. We test the effectiveness of proposed method in different simulated military environments.
NASA Astrophysics Data System (ADS)
Wang, Deng-wei; Zhang, Tian-xu; Shi, Wen-jun; Wei, Long-sheng; Wang, Xiao-ping; Ao, Guo-qing
2009-07-01
Infrared images at sea background are notorious for the low signal-to-noise ratio, therefore, the target recognition of infrared image through traditional methods is very difficult. In this paper, we present a novel target recognition method based on the integration of visual attention computational model and conventional approach (selective filtering and segmentation). The two distinct techniques for image processing are combined in a manner to utilize the strengths of both. The visual attention algorithm searches the salient regions automatically, and represented them by a set of winner points, at the same time, demonstrated the salient regions in terms of circles centered at these winner points. This provides a priori knowledge for the filtering and segmentation process. Based on the winner point, we construct a rectangular region to facilitate the filtering and segmentation, then the labeling operation will be added selectively by requirement. Making use of the labeled information, from the final segmentation result we obtain the positional information of the interested region, label the centroid on the corresponding original image, and finish the localization for the target. The cost time does not depend on the size of the image but the salient regions, therefore the consumed time is greatly reduced. The method is used in the recognition of several kinds of real infrared images, and the experimental results reveal the effectiveness of the algorithm presented in this paper.
Electrophysiological Evidence of Automatic Early Semantic Processing
ERIC Educational Resources Information Center
Hinojosa, Jose A.; Martin-Loeches, Manuel; Munoz, Francisco; Casado, Pilar; Pozo, Miguel A.
2004-01-01
This study investigates the automatic-controlled nature of early semantic processing by means of the Recognition Potential (RP), an event-related potential response that reflects lexical selection processes. For this purpose tasks differing in their processing requirements were used. Half of the participants performed a physical task involving a…
NASA Astrophysics Data System (ADS)
Megherbi, Dalila B.; Yan, Yin; Tanmay, Parikh; Khoury, Jed; Woods, C. L.
2004-11-01
Recently surveillance and Automatic Target Recognition (ATR) applications are increasing as the cost of computing power needed to process the massive amount of information continues to fall. This computing power has been made possible partly by the latest advances in FPGAs and SOPCs. In particular, to design and implement state-of-the-Art electro-optical imaging systems to provide advanced surveillance capabilities, there is a need to integrate several technologies (e.g. telescope, precise optics, cameras, image/compute vision algorithms, which can be geographically distributed or sharing distributed resources) into a programmable system and DSP systems. Additionally, pattern recognition techniques and fast information retrieval, are often important components of intelligent systems. The aim of this work is using embedded FPGA as a fast, configurable and synthesizable search engine in fast image pattern recognition/retrieval in a distributed hardware/software co-design environment. In particular, we propose and show a low cost Content Addressable Memory (CAM)-based distributed embedded FPGA hardware architecture solution with real time recognition capabilities and computing for pattern look-up, pattern recognition, and image retrieval. We show how the distributed CAM-based architecture offers a performance advantage of an order-of-magnitude over RAM-based architecture (Random Access Memory) search for implementing high speed pattern recognition for image retrieval. The methods of designing, implementing, and analyzing the proposed CAM based embedded architecture are described here. Other SOPC solutions/design issues are covered. Finally, experimental results, hardware verification, and performance evaluations using both the Xilinx Virtex-II and the Altera Apex20k are provided to show the potential and power of the proposed method for low cost reconfigurable fast image pattern recognition/retrieval at the hardware/software co-design level.
NASA Astrophysics Data System (ADS)
Matsumoto, Monica M. S.; Beig, Niha G.; Udupa, Jayaram K.; Archer, Steven; Torigian, Drew A.
2014-03-01
Lung cancer is associated with the highest cancer mortality rates among men and women in the United States. The accurate and precise identification of the lymph node stations on computed tomography (CT) images is important for staging disease and potentially for prognosticating outcome in patients with lung cancer, as well as for pretreatment planning and response assessment purposes. To facilitate a standard means of referring to lymph nodes, the International Association for the Study of Lung Cancer (IASLC) has recently proposed a definition of the different lymph node stations and zones in the thorax. However, nodal station identification is typically performed manually by visual assessment in clinical radiology. This approach leaves room for error due to the subjective and potentially ambiguous nature of visual interpretation, and is labor intensive. We present a method of automatically recognizing the mediastinal IASLC-defined lymph node stations by modifying a hierarchical fuzzy modeling approach previously developed for body-wide automatic anatomy recognition (AAR) in medical imagery. Our AAR-lymph node (AAR-LN) system follows the AAR methodology and consists of two steps. In the first step, the various lymph node stations are manually delineated on a set of CT images following the IASLC definitions. These delineations are then used to build a fuzzy hierarchical model of the nodal stations which are considered as 3D objects. In the second step, the stations are automatically located on any given CT image of the thorax by using the hierarchical fuzzy model and object recognition algorithms. Based on 23 data sets used for model building, 22 independent data sets for testing, and 10 lymph node stations, a mean localization accuracy of within 1-6 voxels has been achieved by the AAR-LN system.
A coloured oil level indicator detection method based on simple linear iterative clustering
NASA Astrophysics Data System (ADS)
Liu, Tianli; Li, Dongsong; Jiao, Zhiming; Liang, Tao; Zhou, Hao; Yang, Guoqing
2017-12-01
A detection method of coloured oil level indicator is put forward. The method is applied to inspection robot in substation, which realized the automatic inspection and recognition of oil level indicator. Firstly, the detected image of the oil level indicator is collected, and the detected image is clustered and segmented to obtain the label matrix of the image. Secondly, the detection image is processed by colour space transformation, and the feature matrix of the image is obtained. Finally, the label matrix and feature matrix are used to locate and segment the detected image, and the upper edge of the recognized region is obtained. If the upper limb line exceeds the preset oil level threshold, the alarm will alert the station staff. Through the above-mentioned image processing, the inspection robot can independently recognize the oil level of the oil level indicator, and instead of manual inspection. It embodies the automatic and intelligent level of unattended operation.
NASA Technical Reports Server (NTRS)
Badhwar, G. D.
1984-01-01
The techniques used initially for the identification of cultivated crops from Landsat imagery depended greatly on the iterpretation of film products by a human analyst. This approach was not very effective and objective. Since 1978, new methods for crop identification are being developed. Badhwar et al. (1982) showed that multitemporal-multispectral data could be reduced to a simple feature space of alpha and beta and that these features would separate corn and soybean very well. However, there are disadvantages related to the use of alpha and beta parameters. The present investigation is concerned with a suitable method for extracting the required features. Attention is given to a profile model for crop discrimination, corn-soybean separation using profile parameters, and an automatic labeling (target recognition) method. The developed technique is extended to obtain a procedure which makes it possible to estimate the crop proportion of corn and soybean from Landsat data early in the growing season.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Santhanam, A; Min, Y; Beron, P
Purpose: Patient safety hazards such as a wrong patient/site getting treated can lead to catastrophic results. The purpose of this project is to automatically detect potential patient safety hazards during the radiotherapy setup and alert the therapist before the treatment is initiated. Methods: We employed a set of co-located and co-registered 3D cameras placed inside the treatment room. Each camera provided a point-cloud of fraxels (fragment pixels with 3D depth information). Each of the cameras were calibrated using a custom-built calibration target to provide 3D information with less than 2 mm error in the 500 mm neighborhood around the isocenter.more » To identify potential patient safety hazards, the treatment room components and the patient’s body needed to be identified and tracked in real-time. For feature recognition purposes, we used a graph-cut based feature recognition with principal component analysis (PCA) based feature-to-object correlation to segment the objects in real-time. Changes in the object’s position were tracked using the CamShift algorithm. The 3D object information was then stored for each classified object (e.g. gantry, couch). A deep learning framework was then used to analyze all the classified objects in both 2D and 3D and was then used to fine-tune a convolutional network for object recognition. The number of network layers were optimized to identify the tracked objects with >95% accuracy. Results: Our systematic analyses showed that, the system was effectively able to recognize wrong patient setups and wrong patient accessories. The combined usage of 2D camera information (color + depth) enabled a topology-preserving approach to verify patient safety hazards in an automatic manner and even in scenarios where the depth information is partially available. Conclusion: By utilizing the 3D cameras inside the treatment room and a deep learning based image classification, potential patient safety hazards can be effectively avoided.« less
Feature Extraction and Selection Strategies for Automated Target Recognition
NASA Technical Reports Server (NTRS)
Greene, W. Nicholas; Zhang, Yuhan; Lu, Thomas T.; Chao, Tien-Hsin
2010-01-01
Several feature extraction and selection methods for an existing automatic target recognition (ATR) system using JPLs Grayscale Optical Correlator (GOC) and Optimal Trade-Off Maximum Average Correlation Height (OT-MACH) filter were tested using MATLAB. The ATR system is composed of three stages: a cursory region of-interest (ROI) search using the GOC and OT-MACH filter, a feature extraction and selection stage, and a final classification stage. Feature extraction and selection concerns transforming potential target data into more useful forms as well as selecting important subsets of that data which may aide in detection and classification. The strategies tested were built around two popular extraction methods: Principal Component Analysis (PCA) and Independent Component Analysis (ICA). Performance was measured based on the classification accuracy and free-response receiver operating characteristic (FROC) output of a support vector machine(SVM) and a neural net (NN) classifier.
Feature extraction and selection strategies for automated target recognition
NASA Astrophysics Data System (ADS)
Greene, W. Nicholas; Zhang, Yuhan; Lu, Thomas T.; Chao, Tien-Hsin
2010-04-01
Several feature extraction and selection methods for an existing automatic target recognition (ATR) system using JPLs Grayscale Optical Correlator (GOC) and Optimal Trade-Off Maximum Average Correlation Height (OT-MACH) filter were tested using MATLAB. The ATR system is composed of three stages: a cursory regionof- interest (ROI) search using the GOC and OT-MACH filter, a feature extraction and selection stage, and a final classification stage. Feature extraction and selection concerns transforming potential target data into more useful forms as well as selecting important subsets of that data which may aide in detection and classification. The strategies tested were built around two popular extraction methods: Principal Component Analysis (PCA) and Independent Component Analysis (ICA). Performance was measured based on the classification accuracy and free-response receiver operating characteristic (FROC) output of a support vector machine(SVM) and a neural net (NN) classifier.
Mixed Pattern Matching-Based Traffic Abnormal Behavior Recognition
Cui, Zhiming; Zhao, Pengpeng
2014-01-01
A motion trajectory is an intuitive representation form in time-space domain for a micromotion behavior of moving target. Trajectory analysis is an important approach to recognize abnormal behaviors of moving targets. Against the complexity of vehicle trajectories, this paper first proposed a trajectory pattern learning method based on dynamic time warping (DTW) and spectral clustering. It introduced the DTW distance to measure the distances between vehicle trajectories and determined the number of clusters automatically by a spectral clustering algorithm based on the distance matrix. Then, it clusters sample data points into different clusters. After the spatial patterns and direction patterns learned from the clusters, a recognition method for detecting vehicle abnormal behaviors based on mixed pattern matching was proposed. The experimental results show that the proposed technical scheme can recognize main types of traffic abnormal behaviors effectively and has good robustness. The real-world application verified its feasibility and the validity. PMID:24605045
Mitrea, Delia; Mitrea, Paulina; Nedevschi, Sergiu; Badea, Radu; Lupsor, Monica; Socaciu, Mihai; Golea, Adela; Hagiu, Claudia; Ciobanu, Lidia
2012-01-01
The noninvasive diagnosis of the malignant tumors is an important issue in research nowadays. Our purpose is to elaborate computerized, texture-based methods for performing computer-aided characterization and automatic diagnosis of these tumors, using only the information from ultrasound images. In this paper, we considered some of the most frequent abdominal malignant tumors: the hepatocellular carcinoma and the colonic tumors. We compared these structures with the benign tumors and with other visually similar diseases. Besides the textural features that proved in our previous research to be useful in the characterization and recognition of the malignant tumors, we improved our method by using the grey level cooccurrence matrix and the edge orientation cooccurrence matrix of superior order. As resulted from our experiments, the new textural features increased the malignant tumor classification performance, also revealing visual and physical properties of these structures that emphasized the complex, chaotic structure of the corresponding tissue. PMID:22312411
Material recognition based on thermal cues: Mechanisms and applications.
Ho, Hsin-Ni
2018-01-01
Some materials feel colder to the touch than others, and we can use this difference in perceived coldness for material recognition. This review focuses on the mechanisms underlying material recognition based on thermal cues. It provides an overview of the physical, perceptual, and cognitive processes involved in material recognition. It also describes engineering domains in which material recognition based on thermal cues have been applied. This includes haptic interfaces that seek to reproduce the sensations associated with contact in virtual environments and tactile sensors aim for automatic material recognition. The review concludes by considering the contributions of this line of research in both science and engineering.
Material recognition based on thermal cues: Mechanisms and applications
Ho, Hsin-Ni
2018-01-01
ABSTRACT Some materials feel colder to the touch than others, and we can use this difference in perceived coldness for material recognition. This review focuses on the mechanisms underlying material recognition based on thermal cues. It provides an overview of the physical, perceptual, and cognitive processes involved in material recognition. It also describes engineering domains in which material recognition based on thermal cues have been applied. This includes haptic interfaces that seek to reproduce the sensations associated with contact in virtual environments and tactile sensors aim for automatic material recognition. The review concludes by considering the contributions of this line of research in both science and engineering. PMID:29687043
Automatically Log Off Upon Disappearance of Facial Image
2005-03-01
log off a PC when the user’s face disappears for an adjustable time interval. Among the fundamental technologies of biometrics, facial recognition is... facial recognition products. In this report, a brief overview of face detection technologies is provided. The particular neural network-based face...ensure that the user logging onto the system is the same person. Among the fundamental technologies of biometrics, facial recognition is the only
Signal recognition and parameter estimation of BPSK-LFM combined modulation
NASA Astrophysics Data System (ADS)
Long, Chao; Zhang, Lin; Liu, Yu
2015-07-01
Intra-pulse analysis plays an important role in electronic warfare. Intra-pulse feature abstraction focuses on primary parameters such as instantaneous frequency, modulation, and symbol rate. In this paper, automatic modulation recognition and feature extraction for combined BPSK-LFM modulation signals based on decision theoretic approach is studied. The simulation results show good recognition effect and high estimation precision, and the system is easy to be realized.
ERIC Educational Resources Information Center
Stinson, Michael; Elliot, Lisa; McKee, Barbara; Coyne, Gina
This report discusses a project that adapted new automatic speech recognition (ASR) technology to provide real-time speech-to-text transcription as a support service for students who are deaf and hard of hearing (D/HH). In this system, as the teacher speaks, a hearing intermediary, or captionist, dictates into the speech recognition system in a…
Chun, Hong-Woo; Tsuruoka, Yoshimasa; Kim, Jin-Dong; Shiba, Rie; Nagata, Naoki; Hishiki, Teruyoshi; Tsujii, Jun'ichi
2006-11-24
Automatic recognition of relations between a specific disease term and its relevant genes or protein terms is an important practice of bioinformatics. Considering the utility of the results of this approach, we identified prostate cancer and gene terms with the ID tags of public biomedical databases. Moreover, considering that genetics experts will use our results, we classified them based on six topics that can be used to analyze the type of prostate cancers, genes, and their relations. We developed a maximum entropy-based named entity recognizer and a relation recognizer and applied them to a corpus-based approach. We collected prostate cancer-related abstracts from MEDLINE, and constructed an annotated corpus of gene and prostate cancer relations based on six topics by biologists. We used it to train the maximum entropy-based named entity recognizer and relation recognizer. Topic-classified relation recognition achieved 92.1% precision for the relation (an increase of 11.0% from that obtained in a baseline experiment). For all topics, the precision was between 67.6 and 88.1%. A series of experimental results revealed two important findings: a carefully designed relation recognition system using named entity recognition can improve the performance of relation recognition, and topic-classified relation recognition can be effectively addressed through a corpus-based approach using manual annotation and machine learning techniques.
A model of traffic signs recognition with convolutional neural network
NASA Astrophysics Data System (ADS)
Hu, Haihe; Li, Yujian; Zhang, Ting; Huo, Yi; Kuang, Wenqing
2016-10-01
In real traffic scenes, the quality of captured images are generally low due to some factors such as lighting conditions, and occlusion on. All of these factors are challengeable for automated recognition algorithms of traffic signs. Deep learning has provided a new way to solve this kind of problems recently. The deep network can automatically learn features from a large number of data samples and obtain an excellent recognition performance. We therefore approach this task of recognition of traffic signs as a general vision problem, with few assumptions related to road signs. We propose a model of Convolutional Neural Network (CNN) and apply the model to the task of traffic signs recognition. The proposed model adopts deep CNN as the supervised learning model, directly takes the collected traffic signs image as the input, alternates the convolutional layer and subsampling layer, and automatically extracts the features for the recognition of the traffic signs images. The proposed model includes an input layer, three convolutional layers, three subsampling layers, a fully-connected layer, and an output layer. To validate the proposed model, the experiments are implemented using the public dataset of China competition of fuzzy image processing. Experimental results show that the proposed model produces a recognition accuracy of 99.01 % on the training dataset, and yield a record of 92% on the preliminary contest within the fourth best.
Automatic anatomy recognition on CT images with pathology
NASA Astrophysics Data System (ADS)
Huang, Lidong; Udupa, Jayaram K.; Tong, Yubing; Odhner, Dewey; Torigian, Drew A.
2016-03-01
Body-wide anatomy recognition on CT images with pathology becomes crucial for quantifying body-wide disease burden. This, however, is a challenging problem because various diseases result in various abnormalities of objects such as shape and intensity patterns. We previously developed an automatic anatomy recognition (AAR) system [1] whose applicability was demonstrated on near normal diagnostic CT images in different body regions on 35 organs. The aim of this paper is to investigate strategies for adapting the previous AAR system to diagnostic CT images of patients with various pathologies as a first step toward automated body-wide disease quantification. The AAR approach consists of three main steps - model building, object recognition, and object delineation. In this paper, within the broader AAR framework, we describe a new strategy for object recognition to handle abnormal images. In the model building stage an optimal threshold interval is learned from near-normal training images for each object. This threshold is optimally tuned to the pathological manifestation of the object in the test image. Recognition is performed following a hierarchical representation of the objects. Experimental results for the abdominal body region based on 50 near-normal images used for model building and 20 abnormal images used for object recognition show that object localization accuracy within 2 voxels for liver and spleen and 3 voxels for kidney can be achieved with the new strategy.
ERIC Educational Resources Information Center
Richler, Jennifer J.; Gauthier, Isabel; Palmeri, Thomas J.
2011-01-01
Are there consequences of calling objects by their names? Lupyan (2008) suggested that overtly labeling objects impairs subsequent recognition memory because labeling shifts stored memory representations of objects toward the category prototype (representational shift hypothesis). In Experiment 1, we show that processing objects at the basic…
Variogram-based feature extraction for neural network recognition of logos
NASA Astrophysics Data System (ADS)
Pham, Tuan D.
2003-03-01
This paper presents a new approach for extracting spatial features of images based on the theory of regionalized variables. These features can be effectively used for automatic recognition of logo images using neural networks. Experimental results on a public-domain logo database show the effectiveness of the proposed approach.
Separating Speed from Accuracy in Beginning Reading Development
ERIC Educational Resources Information Center
Juul, Holger; Poulsen, Mads; Elbro, Carsten
2014-01-01
Phoneme awareness, letter knowledge, and rapid automatized naming (RAN) are well-known kindergarten predictors of later word recognition skills, but it is not clear whether they predict developments in accuracy or speed, or both. The present longitudinal study of 172 Danish beginning readers found that speed of word recognition mainly developed…
Model-based vision using geometric hashing
NASA Astrophysics Data System (ADS)
Akerman, Alexander, III; Patton, Ronald
1991-04-01
The Geometric Hashing technique developed by the NYU Courant Institute has been applied to various automatic target recognition applications. In particular, I-MATH has extended the hashing algorithm to perform automatic target recognition ofsynthetic aperture radar (SAR) imagery. For this application, the hashing is performed upon the geometric locations of dominant scatterers. In addition to being a robust model-based matching algorithm -- invariant under translation, scale, and 3D rotations of the target -- hashing is of particular utility because it can still perform effective matching when the target is partially obscured. Moreover, hashing is very amenable to a SIMD parallel processing architecture, and thus potentially realtime implementable.
Valente, João; Vieira, Pedro M; Couto, Carlos; Lima, Carlos S
2018-02-01
Poor brain extraction in Magnetic Resonance Imaging (MRI) has negative consequences in several types of brain post-extraction such as tissue segmentation and related statistical measures or pattern recognition algorithms. Current state of the art algorithms for brain extraction work on weighted T1 and T2, being not adequate for non-whole brain images such as the case of T2*FLASH@7T partial volumes. This paper proposes two new methods that work directly in T2*FLASH@7T partial volumes. The first is an improvement of the semi-automatic threshold-with-morphology approach adapted to incomplete volumes. The second method uses an improved version of a current implementation of the fuzzy c-means algorithm with bias correction for brain segmentation. Under high inhomogeneity conditions the performance of the first method degrades, requiring user intervention which is unacceptable. The second method performed well for all volumes, being entirely automatic. State of the art algorithms for brain extraction are mainly semi-automatic, requiring a correct initialization by the user and knowledge of the software. These methods can't deal with partial volumes and/or need information from atlas which is not available in T2*FLASH@7T. Also, combined volumes suffer from manipulations such as re-sampling which deteriorates significantly voxel intensity structures making segmentation tasks difficult. The proposed method can overcome all these difficulties, reaching good results for brain extraction using only T2*FLASH@7T volumes. The development of this work will lead to an improvement of automatic brain lesions segmentation in T2*FLASH@7T volumes, becoming more important when lesions such as cortical Multiple-Sclerosis need to be detected. Copyright © 2017 Elsevier B.V. All rights reserved.
Contour matching for a fish recognition and migration-monitoring system
NASA Astrophysics Data System (ADS)
Lee, Dah-Jye; Schoenberger, Robert B.; Shiozawa, Dennis; Xu, Xiaoqian; Zhan, Pengcheng
2004-12-01
Fish migration is being monitored year round to provide valuable information for the study of behavioral responses of fish to environmental variations. However, currently all monitoring is done by human observers. An automatic fish recognition and migration monitoring system is more efficient and can provide more accurate data. Such a system includes automatic fish image acquisition, contour extraction, fish categorization, and data storage. Shape is a very important characteristic and shape analysis and shape matching are studied for fish recognition. Previous work focused on finding critical landmark points on fish shape using curvature function analysis. Fish recognition based on landmark points has shown satisfying results. However, the main difficulty of this approach is that landmark points sometimes cannot be located very accurately. Whole shape matching is used for fish recognition in this paper. Several shape descriptors, such as Fourier descriptors, polygon approximation and line segments, are tested. A power cepstrum technique has been developed in order to improve the categorization speed using contours represented in tangent space with normalized length. Design and integration including image acquisition, contour extraction and fish categorization are discussed in this paper. Fish categorization results based on shape analysis and shape matching are also included.
Multi-Task Convolutional Neural Network for Pose-Invariant Face Recognition
NASA Astrophysics Data System (ADS)
Yin, Xi; Liu, Xiaoming
2018-02-01
This paper explores multi-task learning (MTL) for face recognition. We answer the questions of how and why MTL can improve the face recognition performance. First, we propose a multi-task Convolutional Neural Network (CNN) for face recognition where identity classification is the main task and pose, illumination, and expression estimations are the side tasks. Second, we develop a dynamic-weighting scheme to automatically assign the loss weight to each side task, which is a crucial problem in MTL. Third, we propose a pose-directed multi-task CNN by grouping different poses to learn pose-specific identity features, simultaneously across all poses. Last but not least, we propose an energy-based weight analysis method to explore how CNN-based MTL works. We observe that the side tasks serve as regularizations to disentangle the variations from the learnt identity features. Extensive experiments on the entire Multi-PIE dataset demonstrate the effectiveness of the proposed approach. To the best of our knowledge, this is the first work using all data in Multi-PIE for face recognition. Our approach is also applicable to in-the-wild datasets for pose-invariant face recognition and achieves comparable or better performance than state of the art on LFW, CFP, and IJB-A datasets.
Automatic Recognition of Breathing Route During Sleep Using Snoring Sounds
NASA Astrophysics Data System (ADS)
Mikami, Tsuyoshi; Kojima, Yohichiro
This letter classifies snoring sounds into three breathing routes (oral, nasal, and oronasal) with discriminant analysis of the power spectra and k-nearest neighbor method. It is necessary to recognize breathing route during snoring, because oral snoring is a typical symptom of sleep apnea but we cannot know our own breathing and snoring condition during sleep. As a result, about 98.8% classification rate is obtained by using leave-one-out test for performance evaluation.
Howell, Peter; Sackin, Stevie; Glenn, Kazan
2007-01-01
This program of work is intended to develop automatic recognition procedures to locate and assess stuttered dysfluencies. This and the following article together, develop and test recognizers for repetitions and prolongations. The automatic recognizers classify the speech in two stages: In the first, the speech is segmented and in the second the segments are categorized. The units that are segmented are words. Here assessments by human judges on the speech of 12 children who stutter are described using a corresponding procedure. The accuracy of word boundary placement across judges, categorization of the words as fluent, repetition or prolongation, and duration of the different fluency categories are reported. These measures allow reliable instances of repetitions and prolongations to be selected for training and assessing the recognizers in the subsequent paper. PMID:9328878
Fine grained recognition of masonry walls for built heritage assessment
NASA Astrophysics Data System (ADS)
Oses, N.; Dornaika, F.; Moujahid, A.
2015-01-01
This paper presents the ground work carried out to achieve automatic fine grained recognition of stone masonry. This is a necessary first step in the development of the analysis tool. The built heritage that will be assessed consists of stone masonry constructions and many of the features analysed can be characterized according to the geometry and arrangement of the stones. Much of the assessment is carried out through visual inspection. Thus, we apply image processing on digital images of the elements under inspection. The main contribution of the paper is the performance evaluation of the automatic categorization of masonry walls from a set of extracted straight line segments. The element chosen to perform this evaluation is the stone arrangement of masonry walls. The validity of the proposed framework is assessed on real images of masonry walls using machine learning paradigms. These include classifiers as well as automatic feature selection.
Corona-Strauss, Farah I; Delb, Wolfgang; Schick, Bernhard; Strauss, Daniel J
2010-01-01
Auditory Brainstem Responses (ABRs) are used as objective method for diagnostics and quantification of hearing loss. Many methods for automatic recognition of ABRs have been developed, but none of them include the individual measurement setup in the analysis. The purpose of this work was to design a fast recognition scheme for chirp-evoked ABRs that is adjusted to the individual measurement condition using spontaneous electroencephalographic activity (SA). For the classification, the kernel-based novelty detection scheme used features based on the inter-sweep instantaneous phase synchronization as well as energy and entropy relations in the time-frequency domain. This method provided SA discrimination from stimulations above the hearing threshold with a minimum number of sweeps, i.e., 200 individual responses. It is concluded that the proposed paradigm, processing procedures and stimulation techniques improve the detection of ABRs in terms of the degree of objectivity, i.e., automation of procedure, and measurement time.
Research of Daily Conversation Transmitting System Based on Mouth Part Pattern Recognition
NASA Astrophysics Data System (ADS)
Watanabe, Mutsumi; Nishi, Natsuko
The authors are developing a vision-based intension transfer technique by recognizing user’s face expressions and movements, to help free and convenient communications with aged or disabled persons who find difficulties in talking, discriminating small character prints and operating keyboards by hands and fingers. In this paper we report a prototype system, where layered daily conversations are successively selected by recognizing the transition in shape of user’s mouth parts using camera image sequences settled in front of the user. Four mouth part patterns are used in the system. A method that automatically recognizes these patterns by analyzing the intensity histogram data around the mouth region is newly developed. The confirmation of a selection on the way is executed by detecting the open and shut movements of mouth through the temporal change in intensity histogram data. The method has been installed in a desktop PC by VC++ programs. Experimental results of mouth shape pattern recognition by twenty-five persons have shown the effectiveness of the method.
Boundary methods for mode estimation
NASA Astrophysics Data System (ADS)
Pierson, William E., Jr.; Ulug, Batuhan; Ahalt, Stanley C.
1999-08-01
This paper investigates the use of Boundary Methods (BMs), a collection of tools used for distribution analysis, as a method for estimating the number of modes associated with a given data set. Model order information of this type is required by several pattern recognition applications. The BM technique provides a novel approach to this parameter estimation problem and is comparable in terms of both accuracy and computations to other popular mode estimation techniques currently found in the literature and automatic target recognition applications. This paper explains the methodology used in the BM approach to mode estimation. Also, this paper quickly reviews other common mode estimation techniques and describes the empirical investigation used to explore the relationship of the BM technique to other mode estimation techniques. Specifically, the accuracy and computational efficiency of the BM technique are compared quantitatively to the a mixture of Gaussian (MOG) approach and a k-means approach to model order estimation. The stopping criteria of the MOG and k-means techniques is the Akaike Information Criteria (AIC).
Sun, Weifang; Yao, Bin; Zeng, Nianyin; He, Yuchao; Cao, Xincheng; He, Wangpeng
2017-01-01
As a typical example of large and complex mechanical systems, rotating machinery is prone to diversified sorts of mechanical faults. Among these faults, one of the prominent causes of malfunction is generated in gear transmission chains. Although they can be collected via vibration signals, the fault signatures are always submerged in overwhelming interfering contents. Therefore, identifying the critical fault’s characteristic signal is far from an easy task. In order to improve the recognition accuracy of a fault’s characteristic signal, a novel intelligent fault diagnosis method is presented. In this method, a dual-tree complex wavelet transform (DTCWT) is employed to acquire the multiscale signal’s features. In addition, a convolutional neural network (CNN) approach is utilized to automatically recognise a fault feature from the multiscale signal features. The experiment results of the recognition for gear faults show the feasibility and effectiveness of the proposed method, especially in the gear’s weak fault features. PMID:28773148
Dance recognition system using lower body movement.
Simpson, Travis T; Wiesner, Susan L; Bennett, Bradford C
2014-02-01
The current means of locating specific movements in film necessitate hours of viewing, making the task of conducting research into movement characteristics and patterns tedious and difficult. This is particularly problematic for the research and analysis of complex movement systems such as sports and dance. While some systems have been developed to manually annotate film, to date no automated way of identifying complex, full body movement exists. With pattern recognition technology and knowledge of joint locations, automatically describing filmed movement using computer software is possible. This study used various forms of lower body kinematic analysis to identify codified dance movements. We created an algorithm that compares an unknown move with a specified start and stop against known dance moves. Our recognition method consists of classification and template correlation using a database of model moves. This system was optimized to include nearly 90 dance and Tai Chi Chuan movements, producing accurate name identification in over 97% of trials. In addition, the program had the capability to provide a kinematic description of either matched or unmatched moves obtained from classification recognition.
Novel dynamic Bayesian networks for facial action element recognition and understanding
NASA Astrophysics Data System (ADS)
Zhao, Wei; Park, Jeong-Seon; Choi, Dong-You; Lee, Sang-Woong
2011-12-01
In daily life, language is an important tool of communication between people. Besides language, facial action can also provide a great amount of information. Therefore, facial action recognition has become a popular research topic in the field of human-computer interaction (HCI). However, facial action recognition is quite a challenging task due to its complexity. In a literal sense, there are thousands of facial muscular movements, many of which have very subtle differences. Moreover, muscular movements always occur simultaneously when the pose is changed. To address this problem, we first build a fully automatic facial points detection system based on a local Gabor filter bank and principal component analysis. Then, novel dynamic Bayesian networks are proposed to perform facial action recognition using the junction tree algorithm over a limited number of feature points. In order to evaluate the proposed method, we have used the Korean face database for model training. For testing, we used the CUbiC FacePix, facial expressions and emotion database, Japanese female facial expression database, and our own database. Our experimental results clearly demonstrate the feasibility of the proposed approach.
NASA Astrophysics Data System (ADS)
He, Di; Lim, Boon Pang; Yang, Xuesong; Hasegawa-Johnson, Mark; Chen, Deming
2018-06-01
Most mainstream Automatic Speech Recognition (ASR) systems consider all feature frames equally important. However, acoustic landmark theory is based on a contradictory idea, that some frames are more important than others. Acoustic landmark theory exploits quantal non-linearities in the articulatory-acoustic and acoustic-perceptual relations to define landmark times at which the speech spectrum abruptly changes or reaches an extremum; frames overlapping landmarks have been demonstrated to be sufficient for speech perception. In this work, we conduct experiments on the TIMIT corpus, with both GMM and DNN based ASR systems and find that frames containing landmarks are more informative for ASR than others. We find that altering the level of emphasis on landmarks by re-weighting acoustic likelihood tends to reduce the phone error rate (PER). Furthermore, by leveraging the landmark as a heuristic, one of our hybrid DNN frame dropping strategies maintained a PER within 0.44% of optimal when scoring less than half (45.8% to be precise) of the frames. This hybrid strategy out-performs other non-heuristic-based methods and demonstrate the potential of landmarks for reducing computation.
Automatic speech recognition in air-ground data link
NASA Technical Reports Server (NTRS)
Armstrong, Herbert B.
1989-01-01
In the present air traffic system, information presented to the transport aircraft cockpit crew may originate from a variety of sources and may be presented to the crew in visual or aural form, either through cockpit instrument displays or, most often, through voice communication. Voice radio communications are the most error prone method for air-ground data link. Voice messages can be misstated or misunderstood and radio frequency congestion can delay or obscure important messages. To prevent proliferation, a multiplexed data link display can be designed to present information from multiple data link sources on a shared cockpit display unit (CDU) or multi-function display (MFD) or some future combination of flight management and data link information. An aural data link which incorporates an automatic speech recognition (ASR) system for crew response offers several advantages over visual displays. The possibility of applying ASR to the air-ground data link was investigated. The first step was to review current efforts in ASR applications in the cockpit and in air traffic control and evaluated their possible data line application. Next, a series of preliminary research questions is to be developed for possible future collaboration.
Wójcicki, Tomasz; Nowicki, Michał
2016-01-01
The article presents a selected area of research and development concerning the methods of material analysis based on the automatic image recognition of the investigated metallographic sections. The objectives of the analyses of the materials for gas nitriding technology are described. The methods of the preparation of nitrided layers, the steps of the process and the construction and operation of devices for gas nitriding are given. We discuss the possibility of using the methods of digital images processing in the analysis of the materials, as well as their essential task groups: improving the quality of the images, segmentation, morphological transformations and image recognition. The developed analysis model of the nitrided layers formation, covering image processing and analysis techniques, as well as selected methods of artificial intelligence are presented. The model is divided into stages, which are formalized in order to better reproduce their actions. The validation of the presented method is performed. The advantages and limitations of the developed solution, as well as the possibilities of its practical use, are listed. PMID:28773389
Improved Techniques for Automatic Chord Recognition from Music Audio Signals
ERIC Educational Resources Information Center
Cho, Taemin
2014-01-01
This thesis is concerned with the development of techniques that facilitate the effective implementation of capable automatic chord transcription from music audio signals. Since chord transcriptions can capture many important aspects of music, they are useful for a wide variety of music applications and also useful for people who learn and perform…
38 CFR 51.31 - Automatic recognition.
Code of Federal Regulations, 2012 CFR
2012-07-01
...) PER DIEM FOR NURSING HOME CARE OF VETERANS IN STATE HOMES Obtaining Per Diem for Nursing Home Care in... that already is recognized by VA as a State home for nursing home care at the time this part becomes effective, automatically will continue to be recognized as a State home for nursing home care but will be...
38 CFR 51.31 - Automatic recognition.
Code of Federal Regulations, 2011 CFR
2011-07-01
...) PER DIEM FOR NURSING HOME CARE OF VETERANS IN STATE HOMES Obtaining Per Diem for Nursing Home Care in... that already is recognized by VA as a State home for nursing home care at the time this part becomes effective, automatically will continue to be recognized as a State home for nursing home care but will be...
38 CFR 51.31 - Automatic recognition.
Code of Federal Regulations, 2013 CFR
2013-07-01
...) PER DIEM FOR NURSING HOME CARE OF VETERANS IN STATE HOMES Obtaining Per Diem for Nursing Home Care in... that already is recognized by VA as a State home for nursing home care at the time this part becomes effective, automatically will continue to be recognized as a State home for nursing home care but will be...
38 CFR 51.31 - Automatic recognition.
Code of Federal Regulations, 2014 CFR
2014-07-01
...) PER DIEM FOR NURSING HOME CARE OF VETERANS IN STATE HOMES Obtaining Per Diem for Nursing Home Care in... that already is recognized by VA as a State home for nursing home care at the time this part becomes effective, automatically will continue to be recognized as a State home for nursing home care but will be...
38 CFR 51.31 - Automatic recognition.
Code of Federal Regulations, 2010 CFR
2010-07-01
...) PER DIEM FOR NURSING HOME CARE OF VETERANS IN STATE HOMES Obtaining Per Diem for Nursing Home Care in... that already is recognized by VA as a State home for nursing home care at the time this part becomes effective, automatically will continue to be recognized as a State home for nursing home care but will be...
Investigating Prompt Difficulty in an Automatically Scored Speaking Performance Assessment
ERIC Educational Resources Information Center
Cox, Troy L.
2013-01-01
Speaking assessments for second language learners have traditionally been expensive to administer because of the cost of rating the speech samples. To reduce the cost, many researchers are investigating the potential of using automatic speech recognition (ASR) as a means to score examinee responses to open-ended prompts. This study examined the…
Computer-Aided Authoring System (AUTHOR) User's Guide. Volume I. Final Report.
ERIC Educational Resources Information Center
Guitard, Charles R.
This user's guide for AUTHOR, an automatic authoring system which produces programmed texts for teaching symbol recognition, provides detailed instructions to help the user construct and enter the information needed to create the programmed text, run the AUTHOR program, and edit the automatically composed paper. Major sections describe steps in…
Psychopaths lack the automatic avoidance of social threat: relation to instrumental aggression.
Louise von Borries, Anna Katinka; Volman, Inge; de Bruijn, Ellen Rosalia Aloïs; Bulten, Berend Hendrik; Verkes, Robbert Jan; Roelofs, Karin
2012-12-30
Psychopathy (PP) is associated with marked abnormalities in social emotional behaviour, such as high instrumental aggression (IA). A crucial but largely ignored question is whether automatic social approach-avoidance tendencies may underlie this condition. We tested whether offenders with PP show lack of automatic avoidance tendencies, usually activated when (healthy) individuals are confronted with social threat stimuli (angry faces). We applied a computerized approach-avoidance task (AAT), where participants pushed or pulled pictures of emotional faces using a joystick, upon which the faces decreased or increased in size, respectively. Furthermore, participants completed an emotion recognition task which was used to control for differences in recognition of facial emotions. In contrast to healthy controls (HC), PP patients showed total absence of avoidance tendencies towards angry faces. Interestingly, those responses were related to levels of instrumental aggression and the (in)ability to experience personal distress (PD). These findings suggest that social performance in psychopaths is disturbed on a basic level of automatic action tendencies. The lack of implicit threat avoidance tendencies may underlie their aggressive behaviour. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Automatic recognition of coronal type II radio bursts: The ARBIS 2 method and first observations
NASA Astrophysics Data System (ADS)
Lobzin, Vasili; Cairns, Iver; Robinson, Peter; Steward, Graham; Patterson, Garth
Major space weather events such as solar flares and coronal mass ejections are usually accompa-nied by solar radio bursts, which can potentially be used for real-time space weather forecasts. Type II radio bursts are produced near the local plasma frequency and its harmonic by fast electrons accelerated by a shock wave moving through the corona and solar wind with a typi-cal speed of 1000 km s-1 . The coronal bursts have dynamic spectra with frequency gradually falling with time and durations of several minutes. We present a new method developed to de-tect type II coronal radio bursts automatically and describe its implementation in an extended Automated Radio Burst Identification System (ARBIS 2). Preliminary tests of the method with spectra obtained in 2002 show that the performance of the current implementation is quite high, ˜ 80%, while the probability of false positives is reasonably low, with one false positive per 100-200 hr for high solar activity and less than one false event per 10000 hr for low solar activity periods. The first automatically detected coronal type II radio bursts are also presented. ARBIS 2 is now operational with IPS Radio and Space Services, providing email alerts and event lists internationally.
Higgins, Eleanor L; Raskind, Marshall H
2004-12-01
This study was conducted to assess the effectiveness of two programs developed by the Frostig Center Research Department to improve the reading and spelling of students with learning disabilities (LD): a computer Speech Recognition-based Program (SRBP) and a computer and text-based Automaticity Program (AP). Twenty-eight LD students with reading and spelling difficulties (aged 8 to 18) received each program for 17 weeks and were compared with 16 students in a contrast group who did not receive either program. After adjusting for age and IQ, both the SRBP and AP groups showed significant differences over the contrast group in improving word recognition and reading comprehension. Neither program showed significant differences over contrasts in spelling. The SRBP also improved the performance of the target group when compared with the contrast group on phonological elision and nonword reading efficiency tasks. The AP showed significant differences in all process and reading efficiency measures.
The Automaticity of Emotional Face-Context Integration
Aviezer, Hillel; Dudarev, Veronica; Bentin, Shlomo; Hassin, Ran R.
2011-01-01
Recent studies have demonstrated that context can dramatically influence the recognition of basic facial expressions, yet the nature of this phenomenon is largely unknown. In the present paper we begin to characterize the underlying process of face-context integration. Specifically, we examine whether it is a relatively controlled or automatic process. In Experiment 1 participants were motivated and instructed to avoid using the context while categorizing contextualized facial expression, or they were led to believe that the context was irrelevant. Nevertheless, they were unable to disregard the context, which exerted a strong effect on their emotion recognition. In Experiment 2, participants categorized contextualized facial expressions while engaged in a concurrent working memory task. Despite the load, the context exerted a strong influence on their recognition of facial expressions. These results suggest that facial expressions and their body contexts are integrated in an unintentional, uncontrollable, and relatively effortless manner. PMID:21707150
NASA Astrophysics Data System (ADS)
Chen, Dan; Guo, Lin-yuan; Wang, Chen-hao; Ke, Xi-zheng
2017-07-01
Equalization can compensate channel distortion caused by channel multipath effects, and effectively improve convergent of modulation constellation diagram in optical wireless system. In this paper, the subspace blind equalization algorithm is used to preprocess M-ary phase shift keying (MPSK) subcarrier modulation signal in receiver. Mountain clustering is adopted to get the clustering centers of MPSK modulation constellation diagram, and the modulation order is automatically identified through the k-nearest neighbor (KNN) classifier. The experiment has been done under four different weather conditions. Experimental results show that the convergent of constellation diagram is improved effectively after using the subspace blind equalization algorithm, which means that the accuracy of modulation recognition is increased. The correct recognition rate of 16PSK can be up to 85% in any kind of weather condition which is mentioned in paper. Meanwhile, the correct recognition rate is the highest in cloudy and the lowest in heavy rain condition.
Pärkkä, Juha; Cluitmans, Luc; Ermes, Miikka
2010-09-01
Inactive and sedentary lifestyle is a major problem in many industrialized countries today. Automatic recognition of type of physical activity can be used to show the user the distribution of his daily activities and to motivate him into more active lifestyle. In this study, an automatic activity-recognition system consisting of wireless motion bands and a PDA is evaluated. The system classifies raw sensor data into activity types online. It uses a decision tree classifier, which has low computational cost and low battery consumption. The classifier parameters can be personalized online by performing a short bout of an activity and by telling the system which activity is being performed. Data were collected with seven volunteers during five everyday activities: lying, sitting/standing, walking, running, and cycling. The online system can detect these activities with overall 86.6% accuracy and with 94.0% accuracy after classifier personalization.
Modelling Errors in Automatic Speech Recognition for Dysarthric Speakers
NASA Astrophysics Data System (ADS)
Caballero Morales, Santiago Omar; Cox, Stephen J.
2009-12-01
Dysarthria is a motor speech disorder characterized by weakness, paralysis, or poor coordination of the muscles responsible for speech. Although automatic speech recognition (ASR) systems have been developed for disordered speech, factors such as low intelligibility and limited phonemic repertoire decrease speech recognition accuracy, making conventional speaker adaptation algorithms perform poorly on dysarthric speakers. In this work, rather than adapting the acoustic models, we model the errors made by the speaker and attempt to correct them. For this task, two techniques have been developed: (1) a set of "metamodels" that incorporate a model of the speaker's phonetic confusion matrix into the ASR process; (2) a cascade of weighted finite-state transducers at the confusion matrix, word, and language levels. Both techniques attempt to correct the errors made at the phonetic level and make use of a language model to find the best estimate of the correct word sequence. Our experiments show that both techniques outperform standard adaptation techniques.
Performance of a Working Face Recognition Machine using Cortical Thought Theory
1984-12-04
been considered (2). Recommendations from Bledsoe’s study included research on facial - recognition systems that are "completely automatic (remove the...C. L. Location of some facial features . computer, Palo Alto: Panoramic Research, Aug 1966. 2. Bledsoe, W. W. Man-machine facial recognition : Is...34 image?" It would seem - that the location and size of the features left in this contrast-expanded image contain the essential information of facial
[Study on the automatic parameters identification of water pipe network model].
Jia, Hai-Feng; Zhao, Qi-Feng
2010-01-01
Based on the problems analysis on development and application of water pipe network model, the model parameters automatic identification is regarded as a kernel bottleneck of model's application in water supply enterprise. The methodology of water pipe network model parameters automatic identification based on GIS and SCADA database is proposed. Then the kernel algorithm of model parameters automatic identification is studied, RSA (Regionalized Sensitivity Analysis) is used for automatic recognition of sensitive parameters, and MCS (Monte-Carlo Sampling) is used for automatic identification of parameters, the detail technical route based on RSA and MCS is presented. The module of water pipe network model parameters automatic identification is developed. At last, selected a typical water pipe network as a case, the case study on water pipe network model parameters automatic identification is conducted and the satisfied results are achieved.
Road Network Extraction from Dsm by Mathematical Morphology and Reasoning
NASA Astrophysics Data System (ADS)
Li, Yan; Wu, Jianliang; Zhu, Lin; Tachibana, Kikuo
2016-06-01
The objective of this research is the automatic extraction of the road network in a scene of the urban area from a high resolution digital surface model (DSM). Automatic road extraction and modeling from remote sensed data has been studied for more than one decade. The methods vary greatly due to the differences of data types, regions, resolutions et al. An advanced automatic road network extraction scheme is proposed to address the issues of tedium steps on segmentation, recognition and grouping. It is on the basis of a geometric road model which describes a multiple-level structure. The 0-dimension element is intersection. The 1-dimension elements are central line and side. The 2-dimension element is plane, which is generated from the 1-dimension elements. The key feature of the presented approach is the cross validation for the three road elements which goes through the entire procedure of their extraction. The advantage of our model and method is that linear elements of the road can be derived directly, without any complex, non-robust connection hypothesis. An example of Japanese scene is presented to display the procedure and the performance of the approach.
Cascaded deep decision networks for classification of endoscopic images
NASA Astrophysics Data System (ADS)
Murthy, Venkatesh N.; Singh, Vivek; Sun, Shanhui; Bhattacharya, Subhabrata; Chen, Terrence; Comaniciu, Dorin
2017-02-01
Both traditional and wireless capsule endoscopes can generate tens of thousands of images for each patient. It is desirable to have the majority of irrelevant images filtered out by automatic algorithms during an offline review process or to have automatic indication for highly suspicious areas during an online guidance. This also applies to the newly invented endomicroscopy, where online indication of tumor classification plays a significant role. Image classification is a standard pattern recognition problem and is well studied in the literature. However, performance on the challenging endoscopic images still has room for improvement. In this paper, we present a novel Cascaded Deep Decision Network (CDDN) to improve image classification performance over standard Deep neural network based methods. During the learning phase, CDDN automatically builds a network which discards samples that are classified with high confidence scores by a previously trained network and concentrates only on the challenging samples which would be handled by the subsequent expert shallow networks. We validate CDDN using two different types of endoscopic imaging, which includes a polyp classification dataset and a tumor classification dataset. From both datasets we show that CDDN can outperform other methods by about 10%. In addition, CDDN can also be applied to other image classification problems.
Transcribe Your Class: Using Speech Recognition to Improve Access for At-Risk Students
ERIC Educational Resources Information Center
Bain, Keith; Lund-Lucas, Eunice; Stevens, Janice
2012-01-01
Through a project supported by Canada's Social Development Partnerships Program, a team of leading National Disability Organizations, universities, and industry partners are piloting a prototype Hosted Transcription Service that uses speech recognition to automatically create multimedia transcripts that can be used by students for study purposes.…
ERIC Educational Resources Information Center
Cordier, Deborah
2009-01-01
A renewed focus on foreign language (FL) learning and speech for communication has resulted in computer-assisted language learning (CALL) software developed with Automatic Speech Recognition (ASR). ASR features for FL pronunciation (Lafford, 2004) are functional components of CALL designs used for FL teaching and learning. The ASR features…
ERIC Educational Resources Information Center
Spironelli, Chiara; Penolazzi, Barbara; Vio, Claudio; Angrilli, Alessandro
2010-01-01
Brain plasticity was investigated in 14 Italian children affected by developmental dyslexia after 6 months of phonological training. The means used to measure language reorganization was the recognition potential, an early wave, also called N150, elicited by automatic word recognition. This component peaks over the left temporo-occipital cortex…
Higher-order neural network software for distortion invariant object recognition
NASA Technical Reports Server (NTRS)
Reid, Max B.; Spirkovska, Lilly
1991-01-01
The state-of-the-art in pattern recognition for such applications as automatic target recognition and industrial robotic vision relies on digital image processing. We present a higher-order neural network model and software which performs the complete feature extraction-pattern classification paradigm required for automatic pattern recognition. Using a third-order neural network, we demonstrate complete, 100 percent accurate invariance to distortions of scale, position, and in-plate rotation. In a higher-order neural network, feature extraction is built into the network, and does not have to be learned. Only the relatively simple classification step must be learned. This is key to achieving very rapid training. The training set is much smaller than with standard neural network software because the higher-order network only has to be shown one view of each object to be learned, not every possible view. The software and graphical user interface run on any Sun workstation. Results of the use of the neural software in autonomous robotic vision systems are presented. Such a system could have extensive application in robotic manufacturing.
Image acquisition system for traffic monitoring applications
NASA Astrophysics Data System (ADS)
Auty, Glen; Corke, Peter I.; Dunn, Paul; Jensen, Murray; Macintyre, Ian B.; Mills, Dennis C.; Nguyen, Hao; Simons, Ben
1995-03-01
An imaging system for monitoring traffic on multilane highways is discussed. The system, named Safe-T-Cam, is capable of operating 24 hours per day in all but extreme weather conditions and can capture still images of vehicles traveling up to 160 km/hr. Systems operating at different remote locations are networked to allow transmission of images and data to a control center. A remote site facility comprises a vehicle detection and classification module (VCDM), an image acquisition module (IAM) and a license plate recognition module (LPRM). The remote site is connected to the central site by an ISDN communications network. The remote site system is discussed in this paper. The VCDM consists of a video camera, a specialized exposure control unit to maintain consistent image characteristics, and a 'real-time' image processing system that processes 50 images per second. The VCDM can detect and classify vehicles (e.g. cars from trucks). The vehicle class is used to determine what data should be recorded. The VCDM uses a vehicle tracking technique to allow optimum triggering of the high resolution camera of the IAM. The IAM camera combines the features necessary to operate consistently in the harsh environment encountered when imaging a vehicle 'head-on' in both day and night conditions. The image clarity obtained is ideally suited for automatic location and recognition of the vehicle license plate. This paper discusses the camera geometry, sensor characteristics and the image processing methods which permit consistent vehicle segmentation from a cluttered background allowing object oriented pattern recognition to be used for vehicle classification. The image capture of high resolution images and the image characteristics required for the LPRMs automatic reading of vehicle license plates, is also discussed. The results of field tests presented demonstrate that the vision based Safe-T-Cam system, currently installed on open highways, is capable of producing automatic classification of vehicle class and recording of vehicle numberplates with a success rate around 90 percent in a period of 24 hours.
Drechsler, Axel; Helling, Tobias; Steinfartz, Sebastian
2015-01-01
Capture–mark–recapture (CMR) approaches are the backbone of many studies in population ecology to gain insight on the life cycle, migration, habitat use, and demography of target species. The reliable and repeatable recognition of an individual throughout its lifetime is the basic requirement of a CMR study. Although invasive techniques are available to mark individuals permanently, noninvasive methods for individual recognition mainly rest on photographic identification of external body markings, which are unique at the individual level. The re-identification of an individual based on comparing shape patterns of photographs by eye is commonly used. Automated processes for photographic re-identification have been recently established, but their performance in large datasets (i.e., > 1000 individuals) has rarely been tested thoroughly. Here, we evaluated the performance of the program AMPHIDENT, an automatic algorithm to identify individuals on the basis of ventral spot patterns in the great crested newt (Triturus cristatus) versus the genotypic fingerprint of individuals based on highly polymorphic microsatellite loci using GENECAP. Between 2008 and 2010, we captured, sampled and photographed adult newts and calculated for 1648 samples/photographs recapture rates for both approaches. Recapture rates differed slightly with 8.34% for GENECAP and 9.83% for AMPHIDENT. With an estimated rate of 2% false rejections (FRR) and 0.00% false acceptances (FAR), AMPHIDENT proved to be a highly reliable algorithm for CMR studies of large datasets. We conclude that the application of automatic recognition software of individual photographs can be a rather powerful and reliable tool in noninvasive CMR studies for a large number of individuals. Because the cross-correlation of standardized shape patterns is generally applicable to any pattern that provides enough information, this algorithm is capable of becoming a single application with broad use in CMR studies for many species. PMID:25628871
Automatic segmentation and supervised learning-based selection of nuclei in cancer tissue images.
Nandy, Kaustav; Gudla, Prabhakar R; Amundsen, Ryan; Meaburn, Karen J; Misteli, Tom; Lockett, Stephen J
2012-09-01
Analysis of preferential localization of certain genes within the cell nuclei is emerging as a new technique for the diagnosis of breast cancer. Quantitation requires accurate segmentation of 100-200 cell nuclei in each tissue section to draw a statistically significant result. Thus, for large-scale analysis, manual processing is too time consuming and subjective. Fortuitously, acquired images generally contain many more nuclei than are needed for analysis. Therefore, we developed an integrated workflow that selects, following automatic segmentation, a subpopulation of accurately delineated nuclei for positioning of fluorescence in situ hybridization-labeled genes of interest. Segmentation was performed by a multistage watershed-based algorithm and screening by an artificial neural network-based pattern recognition engine. The performance of the workflow was quantified in terms of the fraction of automatically selected nuclei that were visually confirmed as well segmented and by the boundary accuracy of the well-segmented nuclei relative to a 2D dynamic programming-based reference segmentation method. Application of the method was demonstrated for discriminating normal and cancerous breast tissue sections based on the differential positioning of the HES5 gene. Automatic results agreed with manual analysis in 11 out of 14 cancers, all four normal cases, and all five noncancerous breast disease cases, thus showing the accuracy and robustness of the proposed approach. Published 2012 Wiley Periodicals, Inc.
Study on recognition algorithm for paper currency numbers based on neural network
NASA Astrophysics Data System (ADS)
Li, Xiuyan; Liu, Tiegen; Li, Yuanyao; Zhang, Zhongchuan; Deng, Shichao
2008-12-01
Based on the unique characteristic, the paper currency numbers can be put into record and the automatic identification equipment for paper currency numbers is supplied to currency circulation market in order to provide convenience for financial sectors to trace the fiduciary circulation socially and provide effective supervision on paper currency. Simultaneously it is favorable for identifying forged notes, blacklisting the forged notes numbers and solving the major social problems, such as armor cash carrier robbery, money laundering. For the purpose of recognizing the paper currency numbers, a recognition algorithm based on neural network is presented in the paper. Number lines in original paper currency images can be draw out through image processing, such as image de-noising, skew correction, segmentation, and image normalization. According to the different characteristics between digits and letters in serial number, two kinds of classifiers are designed. With the characteristics of associative memory, optimization-compute and rapid convergence, the Discrete Hopfield Neural Network (DHNN) is utilized to recognize the letters; with the characteristics of simple structure, quick learning and global optimum, the Radial-Basis Function Neural Network (RBFNN) is adopted to identify the digits. Then the final recognition results are obtained by combining the two kinds of recognition results in regular sequence. Through the simulation tests, it is confirmed by simulation results that the recognition algorithm of combination of two kinds of recognition methods has such advantages as high recognition rate and faster recognition simultaneously, which is worthy of broad application prospect.
Self-organized Evaluation of Dynamic Hand Gestures for Sign Language Recognition
NASA Astrophysics Data System (ADS)
Buciu, Ioan; Pitas, Ioannis
Two main theories exist with respect to face encoding and representation in the human visual system (HVS). The first one refers to the dense (holistic) representation of the face, where faces have "holon"-like appearance. The second one claims that a more appropriate face representation is given by a sparse code, where only a small fraction of the neural cells corresponding to face encoding is activated. Theoretical and experimental evidence suggest that the HVS performs face analysis (encoding, storing, face recognition, facial expression recognition) in a structured and hierarchical way, where both representations have their own contribution and goal. According to neuropsychological experiments, it seems that encoding for face recognition, relies on holistic image representation, while a sparse image representation is used for facial expression analysis and classification. From the computer vision perspective, the techniques developed for automatic face and facial expression recognition fall into the same two representation types. Like in Neuroscience, the techniques which perform better for face recognition yield a holistic image representation, while those techniques suitable for facial expression recognition use a sparse or local image representation. The proposed mathematical models of image formation and encoding try to simulate the efficient storing, organization and coding of data in the human cortex. This is equivalent with embedding constraints in the model design regarding dimensionality reduction, redundant information minimization, mutual information minimization, non-negativity constraints, class information, etc. The presented techniques are applied as a feature extraction step followed by a classification method, which also heavily influences the recognition results.
Automatic micropropagation of plants--the vision-system: graph rewriting as pattern recognition
NASA Astrophysics Data System (ADS)
Schwanke, Joerg; Megnet, Roland; Jensch, Peter F.
1993-03-01
The automation of plant-micropropagation is necessary to produce high amounts of biomass. Plants have to be dissected on particular cutting-points. A vision-system is needed for the recognition of the cutting-points on the plants. With this background, this contribution is directed to the underlying formalism to determine cutting-points on abstract-plant models. We show the usefulness of pattern recognition by graph-rewriting along with some examples in this context.
Deep learning architecture for recognition of abnormal activities
NASA Astrophysics Data System (ADS)
Khatrouch, Marwa; Gnouma, Mariem; Ejbali, Ridha; Zaied, Mourad
2018-04-01
The video surveillance is one of the key areas in computer vision researches. The scientific challenge in this field involves the implementation of automatic systems to obtain detailed information about individuals and groups behaviors. In particular, the detection of abnormal movements of groups or individuals requires a fine analysis of frames in the video stream. In this article, we propose a new method to detect anomalies in crowded scenes. We try to categorize the video in a supervised mode accompanied by unsupervised learning using the principle of the autoencoder. In order to construct an informative concept for the recognition of these behaviors, we use a technique of representation based on the superposition of human silhouettes. The evaluation of the UMN dataset demonstrates the effectiveness of the proposed approach.
Identifying images of handwritten digits using deep learning in H2O
NASA Astrophysics Data System (ADS)
Sadhasivam, Jayakumar; Charanya, R.; Kumar, S. Harish; Srinivasan, A.
2017-11-01
Automatic digit recognition is of popular interest today. Deep learning techniques make it possible for object recognition in image data. Perceiving the digit has turned into a fundamental part as far as certifiable applications. Since, digits are composed in various styles in this way to distinguish the digit it is important to perceive and arrange it with the assistance of machine learning methods. This exploration depends on supervised learning vector quantization neural system arranged under counterfeit artificial neural network. The pictures of digits are perceived, prepared and tried. After the system is made digits are prepared utilizing preparing dataset vectors and testing is connected to the pictures of digits which are separated to each other by fragmenting the picture and resizing the digit picture as needs be for better precision.
Error Rates in Users of Automatic Face Recognition Software
White, David; Dunn, James D.; Schmid, Alexandra C.; Kemp, Richard I.
2015-01-01
In recent years, wide deployment of automatic face recognition systems has been accompanied by substantial gains in algorithm performance. However, benchmarking tests designed to evaluate these systems do not account for the errors of human operators, who are often an integral part of face recognition solutions in forensic and security settings. This causes a mismatch between evaluation tests and operational accuracy. We address this by measuring user performance in a face recognition system used to screen passport applications for identity fraud. Experiment 1 measured target detection accuracy in algorithm-generated ‘candidate lists’ selected from a large database of passport images. Accuracy was notably poorer than in previous studies of unfamiliar face matching: participants made over 50% errors for adult target faces, and over 60% when matching images of children. Experiment 2 then compared performance of student participants to trained passport officers–who use the system in their daily work–and found equivalent performance in these groups. Encouragingly, a group of highly trained and experienced “facial examiners” outperformed these groups by 20 percentage points. We conclude that human performance curtails accuracy of face recognition systems–potentially reducing benchmark estimates by 50% in operational settings. Mere practise does not attenuate these limits, but superior performance of trained examiners suggests that recruitment and selection of human operators, in combination with effective training and mentorship, can improve the operational accuracy of face recognition systems. PMID:26465631
Position estimation and driving of an autonomous vehicle by monocular vision
NASA Astrophysics Data System (ADS)
Hanan, Jay C.; Kayathi, Pavan; Hughlett, Casey L.
2007-04-01
Automatic adaptive tracking in real-time for target recognition provided autonomous control of a scale model electric truck. The two-wheel drive truck was modified as an autonomous rover test-bed for vision based guidance and navigation. Methods were implemented to monitor tracking error and ensure a safe, accurate arrival at the intended science target. Some methods are situation independent relying only on the confidence error of the target recognition algorithm. Other methods take advantage of the scenario of combined motion and tracking to filter out anomalies. In either case, only a single calibrated camera was needed for position estimation. Results from real-time autonomous driving tests on the JPL simulated Mars yard are presented. Recognition error was often situation dependent. For the rover case, the background was in motion and may be characterized to provide visual cues on rover travel such as rate, pitch, roll, and distance to objects of interest or hazards. Objects in the scene may be used as landmarks, or waypoints, for such estimations. As objects are approached, their scale increases and their orientation may change. In addition, particularly on rough terrain, these orientation and scale changes may be unpredictable. Feature extraction combined with the neural network algorithm was successful in providing visual odometry in the simulated Mars environment.
Low-Rank Tensor Subspace Learning for RGB-D Action Recognition.
Jia, Chengcheng; Fu, Yun
2016-07-09
Since RGB-D action data inherently equip with extra depth information compared with RGB data, recently many works employ RGB-D data in a third-order tensor representation containing spatio-temporal structure to find a subspace for action recognition. However, there are two main challenges of these methods. First, the dimension of subspace is usually fixed manually. Second, preserving local information by finding intraclass and inter-class neighbors from a manifold is highly timeconsuming. In this paper, we learn a tensor subspace, whose dimension is learned automatically by low-rank learning, for RGB-D action recognition. Particularly, the tensor samples are factorized to obtain three Projection Matrices (PMs) by Tucker Decomposition, where all the PMs are performed by nuclear norm in a close-form to obtain the tensor ranks which are used as tensor subspace dimension. Additionally, we extract the discriminant and local information from a manifold using a graph constraint. This graph preserves the local knowledge inherently, which is faster than the previous way by calculating both the intra-class and inter-class neighbors of each sample. We evaluate the proposed method on four widely used RGB-D action datasets including MSRDailyActivity3D, MSRActionPairs, MSRActionPairs skeleton and UTKinect-Action3D datasets, and the experimental results show higher accuracy and efficiency of the proposed method.
Three-dimensional imaging of artificial fingerprint by optical coherence tomography
NASA Astrophysics Data System (ADS)
Larin, Kirill V.; Cheng, Yezeng
2008-03-01
Fingerprint recognition is one of the popular used methods of biometrics. However, due to the surface topography limitation, fingerprint recognition scanners are easily been spoofed, e.g. using artificial fingerprint dummies. Thus, biometric fingerprint identification devices need to be more accurate and secure to deal with different fraudulent methods including dummy fingerprints. Previously, we demonstrated that Optical Coherence Tomography (OCT) images revealed the presence of the artificial fingerprints (made from different household materials, such as cement and liquid silicone rubber) at all times, while the artificial fingerprints easily spoofed the commercial fingerprint reader. Also we demonstrated that an analysis of the autocorrelation of the OCT images could be used in automatic recognition systems. Here, we exploited the three-dimensional (3D) imaging of the artificial fingerprint by OCT to generate vivid 3D image for both the artificial fingerprint layer and the real fingerprint layer beneath. With the reconstructed 3D image, it could not only point out whether there exists an artificial material, which is intended to spoof the scanner, above the real finger, but also could provide the hacker's fingerprint. The results of these studies suggested that Optical Coherence Tomography could be a powerful real-time noninvasive method for accurate identification of artificial fingerprints real fingerprints as well.
EMG-based speech recognition using hidden markov models with global control variables.
Lee, Ki-Seung
2008-03-01
It is well known that a strong relationship exists between human voices and the movement of articulatory facial muscles. In this paper, we utilize this knowledge to implement an automatic speech recognition scheme which uses solely surface electromyogram (EMG) signals. The sequence of EMG signals for each word is modelled by a hidden Markov model (HMM) framework. The main objective of the work involves building a model for state observation density when multichannel observation sequences are given. The proposed model reflects the dependencies between each of the EMG signals, which are described by introducing a global control variable. We also develop an efficient model training method, based on a maximum likelihood criterion. In a preliminary study, 60 isolated words were used as recognition variables. EMG signals were acquired from three articulatory facial muscles. The findings indicate that such a system may have the capacity to recognize speech signals with an accuracy of up to 87.07%, which is superior to the independent probabilistic model.
Task-Dependent Masked Priming Effects in Visual Word Recognition
Kinoshita, Sachiko; Norris, Dennis
2012-01-01
A method used widely to study the first 250 ms of visual word recognition is masked priming: These studies have yielded a rich set of data concerning the processes involved in recognizing letters and words. In these studies, there is an implicit assumption that the early processes in word recognition tapped by masked priming are automatic, and masked priming effects should therefore be invariant across tasks. Contrary to this assumption, masked priming effects are modulated by the task goal: For example, only word targets show priming in the lexical decision task, but both words and non-words do in the same-different task; semantic priming effects are generally weak in the lexical decision task but are robust in the semantic categorization task. We explain how such task dependence arises within the Bayesian Reader account of masked priming (Norris and Kinoshita, 2008), and how the task dissociations can be used to understand the early processes in lexical access. PMID:22675316
Automatic gang graffiti recognition and interpretation
NASA Astrophysics Data System (ADS)
Parra, Albert; Boutin, Mireille; Delp, Edward J.
2017-09-01
One of the roles of emergency first responders (e.g., police and fire departments) is to prevent and protect against events that can jeopardize the safety and well-being of a community. In the case of criminal gang activity, tools are needed for finding, documenting, and taking the necessary actions to mitigate the problem or issue. We describe an integrated mobile-based system capable of using location-based services, combined with image analysis, to track and analyze gang activity through the acquisition, indexing, and recognition of gang graffiti images. This approach uses image analysis methods for color recognition, image segmentation, and image retrieval and classification. A database of gang graffiti images is described that includes not only the images but also metadata related to the images, such as date and time, geoposition, gang, gang member, colors, and symbols. The user can then query the data in a useful manner. We have implemented these features both as applications for Android and iOS hand-held devices and as a web-based interface.
Tone classification of syllable-segmented Thai speech based on multilayer perception
NASA Astrophysics Data System (ADS)
Satravaha, Nuttavudh; Klinkhachorn, Powsiri; Lass, Norman
2002-05-01
Thai is a monosyllabic tonal language that uses tone to convey lexical information about the meaning of a syllable. Thus to completely recognize a spoken Thai syllable, a speech recognition system not only has to recognize a base syllable but also must correctly identify a tone. Hence, tone classification of Thai speech is an essential part of a Thai speech recognition system. Thai has five distinctive tones (``mid,'' ``low,'' ``falling,'' ``high,'' and ``rising'') and each tone is represented by a single fundamental frequency (F0) pattern. However, several factors, including tonal coarticulation, stress, intonation, and speaker variability, affect the F0 pattern of a syllable in continuous Thai speech. In this study, an efficient method for tone classification of syllable-segmented Thai speech, which incorporates the effects of tonal coarticulation, stress, and intonation, as well as a method to perform automatic syllable segmentation, were developed. Acoustic parameters were used as the main discriminating parameters. The F0 contour of a segmented syllable was normalized by using a z-score transformation before being presented to a tone classifier. The proposed system was evaluated on 920 test utterances spoken by 8 speakers. A recognition rate of 91.36% was achieved by the proposed system.
Cho, Woon; Jang, Jinbeum; Koschan, Andreas; Abidi, Mongi A; Paik, Joonki
2016-11-28
A fundamental limitation of hyperspectral imaging is the inter-band misalignment correlated with subject motion during data acquisition. One way of resolving this problem is to assess the alignment quality of hyperspectral image cubes derived from the state-of-the-art alignment methods. In this paper, we present an automatic selection framework for the optimal alignment method to improve the performance of face recognition. Specifically, we develop two qualitative prediction models based on: 1) a principal curvature map for evaluating the similarity index between sequential target bands and a reference band in the hyperspectral image cube as a full-reference metric; and 2) the cumulative probability of target colors in the HSV color space for evaluating the alignment index of a single sRGB image rendered using all of the bands of the hyperspectral image cube as a no-reference metric. We verify the efficacy of the proposed metrics on a new large-scale database, demonstrating a higher prediction accuracy in determining improved alignment compared to two full-reference and five no-reference image quality metrics. We also validate the ability of the proposed framework to improve hyperspectral face recognition.
A voice-input voice-output communication aid for people with severe speech impairment.
Hawley, Mark S; Cunningham, Stuart P; Green, Phil D; Enderby, Pam; Palmer, Rebecca; Sehgal, Siddharth; O'Neill, Peter
2013-01-01
A new form of augmentative and alternative communication (AAC) device for people with severe speech impairment-the voice-input voice-output communication aid (VIVOCA)-is described. The VIVOCA recognizes the disordered speech of the user and builds messages, which are converted into synthetic speech. System development was carried out employing user-centered design and development methods, which identified and refined key requirements for the device. A novel methodology for building small vocabulary, speaker-dependent automatic speech recognizers with reduced amounts of training data, was applied. Experiments showed that this method is successful in generating good recognition performance (mean accuracy 96%) on highly disordered speech, even when recognition perplexity is increased. The selected message-building technique traded off various factors including speed of message construction and range of available message outputs. The VIVOCA was evaluated in a field trial by individuals with moderate to severe dysarthria and confirmed that they can make use of the device to produce intelligible speech output from disordered speech input. The trial highlighted some issues which limit the performance and usability of the device when applied in real usage situations, with mean recognition accuracy of 67% in these circumstances. These limitations will be addressed in future work.
Biological object recognition in μ-radiography images
NASA Astrophysics Data System (ADS)
Prochazka, A.; Dammer, J.; Weyda, F.; Sopko, V.; Benes, J.; Zeman, J.; Jandejsek, I.
2015-03-01
This study presents an applicability of real-time microradiography to biological objects, namely to horse chestnut leafminer, Cameraria ohridella (Insecta: Lepidoptera, Gracillariidae) and following image processing focusing on image segmentation and object recognition. The microradiography of insects (such as horse chestnut leafminer) provides a non-invasive imaging that leaves the organisms alive. The imaging requires a high spatial resolution (micrometer scale) radiographic system. Our radiographic system consists of a micro-focus X-ray tube and two types of detectors. The first is a charge integrating detector (Hamamatsu flat panel), the second is a pixel semiconductor detector (Medipix2 detector). The latter allows detection of single quantum photon of ionizing radiation. We obtained numerous horse chestnuts leafminer pupae in several microradiography images easy recognizable in automatic mode using the image processing methods. We implemented an algorithm that is able to count a number of dead and alive pupae in images. The algorithm was based on two methods: 1) noise reduction using mathematical morphology filters, 2) Canny edge detection. The accuracy of the algorithm is higher for the Medipix2 (average recall for detection of alive pupae =0.99, average recall for detection of dead pupae =0.83), than for the flat panel (average recall for detection of alive pupae =0.99, average recall for detection of dead pupae =0.77). Therefore, we conclude that Medipix2 has lower noise and better displays contours (edges) of biological objects. Our method allows automatic selection and calculation of dead and alive chestnut leafminer pupae. It leads to faster monitoring of the population of one of the world's important insect pest.
Recognition of plant parts with problem-specific algorithms
NASA Astrophysics Data System (ADS)
Schwanke, Joerg; Brendel, Thorsten; Jensch, Peter F.; Megnet, Roland
1994-06-01
Automatic micropropagation is necessary to produce cost-effective high amounts of biomass. Juvenile plants are dissected in clean- room environment on particular points on the stem or the leaves. A vision-system detects possible cutting points and controls a specialized robot. This contribution is directed to the pattern- recognition algorithms to detect structural parts of the plant.
User Experience of a Mobile Speaking Application with Automatic Speech Recognition for EFL Learning
ERIC Educational Resources Information Center
Ahn, Tae youn; Lee, Sangmin-Michelle
2016-01-01
With the spread of mobile devices, mobile phones have enormous potential regarding their pedagogical use in language education. The goal of this study is to analyse user experience of a mobile-based learning system that is enhanced by speech recognition technology for the improvement of EFL (English as a foreign language) learners' speaking…
2011-11-04
environmen- tal lighting conditions that one can actually come across. L7 and L8 are also cases of low illumination intensity. To produce our experimental...Graphics (Proceedings of ACM SIGGRAPH), 26(3). [9] Riklin- Raviv T., Shashua A., (1999). The quotient image: class based recognition and synthesis under
ERIC Educational Resources Information Center
Franco, Horacio; Bratt, Harry; Rossier, Romain; Rao Gadde, Venkata; Shriberg, Elizabeth; Abrash, Victor; Precoda, Kristin
2010-01-01
SRI International's EduSpeak[R] system is a software development toolkit that enables developers of interactive language education software to use state-of-the-art speech recognition and pronunciation scoring technology. Automatic pronunciation scoring allows the computer to provide feedback on the overall quality of pronunciation and to point to…
Thermographic techniques and adapted algorithms for automatic detection of foreign bodies in food
NASA Astrophysics Data System (ADS)
Meinlschmidt, Peter; Maergner, Volker
2003-04-01
At the moment foreign substances in food are detected mainly by using mechanical and optical methods as well as ultrasonic technique and than they are removed from the further process. These techniques detect a large portion of the foreign substances due to their different mass (mechanical sieving), their different colour (optical method) and their different surface density (ultrasonic detection). Despite the numerous different methods a considerable portion of the foreign substances remain undetected. In order to recognise materials still undetected, a complementary detection method would be desirable removing the foreign substances not registered by the a.m. methods from the production process. In a project with 13 partner from the food industry, the Fraunhofer - Institut für Holzforschung (WKI) and the Technische Unsiversität are trying to adapt thermography for the detection of foreign bodies in the food industry. After the initial tests turned out to be very promising for the differentiation of food stuffs and foreign substances, more and detailed investigation were carried out to develop suitable algorithms for automatic detection of foreign bodies. In order to achieve -besides the mere visual detection of foreign substances- also an automatic detection under production conditions, numerous experiences in image processing and pattern recognition are exploited. Results for the detection of foreign bodies will be presented at the conference showing the different advantages and disadvantages of using grey - level, statistical and morphological image processing techniques.
Optimal pattern synthesis for speech recognition based on principal component analysis
NASA Astrophysics Data System (ADS)
Korsun, O. N.; Poliyev, A. V.
2018-02-01
The algorithm for building an optimal pattern for the purpose of automatic speech recognition, which increases the probability of correct recognition, is developed and presented in this work. The optimal pattern forming is based on the decomposition of an initial pattern to principal components, which enables to reduce the dimension of multi-parameter optimization problem. At the next step the training samples are introduced and the optimal estimates for principal components decomposition coefficients are obtained by a numeric parameter optimization algorithm. Finally, we consider the experiment results that show the improvement in speech recognition introduced by the proposed optimization algorithm.
Hybrid generative-discriminative approach to age-invariant face recognition
NASA Astrophysics Data System (ADS)
Sajid, Muhammad; Shafique, Tamoor
2018-03-01
Age-invariant face recognition is still a challenging research problem due to the complex aging process involving types of facial tissues, skin, fat, muscles, and bones. Most of the related studies that have addressed the aging problem are focused on generative representation (aging simulation) or discriminative representation (feature-based approaches). Designing an appropriate hybrid approach taking into account both the generative and discriminative representations for age-invariant face recognition remains an open problem. We perform a hybrid matching to achieve robustness to aging variations. This approach automatically segments the eyes, nose-bridge, and mouth regions, which are relatively less sensitive to aging variations compared with the rest of the facial regions that are age-sensitive. The aging variations of age-sensitive facial parts are compensated using a demographic-aware generative model based on a bridged denoising autoencoder. The age-insensitive facial parts are represented by pixel average vector-based local binary patterns. Deep convolutional neural networks are used to extract relative features of age-sensitive and age-insensitive facial parts. Finally, the feature vectors of age-sensitive and age-insensitive facial parts are fused to achieve the recognition results. Extensive experimental results on morphological face database II (MORPH II), face and gesture recognition network (FG-NET), and Verification Subset of cross-age celebrity dataset (CACD-VS) demonstrate the effectiveness of the proposed method for age-invariant face recognition well.
Seyeddain, Orang; Kraker, Hannes; Redlberger, Andreas; Dexl, Alois K; Grabner, Günther; Emesz, Martin
2014-01-01
To investigate the reliability of a biometric iris recognition system for personal authentication after cataract surgery or iatrogenic pupil dilation. This was a prospective, nonrandomized, single-center, cohort study for evaluating the performance of an iris recognition system 2-24 hours after phacoemulsification and intraocular lens implantation (group 1) and before and after iatrogenic pupil dilation (group 2). Of the 173 eyes that could be enrolled before cataract surgery, 164 (94.8%) were easily recognized postoperatively, whereas in 9 (5.2%) this was not possible. However, these 9 eyes could be reenrolled and afterwards recognized successfully. In group 2, of a total of 184 eyes that were enrolled in miosis, a total of 22 (11.9%) could not be recognized in mydriasis and therefore needed reenrollment. No single case of false-positive acceptance occurred in either group. The results of this trial indicate that standard cataract surgery seems not to be a limiting factor for iris recognition in the large majority of cases. Some patients (5.2% in this study) might need "reenrollment" after cataract surgery. Iris recognition was primarily successful in eyes with medically dilated pupils in nearly 9 out of 10 eyes. No single case of false-positive acceptance occurred in either group in this trial. It seems therefore that iris recognition is a valid biometric method in the majority of cases after cataract surgery or after pupil dilation.
Automatic Modulation Classification Based on Deep Learning for Unmanned Aerial Vehicles.
Zhang, Duona; Ding, Wenrui; Zhang, Baochang; Xie, Chunyu; Li, Hongguang; Liu, Chunhui; Han, Jungong
2018-03-20
Deep learning has recently attracted much attention due to its excellent performance in processing audio, image, and video data. However, few studies are devoted to the field of automatic modulation classification (AMC). It is one of the most well-known research topics in communication signal recognition and remains challenging for traditional methods due to complex disturbance from other sources. This paper proposes a heterogeneous deep model fusion (HDMF) method to solve the problem in a unified framework. The contributions include the following: (1) a convolutional neural network (CNN) and long short-term memory (LSTM) are combined by two different ways without prior knowledge involved; (2) a large database, including eleven types of single-carrier modulation signals with various noises as well as a fading channel, is collected with various signal-to-noise ratios (SNRs) based on a real geographical environment; and (3) experimental results demonstrate that HDMF is very capable of coping with the AMC problem, and achieves much better performance when compared with the independent network.
Selected Topics from LVCSR Research for Asian Languages at Tokyo Tech
NASA Astrophysics Data System (ADS)
Furui, Sadaoki
This paper presents our recent work in regard to building Large Vocabulary Continuous Speech Recognition (LVCSR) systems for the Thai, Indonesian, and Chinese languages. For Thai, since there is no word boundary in the written form, we have proposed a new method for automatically creating word-like units from a text corpus, and applied topic and speaking style adaptation to the language model to recognize spoken-style utterances. For Indonesian, we have applied proper noun-specific adaptation to acoustic modeling, and rule-based English-to-Indonesian phoneme mapping to solve the problem of large variation in proper noun and English word pronunciation in a spoken-query information retrieval system. In spoken Chinese, long organization names are frequently abbreviated, and abbreviated utterances cannot be recognized if the abbreviations are not included in the dictionary. We have proposed a new method for automatically generating Chinese abbreviations, and by expanding the vocabulary using the generated abbreviations, we have significantly improved the performance of spoken query-based search.
Automatic Modulation Classification Based on Deep Learning for Unmanned Aerial Vehicles
Ding, Wenrui; Zhang, Baochang; Xie, Chunyu; Li, Hongguang; Liu, Chunhui; Han, Jungong
2018-01-01
Deep learning has recently attracted much attention due to its excellent performance in processing audio, image, and video data. However, few studies are devoted to the field of automatic modulation classification (AMC). It is one of the most well-known research topics in communication signal recognition and remains challenging for traditional methods due to complex disturbance from other sources. This paper proposes a heterogeneous deep model fusion (HDMF) method to solve the problem in a unified framework. The contributions include the following: (1) a convolutional neural network (CNN) and long short-term memory (LSTM) are combined by two different ways without prior knowledge involved; (2) a large database, including eleven types of single-carrier modulation signals with various noises as well as a fading channel, is collected with various signal-to-noise ratios (SNRs) based on a real geographical environment; and (3) experimental results demonstrate that HDMF is very capable of coping with the AMC problem, and achieves much better performance when compared with the independent network. PMID:29558434
Shinozaki, Takahiro
2018-01-01
Human-computer interface systems whose input is based on eye movements can serve as a means of communication for patients with locked-in syndrome. Eye-writing is one such system; users can input characters by moving their eyes to follow the lines of the strokes corresponding to characters. Although this input method makes it easy for patients to get started because of their familiarity with handwriting, existing eye-writing systems suffer from slow input rates because they require a pause between input characters to simplify the automatic recognition process. In this paper, we propose a continuous eye-writing recognition system that achieves a rapid input rate because it accepts characters eye-written continuously, with no pauses. For recognition purposes, the proposed system first detects eye movements using electrooculography (EOG), and then a hidden Markov model (HMM) is applied to model the EOG signals and recognize the eye-written characters. Additionally, this paper investigates an EOG adaptation that uses a deep neural network (DNN)-based HMM. Experiments with six participants showed an average input speed of 27.9 character/min using Japanese Katakana as the input target characters. A Katakana character-recognition error rate of only 5.0% was achieved using 13.8 minutes of adaptation data. PMID:29425248
Robust Tomato Recognition for Robotic Harvesting Using Feature Images Fusion
Zhao, Yuanshen; Gong, Liang; Huang, Yixiang; Liu, Chengliang
2016-01-01
Automatic recognition of mature fruits in a complex agricultural environment is still a challenge for an autonomous harvesting robot due to various disturbances existing in the background of the image. The bottleneck to robust fruit recognition is reducing influence from two main disturbances: illumination and overlapping. In order to recognize the tomato in the tree canopy using a low-cost camera, a robust tomato recognition algorithm based on multiple feature images and image fusion was studied in this paper. Firstly, two novel feature images, the a*-component image and the I-component image, were extracted from the L*a*b* color space and luminance, in-phase, quadrature-phase (YIQ) color space, respectively. Secondly, wavelet transformation was adopted to fuse the two feature images at the pixel level, which combined the feature information of the two source images. Thirdly, in order to segment the target tomato from the background, an adaptive threshold algorithm was used to get the optimal threshold. The final segmentation result was processed by morphology operation to reduce a small amount of noise. In the detection tests, 93% target tomatoes were recognized out of 200 overall samples. It indicates that the proposed tomato recognition method is available for robotic tomato harvesting in the uncontrolled environment with low cost. PMID:26840313
Pattern recognition for passive polarimetric data using nonparametric classifiers
NASA Astrophysics Data System (ADS)
Thilak, Vimal; Saini, Jatinder; Voelz, David G.; Creusere, Charles D.
2005-08-01
Passive polarization based imaging is a useful tool in computer vision and pattern recognition. A passive polarization imaging system forms a polarimetric image from the reflection of ambient light that contains useful information for computer vision tasks such as object detection (classification) and recognition. Applications of polarization based pattern recognition include material classification and automatic shape recognition. In this paper, we present two target detection algorithms for images captured by a passive polarimetric imaging system. The proposed detection algorithms are based on Bayesian decision theory. In these approaches, an object can belong to one of any given number classes and classification involves making decisions that minimize the average probability of making incorrect decisions. This minimum is achieved by assigning an object to the class that maximizes the a posteriori probability. Computing a posteriori probabilities requires estimates of class conditional probability density functions (likelihoods) and prior probabilities. A Probabilistic neural network (PNN), which is a nonparametric method that can compute Bayes optimal boundaries, and a -nearest neighbor (KNN) classifier, is used for density estimation and classification. The proposed algorithms are applied to polarimetric image data gathered in the laboratory with a liquid crystal-based system. The experimental results validate the effectiveness of the above algorithms for target detection from polarimetric data.
Towards a smart glove: arousal recognition based on textile Electrodermal Response.
Valenza, Gaetano; Lanata, Antonio; Scilingo, Enzo Pasquale; De Rossi, Danilo
2010-01-01
This paper investigates the possibility of using Electrodermal Response, acquired by a sensing fabric glove with embedded textile electrodes, as reliable means for emotion recognition. Here, all the essential steps for an automatic recognition system are described, from the recording of physiological data set to a feature-based multiclass classification. Data were collected from 35 healthy volunteers during arousal elicitation by means of International Affective Picture System (IAPS) pictures. Experimental results show high discrimination after twenty steps of cross validation.
Analysis of facial expressions in parkinson's disease through video-based automatic methods.
Bandini, Andrea; Orlandi, Silvia; Escalante, Hugo Jair; Giovannelli, Fabio; Cincotta, Massimo; Reyes-Garcia, Carlos A; Vanni, Paola; Zaccara, Gaetano; Manfredi, Claudia
2017-04-01
The automatic analysis of facial expressions is an evolving field that finds several clinical applications. One of these applications is the study of facial bradykinesia in Parkinson's disease (PD), which is a major motor sign of this neurodegenerative illness. Facial bradykinesia consists in the reduction/loss of facial movements and emotional facial expressions called hypomimia. In this work we propose an automatic method for studying facial expressions in PD patients relying on video-based METHODS: 17 Parkinsonian patients and 17 healthy control subjects were asked to show basic facial expressions, upon request of the clinician and after the imitation of a visual cue on a screen. Through an existing face tracker, the Euclidean distance of the facial model from a neutral baseline was computed in order to quantify the changes in facial expressivity during the tasks. Moreover, an automatic facial expressions recognition algorithm was trained in order to study how PD expressions differed from the standard expressions. Results show that control subjects reported on average higher distances than PD patients along the tasks. This confirms that control subjects show larger movements during both posed and imitated facial expressions. Moreover, our results demonstrate that anger and disgust are the two most impaired expressions in PD patients. Contactless video-based systems can be important techniques for analyzing facial expressions also in rehabilitation, in particular speech therapy, where patients could get a definite advantage from a real-time feedback about the proper facial expressions/movements to perform. Copyright © 2017 Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
Warmington, Meesha; Hulme, Charles
2012-01-01
This study examines the concurrent relationships between phoneme awareness, visual-verbal paired-associate learning, rapid automatized naming (RAN), and reading skills in 7- to 11-year-old children. Path analyses showed that visual-verbal paired-associate learning and RAN, but not phoneme awareness, were unique predictors of word recognition,…
Assessing Children's Home Language Environments Using Automatic Speech Recognition Technology
ERIC Educational Resources Information Center
Greenwood, Charles R.; Thiemann-Bourque, Kathy; Walker, Dale; Buzhardt, Jay; Gilkerson, Jill
2011-01-01
The purpose of this research was to replicate and extend some of the findings of Hart and Risley using automatic speech processing instead of human transcription of language samples. The long-term goal of this work is to make the current approach to speech processing possible by researchers and clinicians working on a daily basis with families and…
Hayes, Scott M.; Baena, Elsa; Truong, Trong-Kha; Cabeza, Roberto
2011-01-01
Although people do not normally try to remember associations between faces and physical contexts, these associations are established automatically, as indicated by the difficulty of recognizing familiar faces in different contexts (“butcher-on-the-bus” phenomenon). The present functional MRI (fMRI) study investigated the automatic binding of faces and scenes. In the Face-Face (F-F) condition, faces were presented alone during both encoding and retrieval, whereas in the Face/Scene-Face (FS-F) condition, they were presented overlaid on scenes during encoding but alone during retrieval (context change). Although participants were instructed to focus only on the faces during both encoding and retrieval, recognition performance was worse in the FS-F than the F-F condition (“context shift decrement”—CSD), confirming automatic face-scene binding during encoding. This binding was mediated by the hippocampus as indicated by greater subsequent memory effects (remembered > forgotten) in this region for the FS-F than the F-F condition. Scene memory was mediated by the right parahippocampal cortex, which was reactivated during successful retrieval when the faces were associated with a scene during encoding (FS-F condition). Analyses using the CSD as a regressor yielded a clear hemispheric asymmetry in medial temporal lobe activity during encoding: left hippocampal and parahippocampal activity was associated with a smaller CSD, indicating more flexible memory representations immune to context changes, whereas right hippocampal/rhinal activity was associated with a larger CSD, indicating less flexible representations sensitive to context change. Taken together, the results clarify the neural mechanisms of context effects on face recognition. PMID:19925208
Computer aided analysis of gait patterns in patients with acute anterior cruciate ligament injury.
Christian, Josef; Kröll, Josef; Strutzenberger, Gerda; Alexander, Nathalie; Ofner, Michael; Schwameder, Hermann
2016-03-01
Gait analysis is a useful tool to evaluate the functional status of patients with anterior cruciate ligament injury. Pattern recognition methods can be used to automatically assess walking patterns and objectively support clinical decisions. This study aimed to test a pattern recognition system for analyzing kinematic gait patterns of recently anterior cruciate ligament injured patients and for evaluating the effects of a therapeutic treatment. Gait kinematics of seven male patients with an acute unilateral anterior cruciate ligament rupture and seven healthy males were recorded. A support vector machine was trained to distinguish the groups. Principal component analysis and recursive feature elimination were used to extract features from 3D marker trajectories. A Classifier Oriented Gait Score was defined as a measure of gait quality. Visualizations were used to allow functional interpretations of characteristic group differences. The injured group was evaluated by the system after a therapeutic treatment. The results were compared against a clinical rating of the patients' gait. Cross validation yielded 100% accuracy. After the treatment the score improved significantly (P<0.01) as well as the clinical rating (P<0.05). The visualizations revealed characteristic kinematic features, which differentiated between the groups. The results show that gait alterations in the early phase after anterior cruciate ligament injury can be detected automatically. The results of the automatic analysis are comparable with the clinical rating and support the validity of the system. The visualizations allow interpretations on discriminatory features and can facilitate the integration of the results into the diagnostic process. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Toma, Eiji
2018-06-01
In recent years, as the weight of IT equipment has been reduced, the demand for motor fans for cooling the interior of electronic equipment is on the rise. Sensory test technique by inspectors is the mainstream for quality inspection of motor fans in the field. This sensory test requires a lot of experience to accurately diagnose differences in subtle sounds (sound pressures) of the fans, and the judgment varies depending on the condition of the inspector and the environment. In order to solve these quality problems, development of an analysis method capable of quantitatively and automatically diagnosing the sound/vibration level of a fan is required. In this study, it was clarified that the analysis method applying the MT system based on the waveform information of noise and vibration is more effective than the conventional frequency analysis method for the discrimination diagnosis technology of normal and abnormal items. Furthermore, it was found that due to the automation of the vibration waveform analysis system, there was a factor influencing the discrimination accuracy in relation between the fan installation posture and the vibration waveform.
Assessing the performance of a covert automatic target recognition algorithm
NASA Astrophysics Data System (ADS)
Ehrman, Lisa M.; Lanterman, Aaron D.
2005-05-01
Passive radar systems exploit illuminators of opportunity, such as TV and FM radio, to illuminate potential targets. Doing so allows them to operate covertly and inexpensively. Our research seeks to enhance passive radar systems by adding automatic target recognition (ATR) capabilities. In previous papers we proposed conducting ATR by comparing the radar cross section (RCS) of aircraft detected by a passive radar system to the precomputed RCS of aircraft in the target class. To effectively model the low-frequency setting, the comparison is made via a Rician likelihood model. Monte Carlo simulations indicate that the approach is viable. This paper builds on that work by developing a method for quickly assessing the potential performance of the ATR algorithm without using exhaustive Monte Carlo trials. This method exploits the relation between the probability of error in a binary hypothesis test under the Bayesian framework to the Chernoff information. Since the data are well-modeled as Rician, we begin by deriving a closed-form approximation for the Chernoff information between two Rician densities. This leads to an approximation for the probability of error in the classification algorithm that is a function of the number of available measurements. We conclude with an application that would be particularly cumbersome to accomplish via Monte Carlo trials, but that can be quickly addressed using the Chernoff information approach. This application evaluates the length of time that an aircraft must be tracked before the probability of error in the ATR algorithm drops below a desired threshold.
Research and Development of Fully Automatic Alien Smoke Stack and Packaging System
NASA Astrophysics Data System (ADS)
Yang, Xudong; Ge, Qingkuan; Peng, Tao; Zuo, Ping; Dong, Weifu
2017-12-01
The problem of low efficiency of manual sorting packaging for the current tobacco distribution center, which developed a set of safe efficient and automatic type of alien smoke stack and packaging system. The functions of fully automatic alien smoke stack and packaging system adopt PLC control technology, servo control technology, robot technology, image recognition technology and human-computer interaction technology. The characteristics, principles, control process and key technology of the system are discussed in detail. Through the installation and commissioning fully automatic alien smoke stack and packaging system has a good performance and has completed the requirements for shaped cigarette.
NASA Astrophysics Data System (ADS)
Polsterer, K. L.; Gieseke, F.; Igel, C.
2015-09-01
In the last decades more and more all-sky surveys created an enormous amount of data which is publicly available on the Internet. Crowd-sourcing projects such as Galaxy-Zoo and Radio-Galaxy-Zoo used encouraged users from all over the world to manually conduct various classification tasks. The combination of the pattern-recognition capabilities of thousands of volunteers enabled scientists to finish the data analysis within acceptable time. For up-coming surveys with billions of sources, however, this approach is not feasible anymore. In this work, we present an unsupervised method that can automatically process large amounts of galaxy data and which generates a set of prototypes. This resulting model can be used to both visualize the given galaxy data as well as to classify so far unseen images.
Automatic classification of seismic events within a regional seismograph network
NASA Astrophysics Data System (ADS)
Tiira, Timo; Kortström, Jari; Uski, Marja
2015-04-01
A fully automatic method for seismic event classification within a sparse regional seismograph network is presented. The tool is based on a supervised pattern recognition technique, Support Vector Machine (SVM), trained here to distinguish weak local earthquakes from a bulk of human-made or spurious seismic events. The classification rules rely on differences in signal energy distribution between natural and artificial seismic sources. Seismic records are divided into four windows, P, P coda, S, and S coda. For each signal window STA is computed in 20 narrow frequency bands between 1 and 41 Hz. The 80 discrimination parameters are used as a training data for the SVM. The SVM models are calculated for 19 on-line seismic stations in Finland. The event data are compiled mainly from fully automatic event solutions that are manually classified after automatic location process. The station-specific SVM training events include 11-302 positive (earthquake) and 227-1048 negative (non-earthquake) examples. The best voting rules for combining results from different stations are determined during an independent testing period. Finally, the network processing rules are applied to an independent evaluation period comprising 4681 fully automatic event determinations, of which 98 % have been manually identified as explosions or noise and 2 % as earthquakes. The SVM method correctly identifies 94 % of the non-earthquakes and all the earthquakes. The results imply that the SVM tool can identify and filter out blasts and spurious events from fully automatic event solutions with a high level of confidence. The tool helps to reduce work-load in manual seismic analysis by leaving only ~5 % of the automatic event determinations, i.e. the probable earthquakes for more detailed seismological analysis. The approach presented is easy to adjust to requirements of a denser or wider high-frequency network, once enough training examples for building a station-specific data set are available.
NASA Astrophysics Data System (ADS)
Yang, Jing; Wang, Cheng; Cai, Gan; Dong, Xiaona
2016-10-01
The incidence and mortality rate of the primary liver cancer are very high and its postoperative metastasis and recurrence have become important factors to the prognosis of patients. Circulating tumor cells (CTC), as a new tumor marker, play important roles in the early diagnosis and individualized treatment. This paper presents an effective method to distinguish liver cancer based on the cellular scattering spectrum, which is a non-fluorescence technique based on the fiber confocal microscopic spectrometer. Combining the principal component analysis (PCA) with back propagation (BP) neural network were utilized to establish an automatic recognition model for backscatter spectrum of the liver cancer cells from blood cell. PCA was applied to reduce the dimension of the scattering spectral data which obtained by the fiber confocal microscopic spectrometer. After dimensionality reduction by PCA, a neural network pattern recognition model with 2 input layer nodes, 11 hidden layer nodes, 3 output nodes was established. We trained the network with 66 samples and also tested it. Results showed that the recognition rate of the three types of cells is more than 90%, the relative standard deviation is only 2.36%. The experimental results showed that the fiber confocal microscopic spectrometer combining with the algorithm of PCA and BP neural network can automatically identify the liver cancer cell from the blood cells. This will provide a better tool for investigating the metastasis of liver cancers in vivo, the biology metabolic characteristics of liver cancers and drug transportation. Additionally, it is obviously referential in practical application.
Automatic casting surface defect recognition and classification
NASA Astrophysics Data System (ADS)
Wong, Boon K.; Elliot, M. P.; Rapley, C. W.
1995-03-01
High integrity castings require surfaces free from defects to reduce, if not eliminate, vulnerability to component failure from such as physical or thermal fatigue or corrosion attack. Previous studies have shown that defects on casting surfaces can be optically enhanced from the surrounding randomly textured surface by liquid penetrants, magnetic particle and other methods. However, very little has been reported on recognition and classification of the defects. The basic problem is one of shape recognition and classification, where the shape can vary in size and orientation as well as in actual shape generally within an envelope that classifies it as a particular defect. The initial work done towards this has focused on recognizing and classifying standard shapes such as the circle, square, rectangle and triangle. Various approaches were tried and this led eventually to a series of fuzzy logic based algorithms from which very good results were obtained. From this work fuzzy logic memberships were generated for the detection of defects found on casting surfaces. Simulated model shapes of such as the quench crack, mechanical crack and hole have been used to test the generated algorithm and the results for recognition and classification are very encouraging.
Syntax-directed content analysis of videotext: application to a map detection recognition system
NASA Astrophysics Data System (ADS)
Aradhye, Hrishikesh; Herson, James A.; Myers, Gregory
2003-01-01
Video is an increasingly important and ever-growing source of information to the intelligence and homeland defense analyst. A capability to automatically identify the contents of video imagery would enable the analyst to index relevant foreign and domestic news videos in a convenient and meaningful way. To this end, the proposed system aims to help determine the geographic focus of a news story directly from video imagery by detecting and geographically localizing political maps from news broadcasts, using the results of videotext recognition in lieu of a computationally expensive, scale-independent shape recognizer. Our novel method for the geographic localization of a map is based on the premise that the relative placement of text superimposed on a map roughly corresponds to the geographic coordinates of the locations the text represents. Our scheme extracts and recognizes videotext, and iteratively identifies the geographic area, while allowing for OCR errors and artistic freedom. The fast and reliable recognition of such maps by our system may provide valuable context and supporting evidence for other sources, such as speech recognition transcripts. The concepts of syntax-directed content analysis of videotext presented here can be extended to other content analysis systems.
2014-01-01
For building a new iris template, this paper proposes a strategy to fuse different portions of iris based on machine learning method to evaluate local quality of iris. There are three novelties compared to previous work. Firstly, the normalized segmented iris is divided into multitracks and then each track is estimated individually to analyze the recognition accuracy rate (RAR). Secondly, six local quality evaluation parameters are adopted to analyze texture information of each track. Besides, particle swarm optimization (PSO) is employed to get the weights of these evaluation parameters and corresponding weighted coefficients of different tracks. Finally, all tracks' information is fused according to the weights of different tracks. The experimental results based on subsets of three public and one private iris image databases demonstrate three contributions of this paper. (1) Our experimental results prove that partial iris image cannot completely replace the entire iris image for iris recognition system in several ways. (2) The proposed quality evaluation algorithm is a self-adaptive algorithm, and it can automatically optimize the parameters according to iris image samples' own characteristics. (3) Our feature information fusion strategy can effectively improve the performance of iris recognition system. PMID:24693243
Chen, Ying; Liu, Yuanning; Zhu, Xiaodong; Chen, Huiling; He, Fei; Pang, Yutong
2014-01-01
For building a new iris template, this paper proposes a strategy to fuse different portions of iris based on machine learning method to evaluate local quality of iris. There are three novelties compared to previous work. Firstly, the normalized segmented iris is divided into multitracks and then each track is estimated individually to analyze the recognition accuracy rate (RAR). Secondly, six local quality evaluation parameters are adopted to analyze texture information of each track. Besides, particle swarm optimization (PSO) is employed to get the weights of these evaluation parameters and corresponding weighted coefficients of different tracks. Finally, all tracks' information is fused according to the weights of different tracks. The experimental results based on subsets of three public and one private iris image databases demonstrate three contributions of this paper. (1) Our experimental results prove that partial iris image cannot completely replace the entire iris image for iris recognition system in several ways. (2) The proposed quality evaluation algorithm is a self-adaptive algorithm, and it can automatically optimize the parameters according to iris image samples' own characteristics. (3) Our feature information fusion strategy can effectively improve the performance of iris recognition system.
NASA Astrophysics Data System (ADS)
Abdullah, Nurul Azma; Saidi, Md. Jamri; Rahman, Nurul Hidayah Ab; Wen, Chuah Chai; Hamid, Isredza Rahmi A.
2017-10-01
In practice, identification of criminal in Malaysia is done through thumbprint identification. However, this type of identification is constrained as most of criminal nowadays getting cleverer not to leave their thumbprint on the scene. With the advent of security technology, cameras especially CCTV have been installed in many public and private areas to provide surveillance activities. The footage of the CCTV can be used to identify suspects on scene. However, because of limited software developed to automatically detect the similarity between photo in the footage and recorded photo of criminals, the law enforce thumbprint identification. In this paper, an automated facial recognition system for criminal database was proposed using known Principal Component Analysis approach. This system will be able to detect face and recognize face automatically. This will help the law enforcements to detect or recognize suspect of the case if no thumbprint present on the scene. The results show that about 80% of input photo can be matched with the template data.
Local Navon letter processing affects skilled behavior: a golf-putting experiment.
Lewis, Michael B; Dawkins, Gemma
2015-04-01
Expert or skilled behaviors (for example, face recognition or sporting performance) are typically performed automatically and with little conscious awareness. Previous studies, in various domains of performance, have shown that activities immediately prior to a task demanding a learned skill can affect performance. In sport, describing the to-be-performed action is detrimental, whereas in face recognition, describing a face or reading local Navon letters is detrimental. Two golf-putting experiments are presented that compare the effects that these three tasks have on experienced and novice golfers. Experiment 1 found a Navon effect on golf performance for experienced players. Experiment 2 found, for experienced players only, that performance was impaired following the three tasks described above, when compared with reading or global Navon tasks. It is suggested that the three tasks affect skilled performance by provoking a shift from automatic behavior to a more analytic style. By demonstrating similarities between effects in face recognition and sporting behavior, it is hoped to better understand concepts in both fields.
Classification of time-series images using deep convolutional neural networks
NASA Astrophysics Data System (ADS)
Hatami, Nima; Gavet, Yann; Debayle, Johan
2018-04-01
Convolutional Neural Networks (CNN) has achieved a great success in image recognition task by automatically learning a hierarchical feature representation from raw data. While the majority of Time-Series Classification (TSC) literature is focused on 1D signals, this paper uses Recurrence Plots (RP) to transform time-series into 2D texture images and then take advantage of the deep CNN classifier. Image representation of time-series introduces different feature types that are not available for 1D signals, and therefore TSC can be treated as texture image recognition task. CNN model also allows learning different levels of representations together with a classifier, jointly and automatically. Therefore, using RP and CNN in a unified framework is expected to boost the recognition rate of TSC. Experimental results on the UCR time-series classification archive demonstrate competitive accuracy of the proposed approach, compared not only to the existing deep architectures, but also to the state-of-the art TSC algorithms.
Automated phenotype pattern recognition of zebrafish for high-throughput screening.
Schutera, Mark; Dickmeis, Thomas; Mione, Marina; Peravali, Ravindra; Marcato, Daniel; Reischl, Markus; Mikut, Ralf; Pylatiuk, Christian
2016-07-03
Over the last years, the zebrafish (Danio rerio) has become a key model organism in genetic and chemical screenings. A growing number of experiments and an expanding interest in zebrafish research makes it increasingly essential to automatize the distribution of embryos and larvae into standard microtiter plates or other sample holders for screening, often according to phenotypical features. Until now, such sorting processes have been carried out by manually handling the larvae and manual feature detection. Here, a prototype platform for image acquisition together with a classification software is presented. Zebrafish embryos and larvae and their features such as pigmentation are detected automatically from the image. Zebrafish of 4 different phenotypes can be classified through pattern recognition at 72 h post fertilization (hpf), allowing the software to classify an embryo into 2 distinct phenotypic classes: wild-type versus variant. The zebrafish phenotypes are classified with an accuracy of 79-99% without any user interaction. A description of the prototype platform and of the algorithms for image processing and pattern recognition is presented.
On the recognition of emotional vocal expressions: motivations for a holistic approach.
Esposito, Anna; Esposito, Antonietta M
2012-10-01
Human beings seem to be able to recognize emotions from speech very well and information communication technology aims to implement machines and agents that can do the same. However, to be able to automatically recognize affective states from speech signals, it is necessary to solve two main technological problems. The former concerns the identification of effective and efficient processing algorithms capable of capturing emotional acoustic features from speech sentences. The latter focuses on finding computational models able to classify, with an approximation as good as human listeners, a given set of emotional states. This paper will survey these topics and provide some insights for a holistic approach to the automatic analysis, recognition and synthesis of affective states.
Automatic welding detection by an intelligent tool pipe inspection
NASA Astrophysics Data System (ADS)
Arizmendi, C. J.; Garcia, W. L.; Quintero, M. A.
2015-07-01
This work provide a model based on machine learning techniques in welds recognition, based on signals obtained through in-line inspection tool called “smart pig” in Oil and Gas pipelines. The model uses a signal noise reduction phase by means of pre-processing algorithms and attribute-selection techniques. The noise reduction techniques were selected after a literature review and testing with survey data. Subsequently, the model was trained using recognition and classification algorithms, specifically artificial neural networks and support vector machines. Finally, the trained model was validated with different data sets and the performance was measured with cross validation and ROC analysis. The results show that is possible to identify welding automatically with an efficiency between 90 and 98 percent.
Instrument-independent analysis of music by means of the continuous wavelet transform
NASA Astrophysics Data System (ADS)
Olmo, Gabriella; Dovis, Fabio; Benotto, Paolo; Calosso, Claudio; Passaro, Pierluigi
1999-10-01
This paper deals with the problem of automatic recognition of music. Segments of digitized music are processed by means of a Continuous Wavelet Transform, properly chosen so as to match the spectral characteristics of the signal. In order to achieve a good time-scale representation of the signal components a novel wavelet has been designed suited to the musical signal features. particular care has been devoted towards an efficient implementation, which operates in the frequency domain, and includes proper segmentation and aliasing reduction techniques to make the analysis of long signals feasible. The method achieves very good performance in terms of both time and frequency selectivity, and can yield the estimate and the localization in time of both the fundamental frequency and the main harmonics of each tone. The analysis is used as a preprocessing step for a recognition algorithm, which we show to be almost independent on the instrument reproducing the sounds. Simulations are provided to demonstrate the effectiveness of the proposed method.
Techniques for generation of control and guidance signals derived from optical fields, part 2
NASA Technical Reports Server (NTRS)
Hemami, H.; Mcghee, R. B.; Gardner, S. R.
1971-01-01
The development is reported of a high resolution technique for the detection and identification of landmarks from spacecraft optical fields. By making use of nonlinear regression analysis, a method is presented whereby a sequence of synthetic images produced by a digital computer can be automatically adjusted to provide a least squares approximation to a real image. The convergence of the method is demonstrated by means of a computer simulation for both elliptical and rectangular patterns. Statistical simulation studies with elliptical and rectangular patterns show that the computational techniques developed are able to at least match human pattern recognition capabilities, even in the presence of large amounts of noise. Unlike most pattern recognition techniques, this ability is unaffected by arbitrary pattern rotation, translation, and scale change. Further development of the basic approach may eventually allow a spacecraft or robot vehicle to be provided with an ability to very accurately determine its spatial relationship to arbitrary known objects within its optical field of view.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moody, Daniela I.; Brumby, Steven P.; Rowland, Joel C.
Neuromimetic machine vision and pattern recognition algorithms are of great interest for landscape characterization and change detection in satellite imagery in support of global climate change science and modeling. We present results from an ongoing effort to extend machine vision methods to the environmental sciences, using adaptive sparse signal processing combined with machine learning. A Hebbian learning rule is used to build multispectral, multiresolution dictionaries from regional satellite normalized band difference index data. Land cover labels are automatically generated via our CoSA algorithm: Clustering of Sparse Approximations, using a clustering distance metric that combines spectral and spatial textural characteristics tomore » help separate geologic, vegetative, and hydrologie features. We demonstrate our method on example Worldview-2 satellite images of an Arctic region, and use CoSA labels to detect seasonal surface changes. In conclusion, our results suggest that neuroscience-based models are a promising approach to practical pattern recognition and change detection problems in remote sensing.« less
Moody, Daniela I.; Brumby, Steven P.; Rowland, Joel C.; ...
2014-10-01
Neuromimetic machine vision and pattern recognition algorithms are of great interest for landscape characterization and change detection in satellite imagery in support of global climate change science and modeling. We present results from an ongoing effort to extend machine vision methods to the environmental sciences, using adaptive sparse signal processing combined with machine learning. A Hebbian learning rule is used to build multispectral, multiresolution dictionaries from regional satellite normalized band difference index data. Land cover labels are automatically generated via our CoSA algorithm: Clustering of Sparse Approximations, using a clustering distance metric that combines spectral and spatial textural characteristics tomore » help separate geologic, vegetative, and hydrologie features. We demonstrate our method on example Worldview-2 satellite images of an Arctic region, and use CoSA labels to detect seasonal surface changes. In conclusion, our results suggest that neuroscience-based models are a promising approach to practical pattern recognition and change detection problems in remote sensing.« less
Real-time road detection in infrared imagery
NASA Astrophysics Data System (ADS)
Andre, Haritini E.; McCoy, Keith
1990-09-01
Automatic road detection is an important part in many scene recognition applications. The extraction of roads provides a means of navigation and position update for remotely piloted vehicles or autonomous vehicles. Roads supply strong contextual information which can be used to improve the performance of automatic target recognition (ATh) systems by directing the search for targets and adjusting target classification confidences. This paper will describe algorithmic techniques for labeling roads in high-resolution infrared imagery. In addition, realtime implementation of this structural approach using a processor array based on the Martin Marietta Geometric Arithmetic Parallel Processor (GAPPTh) chip will be addressed. The algorithm described is based on the hypothesis that a road consists of pairs of line segments separated by a distance "d" with opposite gradient directions (antiparallel). The general nature of the algorithm, in addition to its parallel implementation in a single instruction, multiple data (SIMD) machine, are improvements to existing work. The algorithm seeks to identify line segments meeting the road hypothesis in a manner that performs well, even when the side of the road is fragmented due to occlusion or intersections. The use of geometrical relationships between line segments is a powerful yet flexible method of road classification which is independent of orientation. In addition, this approach can be used to nominate other types of objects with minor parametric changes.
A Hybrid Acoustic and Pronunciation Model Adaptation Approach for Non-native Speech Recognition
NASA Astrophysics Data System (ADS)
Oh, Yoo Rhee; Kim, Hong Kook
In this paper, we propose a hybrid model adaptation approach in which pronunciation and acoustic models are adapted by incorporating the pronunciation and acoustic variabilities of non-native speech in order to improve the performance of non-native automatic speech recognition (ASR). Specifically, the proposed hybrid model adaptation can be performed at either the state-tying or triphone-modeling level, depending at which acoustic model adaptation is performed. In both methods, we first analyze the pronunciation variant rules of non-native speakers and then classify each rule as either a pronunciation variant or an acoustic variant. The state-tying level hybrid method then adapts pronunciation models and acoustic models by accommodating the pronunciation variants in the pronunciation dictionary and by clustering the states of triphone acoustic models using the acoustic variants, respectively. On the other hand, the triphone-modeling level hybrid method initially adapts pronunciation models in the same way as in the state-tying level hybrid method; however, for the acoustic model adaptation, the triphone acoustic models are then re-estimated based on the adapted pronunciation models and the states of the re-estimated triphone acoustic models are clustered using the acoustic variants. From the Korean-spoken English speech recognition experiments, it is shown that ASR systems employing the state-tying and triphone-modeling level adaptation methods can relatively reduce the average word error rates (WERs) by 17.1% and 22.1% for non-native speech, respectively, when compared to a baseline ASR system.
Processing Strategy and PI Effects in Recognition Memory of Word Lists.
ERIC Educational Resources Information Center
Hodge, Milton H.; Britton, Bruce K.
Previous research by A. I. Schulman argued that an observed systematic decline in recognition memory in long word lists was due to the build-up of input and output proactive interference (PI). It also suggested that input PI resulted from process automatization; that is, each list item was processed or encoded in much the same way, producing a set…
Tree-structured sensor fusion architecture for distributed sensor networks
NASA Astrophysics Data System (ADS)
Iyengar, S. Sitharama; Kashyap, Rangasami L.; Madan, Rabinder N.; Thomas, Daryl D.
1990-10-01
An assessment of numerous activities in the field of multisensor target recognition reveals several trends and conditions which are cause for concern. .These concerns are analyzed in terms of their potential impact on the ultimate employment of automatic target recognition in military systems. Suggestions for additional investigation and guidance for current activities are presented with respect to some of the identified concerns.
26 CFR 1.338(h)(10)-1 - Deemed asset sale and liquidation.
Code of Federal Regulations, 2014 CFR
2014-04-01
...)(iii) of this section, K recognizes no gain or loss, and K's basis in its T stock remains at $5,000... section 338(h)(10) election for T are as follows: (1) P. P is automatically deemed to have made a gain recognition election for its nonrecently purchased T stock, if any. The effect of a gain recognition election...
26 CFR 1.338(h)(10)-1 - Deemed asset sale and liquidation.
Code of Federal Regulations, 2012 CFR
2012-04-01
...)(iii) of this section, K recognizes no gain or loss, and K's basis in its T stock remains at $5,000... section 338(h)(10) election for T are as follows: (1) P. P is automatically deemed to have made a gain recognition election for its nonrecently purchased T stock, if any. The effect of a gain recognition election...
26 CFR 1.338(h)(10)-1 - Deemed asset sale and liquidation.
Code of Federal Regulations, 2013 CFR
2013-04-01
...)(iii) of this section, K recognizes no gain or loss, and K's basis in its T stock remains at $5,000... section 338(h)(10) election for T are as follows: (1) P. P is automatically deemed to have made a gain recognition election for its nonrecently purchased T stock, if any. The effect of a gain recognition election...
An automatic target recognition system based on SAR image
NASA Astrophysics Data System (ADS)
Li, Qinfu; Wang, Jinquan; Zhao, Bo; Luo, Furen; Xu, Xiaojian
2009-10-01
In this paper, an automatic target recognition (ATR) system based on synthetic aperture radar (SAR) is proposed. This ATR system can play an important role in the simulation of up-to-data battlefield environment and be used in ATR research. To establish an integral and available system, the processing of SAR image was divided into four main stages which are de-noise, detection, cluster-discrimination and segment-recognition, respectively. The first three stages are used for searching region of interest (ROI). Once the ROIs are extracted, the recognition stage will be taken to compute the similarity between the ROIs and the templates in the electromagnetic simulation software National Electromagnetic Scattering Code (NESC). Due to the lack of the SAR raw data, the electromagnetic simulated images are added to the measured SAR background to simulate the battlefield environment8. The purpose of the system is to find the ROIs which can be the artificial military targets such as tanks, armored cars and so on and to categorize the ROIs into the right classes according to the existing templates. From the results we can see that the proposed system achieves a satisfactory result.
Automatic recognition of surface landmarks of anatomical structures of back and posture
NASA Astrophysics Data System (ADS)
Michoński, Jakub; Glinkowski, Wojciech; Witkowski, Marcin; Sitnik, Robert
2012-05-01
Faulty postures, scoliosis and sagittal plane deformities should be detected as early as possible to apply preventive and treatment measures against major clinical consequences. To support documentation of the severity of deformity and diminish x-ray exposures, several solutions utilizing analysis of back surface topography data were introduced. A novel approach to automatic recognition and localization of anatomical landmarks of the human back is presented that may provide more repeatable results and speed up the whole procedure. The algorithm was designed as a two-step process involving a statistical model built upon expert knowledge and analysis of three-dimensional back surface shape data. Voronoi diagram is used to connect mean geometric relations, which provide a first approximation of the positions, with surface curvature distribution, which further guides the recognition process and gives final locations of landmarks. Positions obtained using the developed algorithms are validated with respect to accuracy of manual landmark indication by experts. Preliminary validation proved that the landmarks were localized correctly, with accuracy depending mostly on the characteristics of a given structure. It was concluded that recognition should mainly take into account the shape of the back surface, putting as little emphasis on the statistical approximation as possible.
NASA Astrophysics Data System (ADS)
Zafar, I.; Edirisinghe, E. A.; Acar, S.; Bez, H. E.
2007-02-01
Automatic vehicle Make and Model Recognition (MMR) systems provide useful performance enhancements to vehicle recognitions systems that are solely based on Automatic License Plate Recognition (ALPR) systems. Several car MMR systems have been proposed in literature. However these approaches are based on feature detection algorithms that can perform sub-optimally under adverse lighting and/or occlusion conditions. In this paper we propose a real time, appearance based, car MMR approach using Two Dimensional Linear Discriminant Analysis that is capable of addressing this limitation. We provide experimental results to analyse the proposed algorithm's robustness under varying illumination and occlusions conditions. We have shown that the best performance with the proposed 2D-LDA based car MMR approach is obtained when the eigenvectors of lower significance are ignored. For the given database of 200 car images of 25 different make-model classifications, a best accuracy of 91% was obtained with the 2D-LDA approach. We use a direct Principle Component Analysis (PCA) based approach as a benchmark to compare and contrast the performance of the proposed 2D-LDA approach to car MMR. We conclude that in general the 2D-LDA based algorithm supersedes the performance of the PCA based approach.
Automated location detection of injection site for preclinical stereotactic neurosurgery procedure
NASA Astrophysics Data System (ADS)
Abbaszadeh, Shiva; Wu, Hemmings C. H.
2017-03-01
Currently, during stereotactic neurosurgery procedures, the manual task of locating the proper area for needle insertion or implantation of electrode/cannula/optic fiber can be time consuming. The requirement of the task is to quickly and accurately find the location for insertion. In this study we investigate an automated method to locate the entry point of region of interest. This method leverages a digital image capture system, pattern recognition, and motorized stages. Template matching of known anatomical identifiable regions is used to find regions of interest (e.g. Bregma) in rodents. For our initial study, we tackle the problem of automatically detecting the entry point.
Automated target recognition and tracking using an optical pattern recognition neural network
NASA Technical Reports Server (NTRS)
Chao, Tien-Hsin
1991-01-01
The on-going development of an automatic target recognition and tracking system at the Jet Propulsion Laboratory is presented. This system is an optical pattern recognition neural network (OPRNN) that is an integration of an innovative optical parallel processor and a feature extraction based neural net training algorithm. The parallel optical processor provides high speed and vast parallelism as well as full shift invariance. The neural network algorithm enables simultaneous discrimination of multiple noisy targets in spite of their scales, rotations, perspectives, and various deformations. This fully developed OPRNN system can be effectively utilized for the automated spacecraft recognition and tracking that will lead to success in the Automated Rendezvous and Capture (AR&C) of the unmanned Cargo Transfer Vehicle (CTV). One of the most powerful optical parallel processors for automatic target recognition is the multichannel correlator. With the inherent advantages of parallel processing capability and shift invariance, multiple objects can be simultaneously recognized and tracked using this multichannel correlator. This target tracking capability can be greatly enhanced by utilizing a powerful feature extraction based neural network training algorithm such as the neocognitron. The OPRNN, currently under investigation at JPL, is constructed with an optical multichannel correlator where holographic filters have been prepared using the neocognitron training algorithm. The computation speed of the neocognitron-type OPRNN is up to 10(exp 14) analog connections/sec that enabling the OPRNN to outperform its state-of-the-art electronics counterpart by at least two orders of magnitude.
Emotion and language: Valence and arousal affect word recognition
Brysbaert, Marc; Warriner, Amy Beth
2014-01-01
Emotion influences most aspects of cognition and behavior, but emotional factors are conspicuously absent from current models of word recognition. The influence of emotion on word recognition has mostly been reported in prior studies on the automatic vigilance for negative stimuli, but the precise nature of this relationship is unclear. Various models of automatic vigilance have claimed that the effect of valence on response times is categorical, an inverted-U, or interactive with arousal. The present study used a sample of 12,658 words, and included many lexical and semantic control factors, to determine the precise nature of the effects of arousal and valence on word recognition. Converging empirical patterns observed in word-level and trial-level data from lexical decision and naming indicate that valence and arousal exert independent monotonic effects: Negative words are recognized more slowly than positive words, and arousing words are recognized more slowly than calming words. Valence explained about 2% of the variance in word recognition latencies, whereas the effect of arousal was smaller. Valence and arousal do not interact, but both interact with word frequency, such that valence and arousal exert larger effects among low-frequency words than among high-frequency words. These results necessitate a new model of affective word processing whereby the degree of negativity monotonically and independently predicts the speed of responding. This research also demonstrates that incorporating emotional factors, especially valence, improves the performance of models of word recognition. PMID:24490848
Wang, Jiang-Ning; Chen, Xiao-Lin; Hou, Xin-Wen; Zhou, Li-Bing; Zhu, Chao-Dong; Ji, Li-Qiang
2017-07-01
Many species of Tephritidae are damaging to fruit, which might negatively impact international fruit trade. Automatic or semi-automatic identification of fruit flies are greatly needed for diagnosing causes of damage and quarantine protocols for economically relevant insects. A fruit fly image identification system named AFIS1.0 has been developed using 74 species belonging to six genera, which include the majority of pests in the Tephritidae. The system combines automated image identification and manual verification, balancing operability and accuracy. AFIS1.0 integrates image analysis and expert system into a content-based image retrieval framework. In the the automatic identification module, AFIS1.0 gives candidate identification results. Afterwards users can do manual selection based on comparing unidentified images with a subset of images corresponding to the automatic identification result. The system uses Gabor surface features in automated identification and yielded an overall classification success rate of 87% to the species level by Independent Multi-part Image Automatic Identification Test. The system is useful for users with or without specific expertise on Tephritidae in the task of rapid and effective identification of fruit flies. It makes the application of computer vision technology to fruit fly recognition much closer to production level. © 2016 Society of Chemical Industry. © 2016 Society of Chemical Industry.
Pires, Ivan Miguel; Garcia, Nuno M; Pombo, Nuno; Flórez-Revuelta, Francisco; Spinsante, Susanna
2018-02-21
Sensors available on mobile devices allow the automatic identification of Activities of Daily Living (ADL). This paper describes an approach for the creation of a framework for the identification of ADL, taking into account several concepts, including data acquisition, data processing, data fusion, and pattern recognition. These concepts can be mapped onto different modules of the framework. The proposed framework should perform the identification of ADL without Internet connection, performing these tasks locally on the mobile device, taking in account the hardware and software limitations of these devices. The main purpose of this paper is to present a new approach for the creation of a framework for the recognition of ADL, analyzing the allowed sensors available in the mobile devices, and the existing methods available in the literature.
Pombo, Nuno
2018-01-01
Sensors available on mobile devices allow the automatic identification of Activities of Daily Living (ADL). This paper describes an approach for the creation of a framework for the identification of ADL, taking into account several concepts, including data acquisition, data processing, data fusion, and pattern recognition. These concepts can be mapped onto different modules of the framework. The proposed framework should perform the identification of ADL without Internet connection, performing these tasks locally on the mobile device, taking in account the hardware and software limitations of these devices. The main purpose of this paper is to present a new approach for the creation of a framework for the recognition of ADL, analyzing the allowed sensors available in the mobile devices, and the existing methods available in the literature. PMID:29466316
Unsupervised pattern recognition methods in ciders profiling based on GCE voltammetric signals.
Jakubowska, Małgorzata; Sordoń, Wanda; Ciepiela, Filip
2016-07-15
This work presents a complete methodology of distinguishing between different brands of cider and ageing degrees, based on voltammetric signals, utilizing dedicated data preprocessing procedures and unsupervised multivariate analysis. It was demonstrated that voltammograms recorded on glassy carbon electrode in Britton-Robinson buffer at pH 2 are reproducible for each brand. By application of clustering algorithms and principal component analysis visible homogenous clusters were obtained. Advanced signal processing strategy which included automatic baseline correction, interval scaling and continuous wavelet transform with dedicated mother wavelet, was a key step in the correct recognition of the objects. The results show that voltammetry combined with optimized univariate and multivariate data processing is a sufficient tool to distinguish between ciders from various brands and to evaluate their freshness. Copyright © 2016 Elsevier Ltd. All rights reserved.
Impact of translation on named-entity recognition in radiology texts
Pedro, Vasco
2017-01-01
Abstract Radiology reports describe the results of radiography procedures and have the potential of being a useful source of information which can bring benefits to health care systems around the world. One way to automatically extract information from the reports is by using Text Mining tools. The problem is that these tools are mostly developed for English and reports are usually written in the native language of the radiologist, which is not necessarily English. This creates an obstacle to the sharing of Radiology information between different communities. This work explores the solution of translating the reports to English before applying the Text Mining tools, probing the question of what translation approach should be used. We created MRRAD (Multilingual Radiology Research Articles Dataset), a parallel corpus of Portuguese research articles related to Radiology and a number of alternative translations (human, automatic and semi-automatic) to English. This is a novel corpus which can be used to move forward the research on this topic. Using MRRAD we studied which kind of automatic or semi-automatic translation approach is more effective on the Named-entity recognition task of finding RadLex terms in the English version of the articles. Considering the terms extracted from human translations as our gold standard, we calculated how similar to this standard were the terms extracted using other translations. We found that a completely automatic translation approach using Google leads to F-scores (between 0.861 and 0.868, depending on the extraction approach) similar to the ones obtained through a more expensive semi-automatic translation approach using Unbabel (between 0.862 and 0.870). To better understand the results we also performed a qualitative analysis of the type of errors found in the automatic and semi-automatic translations. Database URL: https://github.com/lasigeBioTM/MRRAD PMID:29220455
Van Strien, Jan W; Glimmerveen, Johanna C; Franken, Ingmar H A; Martens, Vanessa E G; de Bruin, Eveline A
2011-09-01
To examine the development of recognition memory in primary-school children, 36 healthy younger children (8-9 years old) and 36 healthy older children (11-12 years old) participated in an ERP study with an extended continuous face recognition task (Study 1). Each face of a series of 30 faces was shown randomly six times interspersed with distracter faces. The children were required to make old vs. new decisions. Older children responded faster than younger children, but younger children exhibited a steeper decrease in latencies across the five repetitions. Older children exhibited better accuracy for new faces, but there were no age differences in recognition accuracy for repeated faces. For the N2, N400 and late positive complex (LPC), we analyzed the old/new effects (repetition 1 vs. new presentation) and the extended repetition effects (repetitions 1 through 5). Compared to older children, younger children exhibited larger frontocentral N2 and N400 old/new effects. For extended face repetitions, negativity of the N2 and N400 decreased in a linear fashion in both age groups. For the LPC, an ERP component thought to reflect recollection, no significant old/new or extended repetition effects were found. Employing the same face recognition paradigm in 20 adults (Study 2), we found a significant N400 old/new effect at lateral frontal sites and a significant LPC repetition effect at parietal sites, with LPC amplitudes increasing linearly with the number of repetitions. This study clearly demonstrates differential developmental courses for the N400 and LPC pertaining to recognition memory for faces. It is concluded that face recognition in children is mediated by early and probably more automatic than conscious recognition processes. In adults, the LPC extended repetition effect indicates that adult face recognition memory is related to a conscious and graded recollection process rather than to an automatic recognition process. © 2011 Blackwell Publishing Ltd.
NASA Astrophysics Data System (ADS)
Bouma, Henri; Baan, Jan; Burghouts, Gertjan J.; Eendebak, Pieter T.; van Huis, Jasper R.; Dijk, Judith; van Rest, Jeroen H. C.
2014-10-01
Proactive detection of incidents is required to decrease the cost of security incidents. This paper focusses on the automatic early detection of suspicious behavior of pickpockets with track-based features in a crowded shopping mall. Our method consists of several steps: pedestrian tracking, feature computation and pickpocket recognition. This is challenging because the environment is crowded, people move freely through areas which cannot be covered by a single camera, because the actual snatch is a subtle action, and because collaboration is complex social behavior. We carried out an experiment with more than 20 validated pickpocket incidents. We used a top-down approach to translate expert knowledge in features and rules, and a bottom-up approach to learn discriminating patterns with a classifier. The classifier was used to separate the pickpockets from normal passers-by who are shopping in the mall. We performed a cross validation to train and evaluate our system. In this paper, we describe our method, identify the most valuable features, and analyze the results that were obtained in the experiment. We estimate the quality of these features and the performance of automatic detection of (collaborating) pickpockets. The results show that many of the pickpockets can be detected at a low false alarm rate.
NASA Astrophysics Data System (ADS)
de Garidel-Thoron, T.; Marchant, R.; Soto, E.; Gally, Y.; Beaufort, L.; Bolton, C. T.; Bouslama, M.; Licari, L.; Mazur, J. C.; Brutti, J. M.; Norsa, F.
2017-12-01
Foraminifera tests are the main proxy carriers for paleoceanographic reconstructions. Both geochemical and taxonomical studies require large numbers of tests to achieve statistical relevance. To date, the extraction of foraminifera from the sediment coarse fraction is still done by hand and thus time-consuming. Moreover, the recognition of morphotypes, ecologically relevant, requires some taxonomical skills not easily taught. The automatic recognition and extraction of foraminifera would largely help paleoceanographers to overcome these issues. Recent advances in automatic image classification using machine learning opens the way to automatic extraction of foraminifera. Here we detail progress on the design of an automatic picking machine as part of the FIRST project. The machine handles 30 pre-sieved samples (100-1000µm), separating them into individual particles (including foraminifera) and imaging each in pseudo-3D. The particles are classified and specimens of interest are sorted either for Individual Foraminifera Analyses (44 per slide) and/or for classical multiple analyses (8 morphological classes per slide, up to 1000 individuals per hole). The classification is based on machine learning using Convolutional Neural Networks (CNNs), similar to the approach used in the coccolithophorid imaging system SYRACO. To prove its feasibility, we built two training image datasets of modern planktonic foraminifera containing approximately 2000 and 5000 images each, corresponding to 15 & 25 morphological classes. Using a CNN with a residual topology (ResNet) we achieve over 95% correct classification for each dataset. We tested the network on 160,000 images from 45 depths of a sediment core from the Pacific ocean, for which we have human counts. The current algorithm is able to reproduce the downcore variability in both Globigerinoides ruber and the fragmentation index (r2 = 0.58 and 0.88 respectively). The FIRST prototype yields some promising results for high-resolution paleoceanographic studies and evolutionary studies.
A transition-based joint model for disease named entity recognition and normalization.
Lou, Yinxia; Zhang, Yue; Qian, Tao; Li, Fei; Xiong, Shufeng; Ji, Donghong
2017-08-01
Disease named entities play a central role in many areas of biomedical research, and automatic recognition and normalization of such entities have received increasing attention in biomedical research communities. Existing methods typically used pipeline models with two independent phases: (i) a disease named entity recognition (DER) system is used to find the boundaries of mentions in text and (ii) a disease named entity normalization (DEN) system is used to connect the mentions recognized to concepts in a controlled vocabulary. The main problems of such models are: (i) there is error propagation from DER to DEN and (ii) DEN is useful for DER, but pipeline models cannot utilize this. We propose a transition-based model to jointly perform disease named entity recognition and normalization, casting the output construction process into an incremental state transition process, learning sequences of transition actions globally, which correspond to joint structural outputs. Beam search and online structured learning are used, with learning being designed to guide search. Compared with the only existing method for joint DEN and DER, our method allows non-local features to be used, which significantly improves the accuracies. We evaluate our model on two corpora: the BioCreative V Chemical Disease Relation (CDR) corpus and the NCBI disease corpus. Experiments show that our joint framework achieves significantly higher performances compared to competitive pipeline baselines. Our method compares favourably to other state-of-the-art approaches. Data and code are available at https://github.com/louyinxia/jointRN. dhji@whu.edu.cn. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Military personnel recognition system using texture, colour, and SURF features
NASA Astrophysics Data System (ADS)
Irhebhude, Martins E.; Edirisinghe, Eran A.
2014-06-01
This paper presents an automatic, machine vision based, military personnel identification and classification system. Classification is done using a Support Vector Machine (SVM) on sets of Army, Air Force and Navy camouflage uniform personnel datasets. In the proposed system, the arm of service of personnel is recognised by the camouflage of a persons uniform, type of cap and the type of badge/logo. The detailed analysis done include; camouflage cap and plain cap differentiation using gray level co-occurrence matrix (GLCM) texture feature; classification on Army, Air Force and Navy camouflaged uniforms using GLCM texture and colour histogram bin features; plain cap badge classification into Army, Air Force and Navy using Speed Up Robust Feature (SURF). The proposed method recognised camouflage personnel arm of service on sets of data retrieved from google images and selected military websites. Correlation-based Feature Selection (CFS) was used to improve recognition and reduce dimensionality, thereby speeding the classification process. With this method success rates recorded during the analysis include 93.8% for camouflage appearance category, 100%, 90% and 100% rates of plain cap and camouflage cap categories for Army, Air Force and Navy categories, respectively. Accurate recognition was recorded using SURF for the plain cap badge category. Substantial analysis has been carried out and results prove that the proposed method can correctly classify military personnel into various arms of service. We show that the proposed method can be integrated into a face recognition system, which will recognise personnel in addition to determining the arm of service which the personnel belong. Such a system can be used to enhance the security of a military base or facility.
The Extraction of Terrace in the Loess Plateau Based on radial method
NASA Astrophysics Data System (ADS)
Liu, W.; Li, F.
2016-12-01
The terrace of Loess Plateau, as a typical kind of artificial landform and an important measure of soil and water conservation, its positioning and automatic extraction will simplify the work of land use investigation. The existing methods of terrace extraction mainly include visual interpretation and automatic extraction. The manual method is used in land use investigation, but it is time-consuming and laborious. Researchers put forward some automatic extraction methods. For example, Fourier transform method can recognize terrace and find accurate position from frequency domain image, but it is more affected by the linear objects in the same direction of terrace; Texture analysis method is simple and have a wide range application of image processing. The disadvantage of texture analysis method is unable to recognize terraces' edge; Object-oriented is a new method of image classification, but when introduce it to terrace extracting, fracture polygons will be the most serious problem and it is difficult to explain its geological meaning. In order to positioning the terraces, we use high- resolution remote sensing image to extract and analyze the gray value of the pixels which the radial went through. During the recognition process, we firstly use the DEM data analysis or by manual selecting, to roughly confirm the position of peak points; secondly, take each of the peak points as the center to make radials in all directions; finally, extracting the gray values of the pixels which the radials went through, and analyzing its changing characteristics to confirm whether the terrace exists. For the purpose of getting accurate position of terrace, terraces' discontinuity, extension direction, ridge width, image processing algorithm, remote sensing image illumination and other influence factors were fully considered when designing the algorithms.
NASA Astrophysics Data System (ADS)
Lemoff, Brian E.; Martin, Robert B.; Sluch, Mikhail; Kafka, Kristopher M.; McCormick, William; Ice, Robert
2013-06-01
The capability to positively and covertly identify people at a safe distance, 24-hours per day, could provide a valuable advantage in protecting installations, both domestically and in an asymmetric warfare environment. This capability would enable installation security officers to identify known bad actors from a safe distance, even if they are approaching under cover of darkness. We will describe an active-SWIR imaging system being developed to automatically detect, track, and identify people at long range using computer face recognition. The system illuminates the target with an eye-safe and invisible SWIR laser beam, to provide consistent high-resolution imagery night and day. SWIR facial imagery produced by the system is matched against a watch-list of mug shots using computer face recognition algorithms. The current system relies on an operator to point the camera and to review and interpret the face recognition results. Automation software is being developed that will allow the system to be cued to a location by an external system, automatically detect a person, track the person as they move, zoom in on the face, select good facial images, and process the face recognition results, producing alarms and sharing data with other systems when people are detected and identified. Progress on the automation of this system will be presented along with experimental night-time face recognition results at distance.
Track-based event recognition in a realistic crowded environment
NASA Astrophysics Data System (ADS)
van Huis, Jasper R.; Bouma, Henri; Baan, Jan; Burghouts, Gertjan J.; Eendebak, Pieter T.; den Hollander, Richard J. M.; Dijk, Judith; van Rest, Jeroen H.
2014-10-01
Automatic detection of abnormal behavior in CCTV cameras is important to improve the security in crowded environments, such as shopping malls, airports and railway stations. This behavior can be characterized at different time scales, e.g., by small-scale subtle and obvious actions or by large-scale walking patterns and interactions between people. For example, pickpocketing can be recognized by the actual snatch (small scale), when he follows the victim, or when he interacts with an accomplice before and after the incident (longer time scale). This paper focusses on event recognition by detecting large-scale track-based patterns. Our event recognition method consists of several steps: pedestrian detection, object tracking, track-based feature computation and rule-based event classification. In the experiment, we focused on single track actions (walk, run, loiter, stop, turn) and track interactions (pass, meet, merge, split). The experiment includes a controlled setup, where 10 actors perform these actions. The method is also applied to all tracks that are generated in a crowded shopping mall in a selected time frame. The results show that most of the actions can be detected reliably (on average 90%) at a low false positive rate (1.1%), and that the interactions obtain lower detection rates (70% at 0.3% FP). This method may become one of the components that assists operators to find threatening behavior and enrich the selection of videos that are to be observed.
NASA Astrophysics Data System (ADS)
Harit, Aditya; Joshi, J. C., Col; Gupta, K. K.
2018-03-01
The paper proposed an automatic facial emotion recognition algorithm which comprises of two main components: feature extraction and expression recognition. The algorithm uses a Gabor filter bank on fiducial points to find the facial expression features. The resulting magnitudes of Gabor transforms, along with 14 chosen FAPs (Facial Animation Parameters), compose the feature space. There are two stages: the training phase and the recognition phase. Firstly, for the present 6 different emotions, the system classifies all training expressions in 6 different classes (one for each emotion) in the training stage. In the recognition phase, it recognizes the emotion by applying the Gabor bank to a face image, then finds the fiducial points, and then feeds it to the trained neural architecture.
Foreign Language Analysis and Recognition (FLARE) Progress
2015-02-01
Copies may be obtained from the Defense Technical Information Center (DTIC) (http://www.dtic.mil). AFRL- RH -WP-TR-2015-0007 HAS BEEN REVIEWED AND IS... retrieval (IR). 15. SUBJECT TERMS Automatic speech recognition (ASR), information retrieval (IR). 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF...to the Haystack Multilingual Multimedia Information Extraction and Retrieval (MMIER) system that was initially developed under a prior work unit
Tuning time-frequency methods for the detection of metered HF speech
NASA Astrophysics Data System (ADS)
Nelson, Douglas J.; Smith, Lawrence H.
2002-12-01
Speech is metered if the stresses occur at a nearly regular rate. Metered speech is common in poetry, and it can occur naturally in speech, if the speaker is spelling a word or reciting words or numbers from a list. In radio communications, the CQ request, call sign and other codes are frequently metered. In tactical communications and air traffic control, location, heading and identification codes may be metered. Moreover metering may be expected to survive even in HF communications, which are corrupted by noise, interference and mistuning. For this environment, speech recognition and conventional machine-based methods are not effective. We describe Time-Frequency methods which have been adapted successfully to the problem of mitigation of HF signal conditions and detection of metered speech. These methods are based on modeled time and frequency correlation properties of nearly harmonic functions. We derive these properties and demonstrate a performance gain over conventional correlation and spectral methods. Finally, in addressing the problem of HF single sideband (SSB) communications, the problems of carrier mistuning, interfering signals, such as manual Morse, and fast automatic gain control (AGC) must be addressed. We demonstrate simple methods which may be used to blindly mitigate mistuning and narrowband interference, and effectively invert the fast automatic gain function.
SU-F-T-20: Novel Catheter Lumen Recognition Algorithm for Rapid Digitization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dise, J; McDonald, D; Ashenafi, M
Purpose: Manual catheter recognition remains a time-consuming aspect of high-dose-rate brachytherapy (HDR) treatment planning. In this work, a novel catheter lumen recognition algorithm was created for accurate and rapid digitization. Methods: MatLab v8.5 was used to create the catheter recognition algorithm. Initially, the algorithm searches the patient CT dataset using an intensity based k-means filter designed to locate catheters. Once the catheters have been located, seed points are manually selected to initialize digitization of each catheter. From each seed point, the algorithm searches locally in order to automatically digitize the remaining catheter. This digitization is accomplished by finding pixels withmore » similar image curvature and divergence parameters compared to the seed pixel. Newly digitized pixels are treated as new seed positions, and hessian image analysis is used to direct the algorithm toward neighboring catheter pixels, and to make the algorithm insensitive to adjacent catheters that are unresolvable on CT, air pockets, and high Z artifacts. The algorithm was tested using 11 HDR treatment plans, including the Syed template, tandem and ovoid applicator, and multi-catheter lung brachytherapy. Digitization error was calculated by comparing manually determined catheter positions to those determined by the algorithm. Results: he digitization error was 0.23 mm ± 0.14 mm axially and 0.62 mm ± 0.13 mm longitudinally at the tip. The time of digitization, following initial seed placement was less than 1 second per catheter. The maximum total time required to digitize all tested applicators was 4 minutes (Syed template with 15 needles). Conclusion: This algorithm successfully digitizes HDR catheters for a variety of applicators with or without CT markers. The minimal axial error demonstrates the accuracy of the algorithm, and its insensitivity to image artifacts and challenging catheter positioning. Future work to automatically place initial seed positions would improve the algorithm speed.« less
NASA Astrophysics Data System (ADS)
Zhai, Xiaojun; Bensaali, Faycal; Sotudeh, Reza
2013-01-01
Number plate (NP) binarization and adjustment are important preprocessing stages in automatic number plate recognition (ANPR) systems and are used to link the number plate localization (NPL) and character segmentation stages. Successfully linking these two stages will improve the performance of the entire ANPR system. We present two optimized low-complexity NP binarization and adjustment algorithms. Efficient area/speed architectures based on the proposed algorithms are also presented and have been successfully implemented and tested using the Mentor Graphics RC240 FPGA development board, which together require only 9% of the available on-chip resources of a Virtex-4 FPGA, run with a maximum frequency of 95.8 MHz and are capable of processing one image in 0.07 to 0.17 ms.