Sample records for automatic recognition system

  1. Practical automatic Arabic license plate recognition system

    NASA Astrophysics Data System (ADS)

    Mohammad, Khader; Agaian, Sos; Saleh, Hani

    2011-02-01

    Since 1970's, the need of an automatic license plate recognition system, sometimes referred as Automatic License Plate Recognition system, has been increasing. A license plate recognition system is an automatic system that is able to recognize a license plate number, extracted from image sensors. In specific, Automatic License Plate Recognition systems are being used in conjunction with various transportation systems in application areas such as law enforcement (e.g. speed limit enforcement) and commercial usages such as parking enforcement and automatic toll payment private and public entrances, border control, theft and vandalism control. Vehicle license plate recognition has been intensively studied in many countries. Due to the different types of license plates being used, the requirement of an automatic license plate recognition system is different for each country. [License plate detection using cluster run length smoothing algorithm ].Generally, an automatic license plate localization and recognition system is made up of three modules; license plate localization, character segmentation and optical character recognition modules. This paper presents an Arabic license plate recognition system that is insensitive to character size, font, shape and orientation with extremely high accuracy rate. The proposed system is based on a combination of enhancement, license plate localization, morphological processing, and feature vector extraction using the Haar transform. The performance of the system is fast due to classification of alphabet and numerals based on the license plate organization. Experimental results for license plates of two different Arab countries show an average of 99 % successful license plate localization and recognition in a total of more than 20 different images captured from a complex outdoor environment. The results run times takes less time compared to conventional and many states of art methods.

  2. Military applications of automatic speech recognition and future requirements

    NASA Technical Reports Server (NTRS)

    Beek, Bruno; Cupples, Edward J.

    1977-01-01

    An updated summary of the state-of-the-art of automatic speech recognition and its relevance to military applications is provided. A number of potential systems for military applications are under development. These include: (1) digital narrowband communication systems; (2) automatic speech verification; (3) on-line cartographic processing unit; (4) word recognition for militarized tactical data system; and (5) voice recognition and synthesis for aircraft cockpit.

  3. Support vector machine for automatic pain recognition

    NASA Astrophysics Data System (ADS)

    Monwar, Md Maruf; Rezaei, Siamak

    2009-02-01

    Facial expressions are a key index of emotion and the interpretation of such expressions of emotion is critical to everyday social functioning. In this paper, we present an efficient video analysis technique for recognition of a specific expression, pain, from human faces. We employ an automatic face detector which detects face from the stored video frame using skin color modeling technique. For pain recognition, location and shape features of the detected faces are computed. These features are then used as inputs to a support vector machine (SVM) for classification. We compare the results with neural network based and eigenimage based automatic pain recognition systems. The experiment results indicate that using support vector machine as classifier can certainly improve the performance of automatic pain recognition system.

  4. Automatic speech recognition technology development at ITT Defense Communications Division

    NASA Technical Reports Server (NTRS)

    White, George M.

    1977-01-01

    An assessment of the applications of automatic speech recognition to defense communication systems is presented. Future research efforts include investigations into the following areas: (1) dynamic programming; (2) recognition of speech degraded by noise; (3) speaker independent recognition; (4) large vocabulary recognition; (5) word spotting and continuous speech recognition; and (6) isolated word recognition.

  5. A Limited-Vocabulary, Multi-Speaker Automatic Isolated Word Recognition System.

    ERIC Educational Resources Information Center

    Paul, James E., Jr.

    Techniques for automatic recognition of isolated words are investigated, and a computer simulation of a word recognition system is effected. Considered in detail are data acquisition and digitizing, word detection, amplitude and time normalization, short-time spectral estimation including spectral windowing, spectral envelope approximation,…

  6. Speaker-Machine Interaction in Automatic Speech Recognition. Technical Report.

    ERIC Educational Resources Information Center

    Makhoul, John I.

    The feasibility and limitations of speaker adaptation in improving the performance of a "fixed" (speaker-independent) automatic speech recognition system were examined. A fixed vocabulary of 55 syllables is used in the recognition system which contains 11 stops and fricatives and five tense vowels. The results of an experiment on speaker…

  7. Using Automatic Speech Recognition to Dictate Mathematical Expressions: The Development of the "TalkMaths" Application at Kingston University

    ERIC Educational Resources Information Center

    Wigmore, Angela; Hunter, Gordon; Pflugel, Eckhard; Denholm-Price, James; Binelli, Vincent

    2009-01-01

    Speech technology--especially automatic speech recognition--has now advanced to a level where it can be of great benefit both to able-bodied people and those with various disabilities. In this paper we describe an application "TalkMaths" which, using the output from a commonly-used conventional automatic speech recognition system,…

  8. Concept Recognition in an Automatic Text-Processing System for the Life Sciences.

    ERIC Educational Resources Information Center

    Vleduts-Stokolov, Natasha

    1987-01-01

    Describes a system developed for the automatic recognition of biological concepts in titles of scientific articles; reports results of several pilot experiments which tested the system's performance; analyzes typical ambiguity problems encountered by the system; describes a disambiguation technique that was developed; and discusses future plans…

  9. Container-code recognition system based on computer vision and deep neural networks

    NASA Astrophysics Data System (ADS)

    Liu, Yi; Li, Tianjian; Jiang, Li; Liang, Xiaoyao

    2018-04-01

    Automatic container-code recognition system becomes a crucial requirement for ship transportation industry in recent years. In this paper, an automatic container-code recognition system based on computer vision and deep neural networks is proposed. The system consists of two modules, detection module and recognition module. The detection module applies both algorithms based on computer vision and neural networks, and generates a better detection result through combination to avoid the drawbacks of the two methods. The combined detection results are also collected for online training of the neural networks. The recognition module exploits both character segmentation and end-to-end recognition, and outputs the recognition result which passes the verification. When the recognition module generates false recognition, the result will be corrected and collected for online training of the end-to-end recognition sub-module. By combining several algorithms, the system is able to deal with more situations, and the online training mechanism can improve the performance of the neural networks at runtime. The proposed system is able to achieve 93% of overall recognition accuracy.

  10. Automatic speech recognition in air traffic control

    NASA Technical Reports Server (NTRS)

    Karlsson, Joakim

    1990-01-01

    Automatic Speech Recognition (ASR) technology and its application to the Air Traffic Control system are described. The advantages of applying ASR to Air Traffic Control, as well as criteria for choosing a suitable ASR system are presented. Results from previous research and directions for future work at the Flight Transportation Laboratory are outlined.

  11. Evaluating Automatic Speech Recognition-Based Language Learning Systems: A Case Study

    ERIC Educational Resources Information Center

    van Doremalen, Joost; Boves, Lou; Colpaert, Jozef; Cucchiarini, Catia; Strik, Helmer

    2016-01-01

    The purpose of this research was to evaluate a prototype of an automatic speech recognition (ASR)-based language learning system that provides feedback on different aspects of speaking performance (pronunciation, morphology and syntax) to students of Dutch as a second language. We carried out usability reviews, expert reviews and user tests to…

  12. Leveraging Automatic Speech Recognition Errors to Detect Challenging Speech Segments in TED Talks

    ERIC Educational Resources Information Center

    Mirzaei, Maryam Sadat; Meshgi, Kourosh; Kawahara, Tatsuya

    2016-01-01

    This study investigates the use of Automatic Speech Recognition (ASR) systems to epitomize second language (L2) listeners' problems in perception of TED talks. ASR-generated transcripts of videos often involve recognition errors, which may indicate difficult segments for L2 listeners. This paper aims to discover the root-causes of the ASR errors…

  13. Fuzzy Logic-Based Audio Pattern Recognition

    NASA Astrophysics Data System (ADS)

    Malcangi, M.

    2008-11-01

    Audio and audio-pattern recognition is becoming one of the most important technologies to automatically control embedded systems. Fuzzy logic may be the most important enabling methodology due to its ability to rapidly and economically model such application. An audio and audio-pattern recognition engine based on fuzzy logic has been developed for use in very low-cost and deeply embedded systems to automate human-to-machine and machine-to-machine interaction. This engine consists of simple digital signal-processing algorithms for feature extraction and normalization, and a set of pattern-recognition rules manually tuned or automatically tuned by a self-learning process.

  14. Automatic Speech Recognition Predicts Speech Intelligibility and Comprehension for Listeners with Simulated Age-Related Hearing Loss

    ERIC Educational Resources Information Center

    Fontan, Lionel; Ferrané, Isabelle; Farinas, Jérôme; Pinquier, Julien; Tardieu, Julien; Magnen, Cynthia; Gaillard, Pascal; Aumont, Xavier; Füllgrabe, Christian

    2017-01-01

    Purpose: The purpose of this article is to assess speech processing for listeners with simulated age-related hearing loss (ARHL) and to investigate whether the observed performance can be replicated using an automatic speech recognition (ASR) system. The long-term goal of this research is to develop a system that will assist…

  15. Computer Recognition of Facial Profiles

    DTIC Science & Technology

    1974-08-01

    facial recognition 20. ABSTRACT (Continue on reverse side It necessary and Identify by block number) A system for the recognition of human faces from...21 2.6 Classification Algorithms ........... ... 32 III FACIAL RECOGNITION AND AUTOMATIC TRAINING . . . 37 3.1 Facial Profile Recognition...provide a fair test of the classification system. The work of Goldstein, Harmon, and Lesk [81 indicates, however, that for facial recognition , a ten class

  16. Four-Channel Biosignal Analysis and Feature Extraction for Automatic Emotion Recognition

    NASA Astrophysics Data System (ADS)

    Kim, Jonghwa; André, Elisabeth

    This paper investigates the potential of physiological signals as a reliable channel for automatic recognition of user's emotial state. For the emotion recognition, little attention has been paid so far to physiological signals compared to audio-visual emotion channels such as facial expression or speech. All essential stages of automatic recognition system using biosignals are discussed, from recording physiological dataset up to feature-based multiclass classification. Four-channel biosensors are used to measure electromyogram, electrocardiogram, skin conductivity and respiration changes. A wide range of physiological features from various analysis domains, including time/frequency, entropy, geometric analysis, subband spectra, multiscale entropy, etc., is proposed in order to search the best emotion-relevant features and to correlate them with emotional states. The best features extracted are specified in detail and their effectiveness is proven by emotion recognition results.

  17. Automatic image database generation from CAD for 3D object recognition

    NASA Astrophysics Data System (ADS)

    Sardana, Harish K.; Daemi, Mohammad F.; Ibrahim, Mohammad K.

    1993-06-01

    The development and evaluation of Multiple-View 3-D object recognition systems is based on a large set of model images. Due to the various advantages of using CAD, it is becoming more and more practical to use existing CAD data in computer vision systems. Current PC- level CAD systems are capable of providing physical image modelling and rendering involving positional variations in cameras, light sources etc. We have formulated a modular scheme for automatic generation of various aspects (views) of the objects in a model based 3-D object recognition system. These views are generated at desired orientations on the unit Gaussian sphere. With a suitable network file sharing system (NFS), the images can directly be stored on a database located on a file server. This paper presents the image modelling solutions using CAD in relation to multiple-view approach. Our modular scheme for data conversion and automatic image database storage for such a system is discussed. We have used this approach in 3-D polyhedron recognition. An overview of the results, advantages and limitations of using CAD data and conclusions using such as scheme are also presented.

  18. Unvoiced Speech Recognition Using Tissue-Conductive Acoustic Sensor

    NASA Astrophysics Data System (ADS)

    Heracleous, Panikos; Kaino, Tomomi; Saruwatari, Hiroshi; Shikano, Kiyohiro

    2006-12-01

    We present the use of stethoscope and silicon NAM (nonaudible murmur) microphones in automatic speech recognition. NAM microphones are special acoustic sensors, which are attached behind the talker's ear and can capture not only normal (audible) speech, but also very quietly uttered speech (nonaudible murmur). As a result, NAM microphones can be applied in automatic speech recognition systems when privacy is desired in human-machine communication. Moreover, NAM microphones show robustness against noise and they might be used in special systems (speech recognition, speech transform, etc.) for sound-impaired people. Using adaptation techniques and a small amount of training data, we achieved for a 20 k dictation task a[InlineEquation not available: see fulltext.] word accuracy for nonaudible murmur recognition in a clean environment. In this paper, we also investigate nonaudible murmur recognition in noisy environments and the effect of the Lombard reflex on nonaudible murmur recognition. We also propose three methods to integrate audible speech and nonaudible murmur recognition using a stethoscope NAM microphone with very promising results.

  19. Cost/benefit analysis of electronic license plates

    DOT National Transportation Integrated Search

    2008-06-01

    The objective of this report is to determine whether electronic vehicle recognition systems (EVR) or automatic license plate recognition systems (ALPR) would be beneficial to the Arizona Department of Transportation (AZDOT). EVR uses radio frequency ...

  20. Automatic recognition and analysis of synapses. [in brain tissue

    NASA Technical Reports Server (NTRS)

    Ungerleider, J. A.; Ledley, R. S.; Bloom, F. E.

    1976-01-01

    An automatic system for recognizing synaptic junctions would allow analysis of large samples of tissue for the possible classification of specific well-defined sets of synapses based upon structural morphometric indices. In this paper the three steps of our system are described: (1) cytochemical tissue preparation to allow easy recognition of the synaptic junctions; (2) transmitting the tissue information to a computer; and (3) analyzing each field to recognize the synapses and make measurements on them.

  1. Automatic violence detection in digital movies

    NASA Astrophysics Data System (ADS)

    Fischer, Stephan

    1996-11-01

    Research on computer-based recognition of violence is scant. We are working on the automatic recognition of violence in digital movies, a first step towards the goal of a computer- assisted system capable of protecting children against TV programs containing a great deal of violence. In the video domain a collision detection and a model-mapping to locate human figures are run, while the creation and comparison of fingerprints to find certain events are run int he audio domain. This article centers on the recognition of fist- fights in the video domain and on the recognition of shots, explosions and cries in the audio domain.

  2. Hybrid neuro-fuzzy approach for automatic vehicle license plate recognition

    NASA Astrophysics Data System (ADS)

    Lee, Hsi-Chieh; Jong, Chung-Shi

    1998-03-01

    Most currently available vehicle identification systems use techniques such as R.F., microwave, or infrared to help identifying the vehicle. Transponders are usually installed in the vehicle in order to transmit the corresponding information to the sensory system. It is considered expensive to install a transponder in each vehicle and the malfunction of the transponder will result in the failure of the vehicle identification system. In this study, novel hybrid approach is proposed for automatic vehicle license plate recognition. A system prototype is built which can be used independently or cooperating with current vehicle identification system in identifying a vehicle. The prototype consists of four major modules including the module for license plate region identification, the module for character extraction from the license plate, the module for character recognition, and the module for the SimNet neuro-fuzzy system. To test the performance of the proposed system, three hundred and eighty vehicle image samples are taken by a digital camera. The license plate recognition success rate of the prototype is approximately 91% while the character recognition success rate of the prototype is approximately 97%.

  3. Adaptive method of recognition of signals for one and two-frequency signal system in the telephony on the background of speech

    NASA Astrophysics Data System (ADS)

    Kuznetsov, Michael V.

    2006-05-01

    For reliable teamwork of various systems of automatic telecommunication including transferring systems of optical communication networks it is necessary authentic recognition of signals for one- or two-frequency service signal system. The analysis of time parameters of an accepted signal allows increasing reliability of detection and recognition of the service signal system on a background of speech.

  4. New technique for real-time distortion-invariant multiobject recognition and classification

    NASA Astrophysics Data System (ADS)

    Hong, Rutong; Li, Xiaoshun; Hong, En; Wang, Zuyi; Wei, Hongan

    2001-04-01

    A real-time hybrid distortion-invariant OPR system was established to make 3D multiobject distortion-invariant automatic pattern recognition. Wavelet transform technique was used to make digital preprocessing of the input scene, to depress the noisy background and enhance the recognized object. A three-layer backpropagation artificial neural network was used in correlation signal post-processing to perform multiobject distortion-invariant recognition and classification. The C-80 and NOA real-time processing ability and the multithread programming technology were used to perform high speed parallel multitask processing and speed up the post processing rate to ROIs. The reference filter library was constructed for the distortion version of 3D object model images based on the distortion parameter tolerance measuring as rotation, azimuth and scale. The real-time optical correlation recognition testing of this OPR system demonstrates that using the preprocessing, post- processing, the nonlinear algorithm os optimum filtering, RFL construction technique and the multithread programming technology, a high possibility of recognition and recognition rate ere obtained for the real-time multiobject distortion-invariant OPR system. The recognition reliability and rate was improved greatly. These techniques are very useful to automatic target recognition.

  5. Automated Field-of-View, Illumination, and Recognition Algorithm Design of a Vision System for Pick-and-Place Considering Colour Information in Illumination and Images

    PubMed Central

    Chen, Yibing; Ogata, Taiki; Ueyama, Tsuyoshi; Takada, Toshiyuki; Ota, Jun

    2018-01-01

    Machine vision is playing an increasingly important role in industrial applications, and the automated design of image recognition systems has been a subject of intense research. This study has proposed a system for automatically designing the field-of-view (FOV) of a camera, the illumination strength and the parameters in a recognition algorithm. We formulated the design problem as an optimisation problem and used an experiment based on a hierarchical algorithm to solve it. The evaluation experiments using translucent plastics objects showed that the use of the proposed system resulted in an effective solution with a wide FOV, recognition of all objects and 0.32 mm and 0.4° maximal positional and angular errors when all the RGB (red, green and blue) for illumination and R channel image for recognition were used. Though all the RGB illumination and grey scale images also provided recognition of all the objects, only a narrow FOV was selected. Moreover, full recognition was not achieved by using only G illumination and a grey-scale image. The results showed that the proposed method can automatically design the FOV, illumination and parameters in the recognition algorithm and that tuning all the RGB illumination is desirable even when single-channel or grey-scale images are used for recognition. PMID:29786665

  6. Automated Field-of-View, Illumination, and Recognition Algorithm Design of a Vision System for Pick-and-Place Considering Colour Information in Illumination and Images.

    PubMed

    Chen, Yibing; Ogata, Taiki; Ueyama, Tsuyoshi; Takada, Toshiyuki; Ota, Jun

    2018-05-22

    Machine vision is playing an increasingly important role in industrial applications, and the automated design of image recognition systems has been a subject of intense research. This study has proposed a system for automatically designing the field-of-view (FOV) of a camera, the illumination strength and the parameters in a recognition algorithm. We formulated the design problem as an optimisation problem and used an experiment based on a hierarchical algorithm to solve it. The evaluation experiments using translucent plastics objects showed that the use of the proposed system resulted in an effective solution with a wide FOV, recognition of all objects and 0.32 mm and 0.4° maximal positional and angular errors when all the RGB (red, green and blue) for illumination and R channel image for recognition were used. Though all the RGB illumination and grey scale images also provided recognition of all the objects, only a narrow FOV was selected. Moreover, full recognition was not achieved by using only G illumination and a grey-scale image. The results showed that the proposed method can automatically design the FOV, illumination and parameters in the recognition algorithm and that tuning all the RGB illumination is desirable even when single-channel or grey-scale images are used for recognition.

  7. iFER: facial expression recognition using automatically selected geometric eye and eyebrow features

    NASA Astrophysics Data System (ADS)

    Oztel, Ismail; Yolcu, Gozde; Oz, Cemil; Kazan, Serap; Bunyak, Filiz

    2018-03-01

    Facial expressions have an important role in interpersonal communications and estimation of emotional states or intentions. Automatic recognition of facial expressions has led to many practical applications and became one of the important topics in computer vision. We present a facial expression recognition system that relies on geometry-based features extracted from eye and eyebrow regions of the face. The proposed system detects keypoints on frontal face images and forms a feature set using geometric relationships among groups of detected keypoints. Obtained feature set is refined and reduced using the sequential forward selection (SFS) algorithm and fed to a support vector machine classifier to recognize five facial expression classes. The proposed system, iFER (eye-eyebrow only facial expression recognition), is robust to lower face occlusions that may be caused by beards, mustaches, scarves, etc. and lower face motion during speech production. Preliminary experiments on benchmark datasets produced promising results outperforming previous facial expression recognition studies using partial face features, and comparable results to studies using whole face information, only slightly lower by ˜ 2.5 % compared to the best whole face facial recognition system while using only ˜ 1 / 3 of the facial region.

  8. Face averages enhance user recognition for smartphone security.

    PubMed

    Robertson, David J; Kramer, Robin S S; Burton, A Mike

    2015-01-01

    Our recognition of familiar faces is excellent, and generalises across viewing conditions. However, unfamiliar face recognition is much poorer. For this reason, automatic face recognition systems might benefit from incorporating the advantages of familiarity. Here we put this to the test using the face verification system available on a popular smartphone (the Samsung Galaxy). In two experiments we tested the recognition performance of the smartphone when it was encoded with an individual's 'face-average'--a representation derived from theories of human face perception. This technique significantly improved performance for both unconstrained celebrity images (Experiment 1) and for real faces (Experiment 2): users could unlock their phones more reliably when the device stored an average of the user's face than when they stored a single image. This advantage was consistent across a wide variety of everyday viewing conditions. Furthermore, the benefit did not reduce the rejection of imposter faces. This benefit is brought about solely by consideration of suitable representations for automatic face recognition, and we argue that this is just as important as development of matching algorithms themselves. We propose that this representation could significantly improve recognition rates in everyday settings.

  9. Automatic face recognition in HDR imaging

    NASA Astrophysics Data System (ADS)

    Pereira, Manuela; Moreno, Juan-Carlos; Proença, Hugo; Pinheiro, António M. G.

    2014-05-01

    The gaining popularity of the new High Dynamic Range (HDR) imaging systems is raising new privacy issues caused by the methods used for visualization. HDR images require tone mapping methods for an appropriate visualization on conventional and non-expensive LDR displays. These visualization methods might result in completely different visualization raising several issues on privacy intrusion. In fact, some visualization methods result in a perceptual recognition of the individuals, while others do not even show any identity. Although perceptual recognition might be possible, a natural question that can rise is how computer based recognition will perform using tone mapping generated images? In this paper, a study where automatic face recognition using sparse representation is tested with images that result from common tone mapping operators applied to HDR images. Its ability for the face identity recognition is described. Furthermore, typical LDR images are used for the face recognition training.

  10. Object Occlusion Detection Using Automatic Camera Calibration for a Wide-Area Video Surveillance System

    PubMed Central

    Jung, Jaehoon; Yoon, Inhye; Paik, Joonki

    2016-01-01

    This paper presents an object occlusion detection algorithm using object depth information that is estimated by automatic camera calibration. The object occlusion problem is a major factor to degrade the performance of object tracking and recognition. To detect an object occlusion, the proposed algorithm consists of three steps: (i) automatic camera calibration using both moving objects and a background structure; (ii) object depth estimation; and (iii) detection of occluded regions. The proposed algorithm estimates the depth of the object without extra sensors but with a generic red, green and blue (RGB) camera. As a result, the proposed algorithm can be applied to improve the performance of object tracking and object recognition algorithms for video surveillance systems. PMID:27347978

  11. Automatic Mexican sign language and digits recognition using normalized central moments

    NASA Astrophysics Data System (ADS)

    Solís, Francisco; Martínez, David; Espinosa, Oscar; Toxqui, Carina

    2016-09-01

    This work presents a framework for automatic Mexican sign language and digits recognition based on computer vision system using normalized central moments and artificial neural networks. Images are captured by digital IP camera, four LED reflectors and a green background in order to reduce computational costs and prevent the use of special gloves. 42 normalized central moments are computed per frame and used in a Multi-Layer Perceptron to recognize each database. Four versions per sign and digit were used in training phase. 93% and 95% of recognition rates were achieved for Mexican sign language and digits respectively.

  12. Key features for ATA / ATR database design in missile systems

    NASA Astrophysics Data System (ADS)

    Özertem, Kemal Arda

    2017-05-01

    Automatic target acquisition (ATA) and automatic target recognition (ATR) are two vital tasks for missile systems, and having a robust detection and recognition algorithm is crucial for overall system performance. In order to have a robust target detection and recognition algorithm, an extensive image database is required. Automatic target recognition algorithms use the database of images in training and testing steps of algorithm. This directly affects the recognition performance, since the training accuracy is driven by the quality of the image database. In addition, the performance of an automatic target detection algorithm can be measured effectively by using an image database. There are two main ways for designing an ATA / ATR database. The first and easy way is by using a scene generator. A scene generator can model the objects by considering its material information, the atmospheric conditions, detector type and the territory. Designing image database by using a scene generator is inexpensive and it allows creating many different scenarios quickly and easily. However the major drawback of using a scene generator is its low fidelity, since the images are created virtually. The second and difficult way is designing it using real-world images. Designing image database with real-world images is a lot more costly and time consuming; however it offers high fidelity, which is critical for missile algorithms. In this paper, critical concepts in ATA / ATR database design with real-world images are discussed. Each concept is discussed in the perspective of ATA and ATR separately. For the implementation stage, some possible solutions and trade-offs for creating the database are proposed, and all proposed approaches are compared to each other with regards to their pros and cons.

  13. Personalization algorithm for real-time activity recognition using PDA, wireless motion bands, and binary decision tree.

    PubMed

    Pärkkä, Juha; Cluitmans, Luc; Ermes, Miikka

    2010-09-01

    Inactive and sedentary lifestyle is a major problem in many industrialized countries today. Automatic recognition of type of physical activity can be used to show the user the distribution of his daily activities and to motivate him into more active lifestyle. In this study, an automatic activity-recognition system consisting of wireless motion bands and a PDA is evaluated. The system classifies raw sensor data into activity types online. It uses a decision tree classifier, which has low computational cost and low battery consumption. The classifier parameters can be personalized online by performing a short bout of an activity and by telling the system which activity is being performed. Data were collected with seven volunteers during five everyday activities: lying, sitting/standing, walking, running, and cycling. The online system can detect these activities with overall 86.6% accuracy and with 94.0% accuracy after classifier personalization.

  14. Automatic forensic face recognition from digital images.

    PubMed

    Peacock, C; Goode, A; Brett, A

    2004-01-01

    Digital image evidence is now widely available from criminal investigations and surveillance operations, often captured by security and surveillance CCTV. This has resulted in a growing demand from law enforcement agencies for automatic person-recognition based on image data. In forensic science, a fundamental requirement for such automatic face recognition is to evaluate the weight that can justifiably be attached to this recognition evidence in a scientific framework. This paper describes a pilot study carried out by the Forensic Science Service (UK) which explores the use of digital facial images in forensic investigation. For the purpose of the experiment a specific software package was chosen (Image Metrics Optasia). The paper does not describe the techniques used by the software to reach its decision of probabilistic matches to facial images, but accepts the output of the software as though it were a 'black box'. In this way, the paper lays a foundation for how face recognition systems can be compared in a forensic framework. The aim of the paper is to explore how reliably and under what conditions digital facial images can be presented in evidence.

  15. Research and Development of Fully Automatic Alien Smoke Stack and Packaging System

    NASA Astrophysics Data System (ADS)

    Yang, Xudong; Ge, Qingkuan; Peng, Tao; Zuo, Ping; Dong, Weifu

    2017-12-01

    The problem of low efficiency of manual sorting packaging for the current tobacco distribution center, which developed a set of safe efficient and automatic type of alien smoke stack and packaging system. The functions of fully automatic alien smoke stack and packaging system adopt PLC control technology, servo control technology, robot technology, image recognition technology and human-computer interaction technology. The characteristics, principles, control process and key technology of the system are discussed in detail. Through the installation and commissioning fully automatic alien smoke stack and packaging system has a good performance and has completed the requirements for shaped cigarette.

  16. Speech Recognition as a Support Service for Deaf and Hard of Hearing Students: Adaptation and Evaluation. Final Report to Spencer Foundation.

    ERIC Educational Resources Information Center

    Stinson, Michael; Elliot, Lisa; McKee, Barbara; Coyne, Gina

    This report discusses a project that adapted new automatic speech recognition (ASR) technology to provide real-time speech-to-text transcription as a support service for students who are deaf and hard of hearing (D/HH). In this system, as the teacher speaks, a hearing intermediary, or captionist, dictates into the speech recognition system in a…

  17. Effect of speech-intrinsic variations on human and automatic recognition of spoken phonemes.

    PubMed

    Meyer, Bernd T; Brand, Thomas; Kollmeier, Birger

    2011-01-01

    The aim of this study is to quantify the gap between the recognition performance of human listeners and an automatic speech recognition (ASR) system with special focus on intrinsic variations of speech, such as speaking rate and effort, altered pitch, and the presence of dialect and accent. Second, it is investigated if the most common ASR features contain all information required to recognize speech in noisy environments by using resynthesized ASR features in listening experiments. For the phoneme recognition task, the ASR system achieved the human performance level only when the signal-to-noise ratio (SNR) was increased by 15 dB, which is an estimate for the human-machine gap in terms of the SNR. The major part of this gap is attributed to the feature extraction stage, since human listeners achieve comparable recognition scores when the SNR difference between unaltered and resynthesized utterances is 10 dB. Intrinsic variabilities result in strong increases of error rates, both in human speech recognition (HSR) and ASR (with a relative increase of up to 120%). An analysis of phoneme duration and recognition rates indicates that human listeners are better able to identify temporal cues than the machine at low SNRs, which suggests incorporating information about the temporal dynamics of speech into ASR systems.

  18. Computer-Aided Authoring System (AUTHOR) User's Guide. Volume I. Final Report.

    ERIC Educational Resources Information Center

    Guitard, Charles R.

    This user's guide for AUTHOR, an automatic authoring system which produces programmed texts for teaching symbol recognition, provides detailed instructions to help the user construct and enter the information needed to create the programmed text, run the AUTHOR program, and edit the automatically composed paper. Major sections describe steps in…

  19. Face Averages Enhance User Recognition for Smartphone Security

    PubMed Central

    Robertson, David J.; Kramer, Robin S. S.; Burton, A. Mike

    2015-01-01

    Our recognition of familiar faces is excellent, and generalises across viewing conditions. However, unfamiliar face recognition is much poorer. For this reason, automatic face recognition systems might benefit from incorporating the advantages of familiarity. Here we put this to the test using the face verification system available on a popular smartphone (the Samsung Galaxy). In two experiments we tested the recognition performance of the smartphone when it was encoded with an individual’s ‘face-average’ – a representation derived from theories of human face perception. This technique significantly improved performance for both unconstrained celebrity images (Experiment 1) and for real faces (Experiment 2): users could unlock their phones more reliably when the device stored an average of the user’s face than when they stored a single image. This advantage was consistent across a wide variety of everyday viewing conditions. Furthermore, the benefit did not reduce the rejection of imposter faces. This benefit is brought about solely by consideration of suitable representations for automatic face recognition, and we argue that this is just as important as development of matching algorithms themselves. We propose that this representation could significantly improve recognition rates in everyday settings. PMID:25807251

  20. Assessing the impact of graphical quality on automatic text recognition in digital maps

    NASA Astrophysics Data System (ADS)

    Chiang, Yao-Yi; Leyk, Stefan; Honarvar Nazari, Narges; Moghaddam, Sima; Tan, Tian Xiang

    2016-08-01

    Converting geographic features (e.g., place names) in map images into a vector format is the first step for incorporating cartographic information into a geographic information system (GIS). With the advancement in computational power and algorithm design, map processing systems have been considerably improved over the last decade. However, the fundamental map processing techniques such as color image segmentation, (map) layer separation, and object recognition are sensitive to minor variations in graphical properties of the input image (e.g., scanning resolution). As a result, most map processing results would not meet user expectations if the user does not "properly" scan the map of interest, pre-process the map image (e.g., using compression or not), and train the processing system, accordingly. These issues could slow down the further advancement of map processing techniques as such unsuccessful attempts create a discouraged user community, and less sophisticated tools would be perceived as more viable solutions. Thus, it is important to understand what kinds of maps are suitable for automatic map processing and what types of results and process-related errors can be expected. In this paper, we shed light on these questions by using a typical map processing task, text recognition, to discuss a number of map instances that vary in suitability for automatic processing. We also present an extensive experiment on a diverse set of scanned historical maps to provide measures of baseline performance of a standard text recognition tool under varying map conditions (graphical quality) and text representations (that can vary even within the same map sheet). Our experimental results help the user understand what to expect when a fully or semi-automatic map processing system is used to process a scanned map with certain (varying) graphical properties and complexities in map content.

  1. Error Rates in Users of Automatic Face Recognition Software

    PubMed Central

    White, David; Dunn, James D.; Schmid, Alexandra C.; Kemp, Richard I.

    2015-01-01

    In recent years, wide deployment of automatic face recognition systems has been accompanied by substantial gains in algorithm performance. However, benchmarking tests designed to evaluate these systems do not account for the errors of human operators, who are often an integral part of face recognition solutions in forensic and security settings. This causes a mismatch between evaluation tests and operational accuracy. We address this by measuring user performance in a face recognition system used to screen passport applications for identity fraud. Experiment 1 measured target detection accuracy in algorithm-generated ‘candidate lists’ selected from a large database of passport images. Accuracy was notably poorer than in previous studies of unfamiliar face matching: participants made over 50% errors for adult target faces, and over 60% when matching images of children. Experiment 2 then compared performance of student participants to trained passport officers–who use the system in their daily work–and found equivalent performance in these groups. Encouragingly, a group of highly trained and experienced “facial examiners” outperformed these groups by 20 percentage points. We conclude that human performance curtails accuracy of face recognition systems–potentially reducing benchmark estimates by 50% in operational settings. Mere practise does not attenuate these limits, but superior performance of trained examiners suggests that recruitment and selection of human operators, in combination with effective training and mentorship, can improve the operational accuracy of face recognition systems. PMID:26465631

  2. Automatic recognition of lactating sow behaviors through depth image processing

    USDA-ARS?s Scientific Manuscript database

    Manual observation and classification of animal behaviors is laborious, time-consuming, and of limited ability to process large amount of data. A computer vision-based system was developed that automatically recognizes sow behaviors (lying, sitting, standing, kneeling, feeding, drinking, and shiftin...

  3. An automatic target recognition system based on SAR image

    NASA Astrophysics Data System (ADS)

    Li, Qinfu; Wang, Jinquan; Zhao, Bo; Luo, Furen; Xu, Xiaojian

    2009-10-01

    In this paper, an automatic target recognition (ATR) system based on synthetic aperture radar (SAR) is proposed. This ATR system can play an important role in the simulation of up-to-data battlefield environment and be used in ATR research. To establish an integral and available system, the processing of SAR image was divided into four main stages which are de-noise, detection, cluster-discrimination and segment-recognition, respectively. The first three stages are used for searching region of interest (ROI). Once the ROIs are extracted, the recognition stage will be taken to compute the similarity between the ROIs and the templates in the electromagnetic simulation software National Electromagnetic Scattering Code (NESC). Due to the lack of the SAR raw data, the electromagnetic simulated images are added to the measured SAR background to simulate the battlefield environment8. The purpose of the system is to find the ROIs which can be the artificial military targets such as tanks, armored cars and so on and to categorize the ROIs into the right classes according to the existing templates. From the results we can see that the proposed system achieves a satisfactory result.

  4. Automated night/day standoff detection, tracking, and identification of personnel for installation protection

    NASA Astrophysics Data System (ADS)

    Lemoff, Brian E.; Martin, Robert B.; Sluch, Mikhail; Kafka, Kristopher M.; McCormick, William; Ice, Robert

    2013-06-01

    The capability to positively and covertly identify people at a safe distance, 24-hours per day, could provide a valuable advantage in protecting installations, both domestically and in an asymmetric warfare environment. This capability would enable installation security officers to identify known bad actors from a safe distance, even if they are approaching under cover of darkness. We will describe an active-SWIR imaging system being developed to automatically detect, track, and identify people at long range using computer face recognition. The system illuminates the target with an eye-safe and invisible SWIR laser beam, to provide consistent high-resolution imagery night and day. SWIR facial imagery produced by the system is matched against a watch-list of mug shots using computer face recognition algorithms. The current system relies on an operator to point the camera and to review and interpret the face recognition results. Automation software is being developed that will allow the system to be cued to a location by an external system, automatically detect a person, track the person as they move, zoom in on the face, select good facial images, and process the face recognition results, producing alarms and sharing data with other systems when people are detected and identified. Progress on the automation of this system will be presented along with experimental night-time face recognition results at distance.

  5. RecceMan: an interactive recognition assistance for image-based reconnaissance: synergistic effects of human perception and computational methods for object recognition, identification, and infrastructure analysis

    NASA Astrophysics Data System (ADS)

    El Bekri, Nadia; Angele, Susanne; Ruckhäberle, Martin; Peinsipp-Byma, Elisabeth; Haelke, Bruno

    2015-10-01

    This paper introduces an interactive recognition assistance system for imaging reconnaissance. This system supports aerial image analysts on missions during two main tasks: Object recognition and infrastructure analysis. Object recognition concentrates on the classification of one single object. Infrastructure analysis deals with the description of the components of an infrastructure and the recognition of the infrastructure type (e.g. military airfield). Based on satellite or aerial images, aerial image analysts are able to extract single object features and thereby recognize different object types. It is one of the most challenging tasks in the imaging reconnaissance. Currently, there are no high potential ATR (automatic target recognition) applications available, as consequence the human observer cannot be replaced entirely. State-of-the-art ATR applications cannot assume in equal measure human perception and interpretation. Why is this still such a critical issue? First, cluttered and noisy images make it difficult to automatically extract, classify and identify object types. Second, due to the changed warfare and the rise of asymmetric threats it is nearly impossible to create an underlying data set containing all features, objects or infrastructure types. Many other reasons like environmental parameters or aspect angles compound the application of ATR supplementary. Due to the lack of suitable ATR procedures, the human factor is still important and so far irreplaceable. In order to use the potential benefits of the human perception and computational methods in a synergistic way, both are unified in an interactive assistance system. RecceMan® (Reconnaissance Manual) offers two different modes for aerial image analysts on missions: the object recognition mode and the infrastructure analysis mode. The aim of the object recognition mode is to recognize a certain object type based on the object features that originated from the image signatures. The infrastructure analysis mode pursues the goal to analyze the function of the infrastructure. The image analyst extracts visually certain target object signatures, assigns them to corresponding object features and is finally able to recognize the object type. The system offers him the possibility to assign the image signatures to features given by sample images. The underlying data set contains a wide range of objects features and object types for different domains like ships or land vehicles. Each domain has its own feature tree developed by aerial image analyst experts. By selecting the corresponding features, the possible solution set of objects is automatically reduced and matches only the objects that contain the selected features. Moreover, we give an outlook of current research in the field of ground target analysis in which we deal with partly automated methods to extract image signatures and assign them to the corresponding features. This research includes methods for automatically determining the orientation of an object and geometric features like width and length of the object. This step enables to reduce automatically the possible object types offered to the image analyst by the interactive recognition assistance system.

  6. Early Detection of Severe Apnoea through Voice Analysis and Automatic Speaker Recognition Techniques

    NASA Astrophysics Data System (ADS)

    Fernández, Ruben; Blanco, Jose Luis; Díaz, David; Hernández, Luis A.; López, Eduardo; Alcázar, José

    This study is part of an on-going collaborative effort between the medical and the signal processing communities to promote research on applying voice analysis and Automatic Speaker Recognition techniques (ASR) for the automatic diagnosis of patients with severe obstructive sleep apnoea (OSA). Early detection of severe apnoea cases is important so that patients can receive early treatment. Effective ASR-based diagnosis could dramatically cut medical testing time. Working with a carefully designed speech database of healthy and apnoea subjects, we present and discuss the possibilities of using generative Gaussian Mixture Models (GMMs), generally used in ASR systems, to model distinctive apnoea voice characteristics (i.e. abnormal nasalization). Finally, we present experimental findings regarding the discriminative power of speaker recognition techniques applied to severe apnoea detection. We have achieved an 81.25 % correct classification rate, which is very promising and underpins the interest in this line of inquiry.

  7. Automatic Speech Recognition in Air Traffic Control: a Human Factors Perspective

    NASA Technical Reports Server (NTRS)

    Karlsson, Joakim

    1990-01-01

    The introduction of Automatic Speech Recognition (ASR) technology into the Air Traffic Control (ATC) system has the potential to improve overall safety and efficiency. However, because ASR technology is inherently a part of the man-machine interface between the user and the system, the human factors issues involved must be addressed. Here, some of the human factors problems are identified and related methods of investigation are presented. Research at M.I.T.'s Flight Transportation Laboratory is being conducted from a human factors perspective, focusing on intelligent parser design, presentation of feedback, error correction strategy design, and optimal choice of input modalities.

  8. Contour matching for a fish recognition and migration-monitoring system

    NASA Astrophysics Data System (ADS)

    Lee, Dah-Jye; Schoenberger, Robert B.; Shiozawa, Dennis; Xu, Xiaoqian; Zhan, Pengcheng

    2004-12-01

    Fish migration is being monitored year round to provide valuable information for the study of behavioral responses of fish to environmental variations. However, currently all monitoring is done by human observers. An automatic fish recognition and migration monitoring system is more efficient and can provide more accurate data. Such a system includes automatic fish image acquisition, contour extraction, fish categorization, and data storage. Shape is a very important characteristic and shape analysis and shape matching are studied for fish recognition. Previous work focused on finding critical landmark points on fish shape using curvature function analysis. Fish recognition based on landmark points has shown satisfying results. However, the main difficulty of this approach is that landmark points sometimes cannot be located very accurately. Whole shape matching is used for fish recognition in this paper. Several shape descriptors, such as Fourier descriptors, polygon approximation and line segments, are tested. A power cepstrum technique has been developed in order to improve the categorization speed using contours represented in tangent space with normalized length. Design and integration including image acquisition, contour extraction and fish categorization are discussed in this paper. Fish categorization results based on shape analysis and shape matching are also included.

  9. Advances in image compression and automatic target recognition; Proceedings of the Meeting, Orlando, FL, Mar. 30, 31, 1989

    NASA Technical Reports Server (NTRS)

    Tescher, Andrew G. (Editor)

    1989-01-01

    Various papers on image compression and automatic target recognition are presented. Individual topics addressed include: target cluster detection in cluttered SAR imagery, model-based target recognition using laser radar imagery, Smart Sensor front-end processor for feature extraction of images, object attitude estimation and tracking from a single video sensor, symmetry detection in human vision, analysis of high resolution aerial images for object detection, obscured object recognition for an ATR application, neural networks for adaptive shape tracking, statistical mechanics and pattern recognition, detection of cylinders in aerial range images, moving object tracking using local windows, new transform method for image data compression, quad-tree product vector quantization of images, predictive trellis encoding of imagery, reduced generalized chain code for contour description, compact architecture for a real-time vision system, use of human visibility functions in segmentation coding, color texture analysis and synthesis using Gibbs random fields.

  10. Human Activity Recognition in AAL Environments Using Random Projections.

    PubMed

    Damaševičius, Robertas; Vasiljevas, Mindaugas; Šalkevičius, Justas; Woźniak, Marcin

    2016-01-01

    Automatic human activity recognition systems aim to capture the state of the user and its environment by exploiting heterogeneous sensors attached to the subject's body and permit continuous monitoring of numerous physiological signals reflecting the state of human actions. Successful identification of human activities can be immensely useful in healthcare applications for Ambient Assisted Living (AAL), for automatic and intelligent activity monitoring systems developed for elderly and disabled people. In this paper, we propose the method for activity recognition and subject identification based on random projections from high-dimensional feature space to low-dimensional projection space, where the classes are separated using the Jaccard distance between probability density functions of projected data. Two HAR domain tasks are considered: activity identification and subject identification. The experimental results using the proposed method with Human Activity Dataset (HAD) data are presented.

  11. Human Activity Recognition in AAL Environments Using Random Projections

    PubMed Central

    Damaševičius, Robertas; Vasiljevas, Mindaugas; Šalkevičius, Justas; Woźniak, Marcin

    2016-01-01

    Automatic human activity recognition systems aim to capture the state of the user and its environment by exploiting heterogeneous sensors attached to the subject's body and permit continuous monitoring of numerous physiological signals reflecting the state of human actions. Successful identification of human activities can be immensely useful in healthcare applications for Ambient Assisted Living (AAL), for automatic and intelligent activity monitoring systems developed for elderly and disabled people. In this paper, we propose the method for activity recognition and subject identification based on random projections from high-dimensional feature space to low-dimensional projection space, where the classes are separated using the Jaccard distance between probability density functions of projected data. Two HAR domain tasks are considered: activity identification and subject identification. The experimental results using the proposed method with Human Activity Dataset (HAD) data are presented. PMID:27413392

  12. Automatic micropropagation of plants--the vision-system: graph rewriting as pattern recognition

    NASA Astrophysics Data System (ADS)

    Schwanke, Joerg; Megnet, Roland; Jensch, Peter F.

    1993-03-01

    The automation of plant-micropropagation is necessary to produce high amounts of biomass. Plants have to be dissected on particular cutting-points. A vision-system is needed for the recognition of the cutting-points on the plants. With this background, this contribution is directed to the underlying formalism to determine cutting-points on abstract-plant models. We show the usefulness of pattern recognition by graph-rewriting along with some examples in this context.

  13. Difficulties in Automatic Speech Recognition of Dysarthric Speakers and Implications for Speech-Based Applications Used by the Elderly: A Literature Review

    ERIC Educational Resources Information Center

    Young, Victoria; Mihailidis, Alex

    2010-01-01

    Despite their growing presence in home computer applications and various telephony services, commercial automatic speech recognition technologies are still not easily employed by everyone; especially individuals with speech disorders. In addition, relatively little research has been conducted on automatic speech recognition performance with older…

  14. A Compact Methodology to Understand, Evaluate, and Predict the Performance of Automatic Target Recognition

    PubMed Central

    Li, Yanpeng; Li, Xiang; Wang, Hongqiang; Chen, Yiping; Zhuang, Zhaowen; Cheng, Yongqiang; Deng, Bin; Wang, Liandong; Zeng, Yonghu; Gao, Lei

    2014-01-01

    This paper offers a compacted mechanism to carry out the performance evaluation work for an automatic target recognition (ATR) system: (a) a standard description of the ATR system's output is suggested, a quantity to indicate the operating condition is presented based on the principle of feature extraction in pattern recognition, and a series of indexes to assess the output in different aspects are developed with the application of statistics; (b) performance of the ATR system is interpreted by a quality factor based on knowledge of engineering mathematics; (c) through a novel utility called “context-probability” estimation proposed based on probability, performance prediction for an ATR system is realized. The simulation result shows that the performance of an ATR system can be accounted for and forecasted by the above-mentioned measures. Compared to existing technologies, the novel method can offer more objective performance conclusions for an ATR system. These conclusions may be helpful in knowing the practical capability of the tested ATR system. At the same time, the generalization performance of the proposed method is good. PMID:24967605

  15. A System for Mailpiece ZIP Code Assignment through Contextual Analysis. Phase 2

    DTIC Science & Technology

    1991-03-01

    Segmentation Address Block Interpretation Automatic Feature Generation Word Recognition Feature Detection Word Verification Optical Character Recognition Directory...in the Phase III effort. 1.1 Motivation The United States Postal Service (USPS) deploys large numbers of optical character recognition (OCR) machines...4):208-218, November 1986. [2] Gronmeyer, L. K., Ruffin, B. W., Lybanon, M. A., Neely, P. L., and Pierce, S. E. An Overview of Optical Character Recognition (OCR

  16. Speech recognition for embedded automatic positioner for laparoscope

    NASA Astrophysics Data System (ADS)

    Chen, Xiaodong; Yin, Qingyun; Wang, Yi; Yu, Daoyin

    2014-07-01

    In this paper a novel speech recognition methodology based on Hidden Markov Model (HMM) is proposed for embedded Automatic Positioner for Laparoscope (APL), which includes a fixed point ARM processor as the core. The APL system is designed to assist the doctor in laparoscopic surgery, by implementing the specific doctor's vocal control to the laparoscope. Real-time respond to the voice commands asks for more efficient speech recognition algorithm for the APL. In order to reduce computation cost without significant loss in recognition accuracy, both arithmetic and algorithmic optimizations are applied in the method presented. First, depending on arithmetic optimizations most, a fixed point frontend for speech feature analysis is built according to the ARM processor's character. Then the fast likelihood computation algorithm is used to reduce computational complexity of the HMM-based recognition algorithm. The experimental results show that, the method shortens the recognition time within 0.5s, while the accuracy higher than 99%, demonstrating its ability to achieve real-time vocal control to the APL.

  17. Automatic recognition of ship types from infrared images using superstructure moment invariants

    NASA Astrophysics Data System (ADS)

    Li, Heng; Wang, Xinyu

    2007-11-01

    Automatic object recognition is an active area of interest for military and commercial applications. In this paper, a system addressing autonomous recognition of ship types in infrared images is proposed. Firstly, an approach of segmentation based on detection of salient features of the target with subsequent shadow removing is proposed, as is the base of the subsequent object recognition. Considering the differences between the shapes of various ships mainly lie in their superstructures, we then use superstructure moment functions invariant to translation, rotation and scale differences in input patterns and develop a robust algorithm of obtaining ship superstructure. Subsequently a back-propagation neural network is used as a classifier in the recognition stage and projection images of simulated three-dimensional ship models are used as the training sets. Our recognition model was implemented and experimentally validated using both simulated three-dimensional ship model images and real images derived from video of an AN/AAS-44V Forward Looking Infrared(FLIR) sensor.

  18. Use of pattern recognition and neural networks for non-metric sex diagnosis from lateral shape of calvarium: an innovative model for computer-aided diagnosis in forensic and physical anthropology.

    PubMed

    Cavalli, Fabio; Lusnig, Luca; Trentin, Edmondo

    2017-05-01

    Sex determination on skeletal remains is one of the most important diagnosis in forensic cases and in demographic studies on ancient populations. Our purpose is to realize an automatic operator-independent method to determine the sex from the bone shape and to test an intelligent, automatic pattern recognition system in an anthropological domain. Our multiple-classifier system is based exclusively on the morphological variants of a curve that represents the sagittal profile of the calvarium, modeled via artificial neural networks, and yields an accuracy higher than 80 %. The application of this system to other bone profiles is expected to further improve the sensibility of the methodology.

  19. Arrhythmia Classification Based on Multi-Domain Feature Extraction for an ECG Recognition System.

    PubMed

    Li, Hongqiang; Yuan, Danyang; Wang, Youxi; Cui, Dianyin; Cao, Lu

    2016-10-20

    Automatic recognition of arrhythmias is particularly important in the diagnosis of heart diseases. This study presents an electrocardiogram (ECG) recognition system based on multi-domain feature extraction to classify ECG beats. An improved wavelet threshold method for ECG signal pre-processing is applied to remove noise interference. A novel multi-domain feature extraction method is proposed; this method employs kernel-independent component analysis in nonlinear feature extraction and uses discrete wavelet transform to extract frequency domain features. The proposed system utilises a support vector machine classifier optimized with a genetic algorithm to recognize different types of heartbeats. An ECG acquisition experimental platform, in which ECG beats are collected as ECG data for classification, is constructed to demonstrate the effectiveness of the system in ECG beat classification. The presented system, when applied to the MIT-BIH arrhythmia database, achieves a high classification accuracy of 98.8%. Experimental results based on the ECG acquisition experimental platform show that the system obtains a satisfactory classification accuracy of 97.3% and is able to classify ECG beats efficiently for the automatic identification of cardiac arrhythmias.

  20. Arrhythmia Classification Based on Multi-Domain Feature Extraction for an ECG Recognition System

    PubMed Central

    Li, Hongqiang; Yuan, Danyang; Wang, Youxi; Cui, Dianyin; Cao, Lu

    2016-01-01

    Automatic recognition of arrhythmias is particularly important in the diagnosis of heart diseases. This study presents an electrocardiogram (ECG) recognition system based on multi-domain feature extraction to classify ECG beats. An improved wavelet threshold method for ECG signal pre-processing is applied to remove noise interference. A novel multi-domain feature extraction method is proposed; this method employs kernel-independent component analysis in nonlinear feature extraction and uses discrete wavelet transform to extract frequency domain features. The proposed system utilises a support vector machine classifier optimized with a genetic algorithm to recognize different types of heartbeats. An ECG acquisition experimental platform, in which ECG beats are collected as ECG data for classification, is constructed to demonstrate the effectiveness of the system in ECG beat classification. The presented system, when applied to the MIT-BIH arrhythmia database, achieves a high classification accuracy of 98.8%. Experimental results based on the ECG acquisition experimental platform show that the system obtains a satisfactory classification accuracy of 97.3% and is able to classify ECG beats efficiently for the automatic identification of cardiac arrhythmias. PMID:27775596

  1. Localized contourlet features in vehicle make and model recognition

    NASA Astrophysics Data System (ADS)

    Zafar, I.; Edirisinghe, E. A.; Acar, B. S.

    2009-02-01

    Automatic vehicle Make and Model Recognition (MMR) systems provide useful performance enhancements to vehicle recognitions systems that are solely based on Automatic Number Plate Recognition (ANPR) systems. Several vehicle MMR systems have been proposed in literature. In parallel to this, the usefulness of multi-resolution based feature analysis techniques leading to efficient object classification algorithms have received close attention from the research community. To this effect, Contourlet transforms that can provide an efficient directional multi-resolution image representation has recently been introduced. Already an attempt has been made in literature to use Curvelet/Contourlet transforms in vehicle MMR. In this paper we propose a novel localized feature detection method in Contourlet transform domain that is capable of increasing the classification rates up to 4%, as compared to the previously proposed Contourlet based vehicle MMR approach in which the features are non-localized and thus results in sub-optimal classification. Further we show that the proposed algorithm can achieve the increased classification accuracy of 96% at significantly lower computational complexity due to the use of Two Dimensional Linear Discriminant Analysis (2DLDA) for dimensionality reduction by preserving the features with high between-class variance and low inter-class variance.

  2. Two-dimensional statistical linear discriminant analysis for real-time robust vehicle-type recognition

    NASA Astrophysics Data System (ADS)

    Zafar, I.; Edirisinghe, E. A.; Acar, S.; Bez, H. E.

    2007-02-01

    Automatic vehicle Make and Model Recognition (MMR) systems provide useful performance enhancements to vehicle recognitions systems that are solely based on Automatic License Plate Recognition (ALPR) systems. Several car MMR systems have been proposed in literature. However these approaches are based on feature detection algorithms that can perform sub-optimally under adverse lighting and/or occlusion conditions. In this paper we propose a real time, appearance based, car MMR approach using Two Dimensional Linear Discriminant Analysis that is capable of addressing this limitation. We provide experimental results to analyse the proposed algorithm's robustness under varying illumination and occlusions conditions. We have shown that the best performance with the proposed 2D-LDA based car MMR approach is obtained when the eigenvectors of lower significance are ignored. For the given database of 200 car images of 25 different make-model classifications, a best accuracy of 91% was obtained with the 2D-LDA approach. We use a direct Principle Component Analysis (PCA) based approach as a benchmark to compare and contrast the performance of the proposed 2D-LDA approach to car MMR. We conclude that in general the 2D-LDA based algorithm supersedes the performance of the PCA based approach.

  3. Towards a smart glove: arousal recognition based on textile Electrodermal Response.

    PubMed

    Valenza, Gaetano; Lanata, Antonio; Scilingo, Enzo Pasquale; De Rossi, Danilo

    2010-01-01

    This paper investigates the possibility of using Electrodermal Response, acquired by a sensing fabric glove with embedded textile electrodes, as reliable means for emotion recognition. Here, all the essential steps for an automatic recognition system are described, from the recording of physiological data set to a feature-based multiclass classification. Data were collected from 35 healthy volunteers during arousal elicitation by means of International Affective Picture System (IAPS) pictures. Experimental results show high discrimination after twenty steps of cross validation.

  4. User acceptance of intelligent avionics: A study of automatic-aided target recognition

    NASA Technical Reports Server (NTRS)

    Becker, Curtis A.; Hayes, Brian C.; Gorman, Patrick C.

    1991-01-01

    User acceptance of new support systems typically was evaluated after the systems were specified, designed, and built. The current study attempts to assess user acceptance of an Automatic-Aided Target Recognition (ATR) system using an emulation of such a proposed system. The detection accuracy and false alarm level of the ATR system were varied systematically, and subjects rated the tactical value of systems exhibiting different performance levels. Both detection accuracy and false alarm level affected the subjects' ratings. The data from two experiments suggest a cut-off point in ATR performance below which the subjects saw little tactical value in the system. An ATR system seems to have obvious tactical value only if it functions at a correct detection rate of 0.7 or better with a false alarm level of 0.167 false alarms per square degree or fewer.

  5. Automatic Speech Recognition from Neural Signals: A Focused Review.

    PubMed

    Herff, Christian; Schultz, Tanja

    2016-01-01

    Speech interfaces have become widely accepted and are nowadays integrated in various real-life applications and devices. They have become a part of our daily life. However, speech interfaces presume the ability to produce intelligible speech, which might be impossible due to either loud environments, bothering bystanders or incapabilities to produce speech (i.e., patients suffering from locked-in syndrome). For these reasons it would be highly desirable to not speak but to simply envision oneself to say words or sentences. Interfaces based on imagined speech would enable fast and natural communication without the need for audible speech and would give a voice to otherwise mute people. This focused review analyzes the potential of different brain imaging techniques to recognize speech from neural signals by applying Automatic Speech Recognition technology. We argue that modalities based on metabolic processes, such as functional Near Infrared Spectroscopy and functional Magnetic Resonance Imaging, are less suited for Automatic Speech Recognition from neural signals due to low temporal resolution but are very useful for the investigation of the underlying neural mechanisms involved in speech processes. In contrast, electrophysiologic activity is fast enough to capture speech processes and is therefor better suited for ASR. Our experimental results indicate the potential of these signals for speech recognition from neural data with a focus on invasively measured brain activity (electrocorticography). As a first example of Automatic Speech Recognition techniques used from neural signals, we discuss the Brain-to-text system.

  6. Human abdomen recognition using camera and force sensor in medical robot system for automatic ultrasound scan.

    PubMed

    Bin Mustafa, Ammar Safwan; Ishii, Takashi; Matsunaga, Yoshiki; Nakadate, Ryu; Ishii, Hiroyuki; Ogawa, Kouji; Saito, Akiko; Sugawara, Motoaki; Niki, Kiyomi; Takanishi, Atsuo

    2013-01-01

    Physicians use ultrasound scans to obtain real-time images of internal organs, because such scans are safe and inexpensive. However, people in remote areas face difficulties to be scanned due to aging society and physician's shortage. Hence, it is important to develop an autonomous robotic system to perform remote ultrasound scans. Previously, we developed a robotic system for automatic ultrasound scan focusing on human's liver. In order to make it a completely autonomous system, we present in this paper a way to autonomously localize the epigastric region as the starting position for the automatic ultrasound scan. An image processing algorithm marks the umbilicus and mammary papillae on a digital photograph of the patient's abdomen. Then, we made estimation for the location of the epigastric region using the distances between these landmarks. A supporting algorithm distinguishes rib position from epigastrium using the relationship between force and displacement. We implemented these algorithms with the automatic scanning system into an apparatus: a Mitsubishi Electric's MELFA RV-1 six axis manipulator. Tests on 14 healthy male subjects showed the apparatus located the epigastric region with a success rate of 94%. The results suggest that image recognition was effective in localizing a human body part.

  7. Measurement Marker Recognition In A Time Sequence Of Infrared Images For Biomedical Applications

    NASA Astrophysics Data System (ADS)

    Fiorini, A. R.; Fumero, R.; Marchesi, R.

    1986-03-01

    In thermographic measurements, quantitative surface temperature evaluation is often uncertain. The main reason is in the lack of available reference points in transient conditions. Reflective markers were used for automatic marker recognition and pixel coordinate computations. An algorithm selects marker icons to match marker references where particular luminance conditions are satisfied. Automatic marker recognition allows luminance compensation and temperature calibration of recorded infrared images. A biomedical application is presented: the dynamic behaviour of the surface temperature distributions is investigated in order to study the performance of two different pumping systems for extracorporeal circulation. Sequences of images are compared and results are discussed. Finally, the algorithm allows to monitor the experimental environment and to alert for the presence of unusual experimental conditions.

  8. Field programmable gate arrays-based number plate binarization and adjustment for automatic number plate recognition systems

    NASA Astrophysics Data System (ADS)

    Zhai, Xiaojun; Bensaali, Faycal; Sotudeh, Reza

    2013-01-01

    Number plate (NP) binarization and adjustment are important preprocessing stages in automatic number plate recognition (ANPR) systems and are used to link the number plate localization (NPL) and character segmentation stages. Successfully linking these two stages will improve the performance of the entire ANPR system. We present two optimized low-complexity NP binarization and adjustment algorithms. Efficient area/speed architectures based on the proposed algorithms are also presented and have been successfully implemented and tested using the Mentor Graphics RC240 FPGA development board, which together require only 9% of the available on-chip resources of a Virtex-4 FPGA, run with a maximum frequency of 95.8 MHz and are capable of processing one image in 0.07 to 0.17 ms.

  9. Some effects of stress on users of a voice recognition system: A preliminary inquiry

    NASA Astrophysics Data System (ADS)

    French, B. A.

    1983-03-01

    Recent work with Automatic Speech Recognition has focused on applications and productivity considerations in the man-machine interface. This thesis is an attempt to see if placing users of such equipment under time-induced stress has an effect on their percent correct recognition rates. Subjects were given a message-handling task of fixed length and allowed progressively shorter times to attempt to complete it. Questionnaire responses indicate stress levels increased with decreased time-allowance; recognition rates decreased as time was reduced.

  10. Automatically Log Off Upon Disappearance of Facial Image

    DTIC Science & Technology

    2005-03-01

    log off a PC when the user’s face disappears for an adjustable time interval. Among the fundamental technologies of biometrics, facial recognition is... facial recognition products. In this report, a brief overview of face detection technologies is provided. The particular neural network-based face...ensure that the user logging onto the system is the same person. Among the fundamental technologies of biometrics, facial recognition is the only

  11. Signal recognition and parameter estimation of BPSK-LFM combined modulation

    NASA Astrophysics Data System (ADS)

    Long, Chao; Zhang, Lin; Liu, Yu

    2015-07-01

    Intra-pulse analysis plays an important role in electronic warfare. Intra-pulse feature abstraction focuses on primary parameters such as instantaneous frequency, modulation, and symbol rate. In this paper, automatic modulation recognition and feature extraction for combined BPSK-LFM modulation signals based on decision theoretic approach is studied. The simulation results show good recognition effect and high estimation precision, and the system is easy to be realized.

  12. Automatic recognition of fundamental tissues on histology images of the human cardiovascular system.

    PubMed

    Mazo, Claudia; Trujillo, Maria; Alegre, Enrique; Salazar, Liliana

    2016-10-01

    Cardiovascular disease is the leading cause of death worldwide. Therefore, techniques for improving diagnosis and treatment in this field have become key areas for research. In particular, approaches for tissue image processing may support education system and medical practice. In this paper, an approach to automatic recognition and classification of fundamental tissues, using morphological information is presented. Taking a 40× or 10× histological image as input, three clusters are created with the k-means algorithm using a structural tensor and the red and the green channels. Loose connective tissue, light regions and cell nuclei are recognised on 40× images. Then, the cell nuclei's features - shape and spatial projection - and light regions are used to recognise and classify epithelial cells and tissue into flat, cubic and cylindrical. In a similar way, light regions, loose connective and muscle tissues are recognised on 10× images. Finally, the tissue's function and composition are used to refine muscle tissue recognition. Experimental validation is then carried out by histologist following expert criteria, along with manually annotated images that are used as a ground-truth. The results revealed that the proposed approach classified the fundamental tissues in a similar way to the conventional method employed by histologists. The proposed automatic recognition approach provides for epithelial tissues a sensitivity of 0.79 for cubic, 0.85 for cylindrical and 0.91 for flat. Furthermore, the experts gave our method an average score of 4.85 out of 5 in the recognition of loose connective tissue and 4.82 out of 5 for muscle tissue recognition. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. Unification of automatic target tracking and automatic target recognition

    NASA Astrophysics Data System (ADS)

    Schachter, Bruce J.

    2014-06-01

    The subject being addressed is how an automatic target tracker (ATT) and an automatic target recognizer (ATR) can be fused together so tightly and so well that their distinctiveness becomes lost in the merger. This has historically not been the case outside of biology and a few academic papers. The biological model of ATT∪ATR arises from dynamic patterns of activity distributed across many neural circuits and structures (including retina). The information that the brain receives from the eyes is "old news" at the time that it receives it. The eyes and brain forecast a tracked object's future position, rather than relying on received retinal position. Anticipation of the next moment - building up a consistent perception - is accomplished under difficult conditions: motion (eyes, head, body, scene background, target) and processing limitations (neural noise, delays, eye jitter, distractions). Not only does the human vision system surmount these problems, but it has innate mechanisms to exploit motion in support of target detection and classification. Biological vision doesn't normally operate on snapshots. Feature extraction, detection and recognition are spatiotemporal. When vision is viewed as a spatiotemporal process, target detection, recognition, tracking, event detection and activity recognition, do not seem as distinct as they are in current ATT and ATR designs. They appear as similar mechanism taking place at varying time scales. A framework is provided for unifying ATT and ATR.

  14. Human Activity Recognition from Smart-Phone Sensor Data using a Multi-Class Ensemble Learning in Home Monitoring.

    PubMed

    Ghose, Soumya; Mitra, Jhimli; Karunanithi, Mohan; Dowling, Jason

    2015-01-01

    Home monitoring of chronically ill or elderly patient can reduce frequent hospitalisations and hence provide improved quality of care at a reduced cost to the community, therefore reducing the burden on the healthcare system. Activity recognition of such patients is of high importance in such a design. In this work, a system for automatic human physical activity recognition from smart-phone inertial sensors data is proposed. An ensemble of decision trees framework is adopted to train and predict the multi-class human activity system. A comparison of our proposed method with a multi-class traditional support vector machine shows significant improvement in activity recognition accuracies.

  15. Robot Command Interface Using an Audio-Visual Speech Recognition System

    NASA Astrophysics Data System (ADS)

    Ceballos, Alexánder; Gómez, Juan; Prieto, Flavio; Redarce, Tanneguy

    In recent years audio-visual speech recognition has emerged as an active field of research thanks to advances in pattern recognition, signal processing and machine vision. Its ultimate goal is to allow human-computer communication using voice, taking into account the visual information contained in the audio-visual speech signal. This document presents a command's automatic recognition system using audio-visual information. The system is expected to control the laparoscopic robot da Vinci. The audio signal is treated using the Mel Frequency Cepstral Coefficients parametrization method. Besides, features based on the points that define the mouth's outer contour according to the MPEG-4 standard are used in order to extract the visual speech information.

  16. Higher-order neural network software for distortion invariant object recognition

    NASA Technical Reports Server (NTRS)

    Reid, Max B.; Spirkovska, Lilly

    1991-01-01

    The state-of-the-art in pattern recognition for such applications as automatic target recognition and industrial robotic vision relies on digital image processing. We present a higher-order neural network model and software which performs the complete feature extraction-pattern classification paradigm required for automatic pattern recognition. Using a third-order neural network, we demonstrate complete, 100 percent accurate invariance to distortions of scale, position, and in-plate rotation. In a higher-order neural network, feature extraction is built into the network, and does not have to be learned. Only the relatively simple classification step must be learned. This is key to achieving very rapid training. The training set is much smaller than with standard neural network software because the higher-order network only has to be shown one view of each object to be learned, not every possible view. The software and graphical user interface run on any Sun workstation. Results of the use of the neural software in autonomous robotic vision systems are presented. Such a system could have extensive application in robotic manufacturing.

  17. Automatic anatomy recognition on CT images with pathology

    NASA Astrophysics Data System (ADS)

    Huang, Lidong; Udupa, Jayaram K.; Tong, Yubing; Odhner, Dewey; Torigian, Drew A.

    2016-03-01

    Body-wide anatomy recognition on CT images with pathology becomes crucial for quantifying body-wide disease burden. This, however, is a challenging problem because various diseases result in various abnormalities of objects such as shape and intensity patterns. We previously developed an automatic anatomy recognition (AAR) system [1] whose applicability was demonstrated on near normal diagnostic CT images in different body regions on 35 organs. The aim of this paper is to investigate strategies for adapting the previous AAR system to diagnostic CT images of patients with various pathologies as a first step toward automated body-wide disease quantification. The AAR approach consists of three main steps - model building, object recognition, and object delineation. In this paper, within the broader AAR framework, we describe a new strategy for object recognition to handle abnormal images. In the model building stage an optimal threshold interval is learned from near-normal training images for each object. This threshold is optimally tuned to the pathological manifestation of the object in the test image. Recognition is performed following a hierarchical representation of the objects. Experimental results for the abdominal body region based on 50 near-normal images used for model building and 20 abnormal images used for object recognition show that object localization accuracy within 2 voxels for liver and spleen and 3 voxels for kidney can be achieved with the new strategy.

  18. License Plate Recognition System for Indian Vehicles

    NASA Astrophysics Data System (ADS)

    Sanap, P. R.; Narote, S. P.

    2010-11-01

    We consider the task of recognition of Indian vehicle number plates (also called license plates or registration plates in other countries). A system for Indian number plate recognition must cope with wide variations in the appearance of the plates. Each state uses its own range of designs with font variations between the designs. Also, vehicle owners may place the plates inside glass covered frames or use plates made of nonstandard materials. These issues compound the complexity of automatic number plate recognition, making existing approaches inadequate. We have developed a system that incorporates a novel combination of image processing and artificial neural network technologies to successfully locate and read Indian vehicle number plates in digital images. Commercial application of the system is envisaged.

  19. Call recognition and individual identification of fish vocalizations based on automatic speech recognition: An example with the Lusitanian toadfish.

    PubMed

    Vieira, Manuel; Fonseca, Paulo J; Amorim, M Clara P; Teixeira, Carlos J C

    2015-12-01

    The study of acoustic communication in animals often requires not only the recognition of species specific acoustic signals but also the identification of individual subjects, all in a complex acoustic background. Moreover, when very long recordings are to be analyzed, automatic recognition and identification processes are invaluable tools to extract the relevant biological information. A pattern recognition methodology based on hidden Markov models is presented inspired by successful results obtained in the most widely known and complex acoustical communication signal: human speech. This methodology was applied here for the first time to the detection and recognition of fish acoustic signals, specifically in a stream of round-the-clock recordings of Lusitanian toadfish (Halobatrachus didactylus) in their natural estuarine habitat. The results show that this methodology is able not only to detect the mating sounds (boatwhistles) but also to identify individual male toadfish, reaching an identification rate of ca. 95%. Moreover this method also proved to be a powerful tool to assess signal durations in large data sets. However, the system failed in recognizing other sound types.

  20. On compensation of mismatched recording conditions in the Bayesian approach for forensic automatic speaker recognition.

    PubMed

    Botti, F; Alexander, A; Drygajlo, A

    2004-12-02

    This paper deals with a procedure to compensate for mismatched recording conditions in forensic speaker recognition, using a statistical score normalization. Bayesian interpretation of the evidence in forensic automatic speaker recognition depends on three sets of recordings in order to perform forensic casework: reference (R) and control (C) recordings of the suspect, and a potential population database (P), as well as a questioned recording (QR) . The requirement of similar recording conditions between suspect control database (C) and the questioned recording (QR) is often not satisfied in real forensic cases. The aim of this paper is to investigate a procedure of normalization of scores, which is based on an adaptation of the Test-normalization (T-norm) [2] technique used in the speaker verification domain, to compensate for the mismatch. Polyphone IPSC-02 database and ASPIC (an automatic speaker recognition system developed by EPFL and IPS-UNIL in Lausanne, Switzerland) were used in order to test the normalization procedure. Experimental results for three different recording condition scenarios are presented using Tippett plots and the effect of the compensation on the evaluation of the strength of the evidence is discussed.

  1. Image quality assessment for video stream recognition systems

    NASA Astrophysics Data System (ADS)

    Chernov, Timofey S.; Razumnuy, Nikita P.; Kozharinov, Alexander S.; Nikolaev, Dmitry P.; Arlazarov, Vladimir V.

    2018-04-01

    Recognition and machine vision systems have long been widely used in many disciplines to automate various processes of life and industry. Input images of optical recognition systems can be subjected to a large number of different distortions, especially in uncontrolled or natural shooting conditions, which leads to unpredictable results of recognition systems, making it impossible to assess their reliability. For this reason, it is necessary to perform quality control of the input data of recognition systems, which is facilitated by modern progress in the field of image quality evaluation. In this paper, we investigate the approach to designing optical recognition systems with built-in input image quality estimation modules and feedback, for which the necessary definitions are introduced and a model for describing such systems is constructed. The efficiency of this approach is illustrated by the example of solving the problem of selecting the best frames for recognition in a video stream for a system with limited resources. Experimental results are presented for the system for identity documents recognition, showing a significant increase in the accuracy and speed of the system under simulated conditions of automatic camera focusing, leading to blurring of frames.

  2. Can soft biometric traits assist user recognition?

    NASA Astrophysics Data System (ADS)

    Jain, Anil K.; Dass, Sarat C.; Nandakumar, Karthik

    2004-08-01

    Biometrics is rapidly gaining acceptance as the technology that can meet the ever increasing need for security in critical applications. Biometric systems automatically recognize individuals based on their physiological and behavioral characteristics. Hence, the fundamental requirement of any biometric recognition system is a human trait having several desirable properties like universality, distinctiveness, permanence, collectability, acceptability, and resistance to circumvention. However, a human characteristic that possesses all these properties has not yet been identified. As a result, none of the existing biometric systems provide perfect recognition and there is a scope for improving the performance of these systems. Although characteristics like gender, ethnicity, age, height, weight and eye color are not unique and reliable, they provide some information about the user. We refer to these characteristics as "soft" biometric traits and argue that these traits can complement the identity information provided by the primary biometric identifiers like fingerprint and face. This paper presents the motivation for utilizing soft biometric information and analyzes how the soft biometric traits can be automatically extracted and incorporated in the decision making process of the primary biometric system. Preliminary experiments were conducted on a fingerprint database of 160 users by synthetically generating soft biometric traits like gender, ethnicity, and height based on known statistics. The results show that the use of additional soft biometric user information significantly improves (approximately 6%) the recognition performance of the fingerprint biometric system.

  3. Activity Recognition for Personal Time Management

    NASA Astrophysics Data System (ADS)

    Prekopcsák, Zoltán; Soha, Sugárka; Henk, Tamás; Gáspár-Papanek, Csaba

    We describe an accelerometer based activity recognition system for mobile phones with a special focus on personal time management. We compare several data mining algorithms for the automatic recognition task in the case of single user and multiuser scenario, and improve accuracy with heuristics and advanced data mining methods. The results show that daily activities can be recognized with high accuracy and the integration with the RescueTime software can give good insights for personal time management.

  4. Performance of a Working Face Recognition Machine using Cortical Thought Theory

    DTIC Science & Technology

    1984-12-04

    been considered (2). Recommendations from Bledsoe’s study included research on facial - recognition systems that are "completely automatic (remove the...C. L. Location of some facial features . computer, Palo Alto: Panoramic Research, Aug 1966. 2. Bledsoe, W. W. Man-machine facial recognition : Is...34 image?" It would seem - that the location and size of the features left in this contrast-expanded image contain the essential information of facial

  5. Bridging automatic speech recognition and psycholinguistics: Extending Shortlist to an end-to-end model of human speech recognition (L)

    NASA Astrophysics Data System (ADS)

    Scharenborg, Odette; ten Bosch, Louis; Boves, Lou; Norris, Dennis

    2003-12-01

    This letter evaluates potential benefits of combining human speech recognition (HSR) and automatic speech recognition by building a joint model of an automatic phone recognizer (APR) and a computational model of HSR, viz., Shortlist [Norris, Cognition 52, 189-234 (1994)]. Experiments based on ``real-life'' speech highlight critical limitations posed by some of the simplifying assumptions made in models of human speech recognition. These limitations could be overcome by avoiding hard phone decisions at the output side of the APR, and by using a match between the input and the internal lexicon that flexibly copes with deviations from canonical phonemic representations.

  6. Practical vision based degraded text recognition system

    NASA Astrophysics Data System (ADS)

    Mohammad, Khader; Agaian, Sos; Saleh, Hani

    2011-02-01

    Rapid growth and progress in the medical, industrial, security and technology fields means more and more consideration for the use of camera based optical character recognition (OCR) Applying OCR to scanned documents is quite mature, and there are many commercial and research products available on this topic. These products achieve acceptable recognition accuracy and reasonable processing times especially with trained software, and constrained text characteristics. Even though the application space for OCR is huge, it is quite challenging to design a single system that is capable of performing automatic OCR for text embedded in an image irrespective of the application. Challenges for OCR systems include; images are taken under natural real world conditions, Surface curvature, text orientation, font, size, lighting conditions, and noise. These and many other conditions make it extremely difficult to achieve reasonable character recognition. Performance for conventional OCR systems drops dramatically as the degradation level of the text image quality increases. In this paper, a new recognition method is proposed to recognize solid or dotted line degraded characters. The degraded text string is localized and segmented using a new algorithm. The new method was implemented and tested using a development framework system that is capable of performing OCR on camera captured images. The framework allows parameter tuning of the image-processing algorithm based on a training set of camera-captured text images. Novel methods were used for enhancement, text localization and the segmentation algorithm which enables building a custom system that is capable of performing automatic OCR which can be used for different applications. The developed framework system includes: new image enhancement, filtering, and segmentation techniques which enabled higher recognition accuracies, faster processing time, and lower energy consumption, compared with the best state of the art published techniques. The system successfully produced impressive OCR accuracies (90% -to- 93%) using customized systems generated by our development framework in two industrial OCR applications: water bottle label text recognition and concrete slab plate text recognition. The system was also trained for the Arabic language alphabet, and demonstrated extremely high recognition accuracy (99%) for Arabic license name plate text recognition with processing times of 10 seconds. The accuracy and run times of the system were compared to conventional and many states of art methods, the proposed system shows excellent results.

  7. Application of image recognition-based automatic hyphae detection in fungal keratitis.

    PubMed

    Wu, Xuelian; Tao, Yuan; Qiu, Qingchen; Wu, Xinyi

    2018-03-01

    The purpose of this study is to evaluate the accuracy of two methods in diagnosis of fungal keratitis, whereby one method is automatic hyphae detection based on images recognition and the other method is corneal smear. We evaluate the sensitivity and specificity of the method in diagnosis of fungal keratitis, which is automatic hyphae detection based on image recognition. We analyze the consistency of clinical symptoms and the density of hyphae, and perform quantification using the method of automatic hyphae detection based on image recognition. In our study, 56 cases with fungal keratitis (just single eye) and 23 cases with bacterial keratitis were included. All cases underwent the routine inspection of slit lamp biomicroscopy, corneal smear examination, microorganism culture and the assessment of in vivo confocal microscopy images before starting medical treatment. Then, we recognize the hyphae images of in vivo confocal microscopy by using automatic hyphae detection based on image recognition to evaluate its sensitivity and specificity and compare with the method of corneal smear. The next step is to use the index of density to assess the severity of infection, and then find the correlation with the patients' clinical symptoms and evaluate consistency between them. The accuracy of this technology was superior to corneal smear examination (p < 0.05). The sensitivity of the technology of automatic hyphae detection of image recognition was 89.29%, and the specificity was 95.65%. The area under the ROC curve was 0.946. The correlation coefficient between the grading of the severity in the fungal keratitis by the automatic hyphae detection based on image recognition and the clinical grading is 0.87. The technology of automatic hyphae detection based on image recognition was with high sensitivity and specificity, able to identify fungal keratitis, which is better than the method of corneal smear examination. This technology has the advantages when compared with the conventional artificial identification of confocal microscope corneal images, of being accurate, stable and does not rely on human expertise. It was the most useful to the medical experts who are not familiar with fungal keratitis. The technology of automatic hyphae detection based on image recognition can quantify the hyphae density and grade this property. Being noninvasive, it can provide an evaluation criterion to fungal keratitis in a timely, accurate, objective and quantitative manner.

  8. Automatic identification of species with neural networks.

    PubMed

    Hernández-Serna, Andrés; Jiménez-Segura, Luz Fernanda

    2014-01-01

    A new automatic identification system using photographic images has been designed to recognize fish, plant, and butterfly species from Europe and South America. The automatic classification system integrates multiple image processing tools to extract the geometry, morphology, and texture of the images. Artificial neural networks (ANNs) were used as the pattern recognition method. We tested a data set that included 740 species and 11,198 individuals. Our results show that the system performed with high accuracy, reaching 91.65% of true positive fish identifications, 92.87% of plants and 93.25% of butterflies. Our results highlight how the neural networks are complementary to species identification.

  9. Automated target recognition and tracking using an optical pattern recognition neural network

    NASA Technical Reports Server (NTRS)

    Chao, Tien-Hsin

    1991-01-01

    The on-going development of an automatic target recognition and tracking system at the Jet Propulsion Laboratory is presented. This system is an optical pattern recognition neural network (OPRNN) that is an integration of an innovative optical parallel processor and a feature extraction based neural net training algorithm. The parallel optical processor provides high speed and vast parallelism as well as full shift invariance. The neural network algorithm enables simultaneous discrimination of multiple noisy targets in spite of their scales, rotations, perspectives, and various deformations. This fully developed OPRNN system can be effectively utilized for the automated spacecraft recognition and tracking that will lead to success in the Automated Rendezvous and Capture (AR&C) of the unmanned Cargo Transfer Vehicle (CTV). One of the most powerful optical parallel processors for automatic target recognition is the multichannel correlator. With the inherent advantages of parallel processing capability and shift invariance, multiple objects can be simultaneously recognized and tracked using this multichannel correlator. This target tracking capability can be greatly enhanced by utilizing a powerful feature extraction based neural network training algorithm such as the neocognitron. The OPRNN, currently under investigation at JPL, is constructed with an optical multichannel correlator where holographic filters have been prepared using the neocognitron training algorithm. The computation speed of the neocognitron-type OPRNN is up to 10(exp 14) analog connections/sec that enabling the OPRNN to outperform its state-of-the-art electronics counterpart by at least two orders of magnitude.

  10. A multilingual gold-standard corpus for biomedical concept recognition: the Mantra GSC

    PubMed Central

    Clematide, Simon; Akhondi, Saber A; van Mulligen, Erik M; Rebholz-Schuhmann, Dietrich

    2015-01-01

    Objective To create a multilingual gold-standard corpus for biomedical concept recognition. Materials and methods We selected text units from different parallel corpora (Medline abstract titles, drug labels, biomedical patent claims) in English, French, German, Spanish, and Dutch. Three annotators per language independently annotated the biomedical concepts, based on a subset of the Unified Medical Language System and covering a wide range of semantic groups. To reduce the annotation workload, automatically generated preannotations were provided. Individual annotations were automatically harmonized and then adjudicated, and cross-language consistency checks were carried out to arrive at the final annotations. Results The number of final annotations was 5530. Inter-annotator agreement scores indicate good agreement (median F-score 0.79), and are similar to those between individual annotators and the gold standard. The automatically generated harmonized annotation set for each language performed equally well as the best annotator for that language. Discussion The use of automatic preannotations, harmonized annotations, and parallel corpora helped to keep the manual annotation efforts manageable. The inter-annotator agreement scores provide a reference standard for gauging the performance of automatic annotation techniques. Conclusion To our knowledge, this is the first gold-standard corpus for biomedical concept recognition in languages other than English. Other distinguishing features are the wide variety of semantic groups that are being covered, and the diversity of text genres that were annotated. PMID:25948699

  11. Automatic Facial Expression Recognition and Operator Functional State

    NASA Technical Reports Server (NTRS)

    Blanson, Nina

    2012-01-01

    The prevalence of human error in safety-critical occupations remains a major challenge to mission success despite increasing automation in control processes. Although various methods have been proposed to prevent incidences of human error, none of these have been developed to employ the detection and regulation of Operator Functional State (OFS), or the optimal condition of the operator while performing a task, in work environments due to drawbacks such as obtrusiveness and impracticality. A video-based system with the ability to infer an individual's emotional state from facial feature patterning mitigates some of the problems associated with other methods of detecting OFS, like obtrusiveness and impracticality in integration with the mission environment. This paper explores the utility of facial expression recognition as a technology for inferring OFS by first expounding on the intricacies of OFS and the scientific background behind emotion and its relationship with an individual's state. Then, descriptions of the feedback loop and the emotion protocols proposed for the facial recognition program are explained. A basic version of the facial expression recognition program uses Haar classifiers and OpenCV libraries to automatically locate key facial landmarks during a live video stream. Various methods of creating facial expression recognition software are reviewed to guide future extensions of the program. The paper concludes with an examination of the steps necessary in the research of emotion and recommendations for the creation of an automatic facial expression recognition program for use in real-time, safety-critical missions

  12. Automatic Facial Expression Recognition and Operator Functional State

    NASA Technical Reports Server (NTRS)

    Blanson, Nina

    2011-01-01

    The prevalence of human error in safety-critical occupations remains a major challenge to mission success despite increasing automation in control processes. Although various methods have been proposed to prevent incidences of human error, none of these have been developed to employ the detection and regulation of Operator Functional State (OFS), or the optimal condition of the operator while performing a task, in work environments due to drawbacks such as obtrusiveness and impracticality. A video-based system with the ability to infer an individual's emotional state from facial feature patterning mitigates some of the problems associated with other methods of detecting OFS, like obtrusiveness and impracticality in integration with the mission environment. This paper explores the utility of facial expression recognition as a technology for inferring OFS by first expounding on the intricacies of OFS and the scientific background behind emotion and its relationship with an individual's state. Then, descriptions of the feedback loop and the emotion protocols proposed for the facial recognition program are explained. A basic version of the facial expression recognition program uses Haar classifiers and OpenCV libraries to automatically locate key facial landmarks during a live video stream. Various methods of creating facial expression recognition software are reviewed to guide future extensions of the program. The paper concludes with an examination of the steps necessary in the research of emotion and recommendations for the creation of an automatic facial expression recognition program for use in real-time, safety-critical missions.

  13. Intelligent Image Analysis for Image-Guided Laser Hair Removal and Skin Therapy

    NASA Technical Reports Server (NTRS)

    Walker, Brian; Lu, Thomas; Chao, Tien-Hsin

    2012-01-01

    We present the development of advanced automatic target recognition (ATR) algorithms for the hair follicles identification in digital skin images to accurately direct the laser beam to remove the hair. The ATR system first performs a wavelet filtering to enhance the contrast of the hair features in the image. The system then extracts the unique features of the targets and sends the features to an Adaboost based classifier for training and recognition operations. The ATR system automatically classifies the hair, moles, or other skin lesion and provides the accurate coordinates of the intended hair follicle locations. The coordinates can be used to guide a scanning laser to focus energy only on the hair follicles. The intended benefit would be to protect the skin from unwanted laser exposure and to provide more effective skin therapy.

  14. Recognition of plant parts with problem-specific algorithms

    NASA Astrophysics Data System (ADS)

    Schwanke, Joerg; Brendel, Thorsten; Jensch, Peter F.; Megnet, Roland

    1994-06-01

    Automatic micropropagation is necessary to produce cost-effective high amounts of biomass. Juvenile plants are dissected in clean- room environment on particular points on the stem or the leaves. A vision-system detects possible cutting points and controls a specialized robot. This contribution is directed to the pattern- recognition algorithms to detect structural parts of the plant.

  15. User Experience of a Mobile Speaking Application with Automatic Speech Recognition for EFL Learning

    ERIC Educational Resources Information Center

    Ahn, Tae youn; Lee, Sangmin-Michelle

    2016-01-01

    With the spread of mobile devices, mobile phones have enormous potential regarding their pedagogical use in language education. The goal of this study is to analyse user experience of a mobile-based learning system that is enhanced by speech recognition technology for the improvement of EFL (English as a foreign language) learners' speaking…

  16. EduSpeak[R]: A Speech Recognition and Pronunciation Scoring Toolkit for Computer-Aided Language Learning Applications

    ERIC Educational Resources Information Center

    Franco, Horacio; Bratt, Harry; Rossier, Romain; Rao Gadde, Venkata; Shriberg, Elizabeth; Abrash, Victor; Precoda, Kristin

    2010-01-01

    SRI International's EduSpeak[R] system is a software development toolkit that enables developers of interactive language education software to use state-of-the-art speech recognition and pronunciation scoring technology. Automatic pronunciation scoring allows the computer to provide feedback on the overall quality of pronunciation and to point to…

  17. Hierarchical Recognition Scheme for Human Facial Expression Recognition Systems

    PubMed Central

    Siddiqi, Muhammad Hameed; Lee, Sungyoung; Lee, Young-Koo; Khan, Adil Mehmood; Truc, Phan Tran Ho

    2013-01-01

    Over the last decade, human facial expressions recognition (FER) has emerged as an important research area. Several factors make FER a challenging research problem. These include varying light conditions in training and test images; need for automatic and accurate face detection before feature extraction; and high similarity among different expressions that makes it difficult to distinguish these expressions with a high accuracy. This work implements a hierarchical linear discriminant analysis-based facial expressions recognition (HL-FER) system to tackle these problems. Unlike the previous systems, the HL-FER uses a pre-processing step to eliminate light effects, incorporates a new automatic face detection scheme, employs methods to extract both global and local features, and utilizes a HL-FER to overcome the problem of high similarity among different expressions. Unlike most of the previous works that were evaluated using a single dataset, the performance of the HL-FER is assessed using three publicly available datasets under three different experimental settings: n-fold cross validation based on subjects for each dataset separately; n-fold cross validation rule based on datasets; and, finally, a last set of experiments to assess the effectiveness of each module of the HL-FER separately. Weighted average recognition accuracy of 98.7% across three different datasets, using three classifiers, indicates the success of employing the HL-FER for human FER. PMID:24316568

  18. Neural network for intelligent query of an FBI forensic database

    NASA Astrophysics Data System (ADS)

    Uvanni, Lee A.; Rainey, Timothy G.; Balasubramanian, Uma; Brettle, Dean W.; Weingard, Fred; Sibert, Robert W.; Birnbaum, Eric

    1997-02-01

    Examiner is an automated fired cartridge case identification system utilizing a dual-use neural network pattern recognition technology, called the statistical-multiple object detection and location system (S-MODALS) developed by Booz(DOT)Allen & Hamilton, Inc. in conjunction with Rome Laboratory. S-MODALS was originally designed for automatic target recognition (ATR) of tactical and strategic military targets using multisensor fusion [electro-optical (EO), infrared (IR), and synthetic aperture radar (SAR)] sensors. Since S-MODALS is a learning system readily adaptable to problem domains other than automatic target recognition, the pattern matching problem of microscopic marks for firearms evidence was analyzed using S-MODALS. The physics; phenomenology; discrimination and search strategies; robustness requirements; error level and confidence level propagation that apply to the pattern matching problem of military targets were found to be applicable to the ballistic domain as well. The Examiner system uses S-MODALS to rank a set of queried cartridge case images from the most similar to the least similar image in reference to an investigative fired cartridge case image. The paper presents three independent tests and evaluation studies of the Examiner system utilizing the S-MODALS technology for the Federal Bureau of Investigation.

  19. Face recognition for criminal identification: An implementation of principal component analysis for face recognition

    NASA Astrophysics Data System (ADS)

    Abdullah, Nurul Azma; Saidi, Md. Jamri; Rahman, Nurul Hidayah Ab; Wen, Chuah Chai; Hamid, Isredza Rahmi A.

    2017-10-01

    In practice, identification of criminal in Malaysia is done through thumbprint identification. However, this type of identification is constrained as most of criminal nowadays getting cleverer not to leave their thumbprint on the scene. With the advent of security technology, cameras especially CCTV have been installed in many public and private areas to provide surveillance activities. The footage of the CCTV can be used to identify suspects on scene. However, because of limited software developed to automatically detect the similarity between photo in the footage and recorded photo of criminals, the law enforce thumbprint identification. In this paper, an automated facial recognition system for criminal database was proposed using known Principal Component Analysis approach. This system will be able to detect face and recognize face automatically. This will help the law enforcements to detect or recognize suspect of the case if no thumbprint present on the scene. The results show that about 80% of input photo can be matched with the template data.

  20. Dietary Assessment on a Mobile Phone Using Image Processing and Pattern Recognition Techniques: Algorithm Design and System Prototyping.

    PubMed

    Probst, Yasmine; Nguyen, Duc Thanh; Tran, Minh Khoi; Li, Wanqing

    2015-07-27

    Dietary assessment, while traditionally based on pen-and-paper, is rapidly moving towards automatic approaches. This study describes an Australian automatic food record method and its prototype for dietary assessment via the use of a mobile phone and techniques of image processing and pattern recognition. Common visual features including scale invariant feature transformation (SIFT), local binary patterns (LBP), and colour are used for describing food images. The popular bag-of-words (BoW) model is employed for recognizing the images taken by a mobile phone for dietary assessment. Technical details are provided together with discussions on the issues and future work.

  1. Facial recognition in education system

    NASA Astrophysics Data System (ADS)

    Krithika, L. B.; Venkatesh, K.; Rathore, S.; Kumar, M. Harish

    2017-11-01

    Human beings exploit emotions comprehensively for conveying messages and their resolution. Emotion detection and face recognition can provide an interface between the individuals and technologies. The most successful applications of recognition analysis are recognition of faces. Many different techniques have been used to recognize the facial expressions and emotion detection handle varying poses. In this paper, we approach an efficient method to recognize the facial expressions to track face points and distances. This can automatically identify observer face movements and face expression in image. This can capture different aspects of emotion and facial expressions.

  2. 3D automatic anatomy recognition based on iterative graph-cut-ASM

    NASA Astrophysics Data System (ADS)

    Chen, Xinjian; Udupa, Jayaram K.; Bagci, Ulas; Alavi, Abass; Torigian, Drew A.

    2010-02-01

    We call the computerized assistive process of recognizing, delineating, and quantifying organs and tissue regions in medical imaging, occurring automatically during clinical image interpretation, automatic anatomy recognition (AAR). The AAR system we are developing includes five main parts: model building, object recognition, object delineation, pathology detection, and organ system quantification. In this paper, we focus on the delineation part. For the modeling part, we employ the active shape model (ASM) strategy. For recognition and delineation, we integrate several hybrid strategies of combining purely image based methods with ASM. In this paper, an iterative Graph-Cut ASM (IGCASM) method is proposed for object delineation. An algorithm called GC-ASM was presented at this symposium last year for object delineation in 2D images which attempted to combine synergistically ASM and GC. Here, we extend this method to 3D medical image delineation. The IGCASM method effectively combines the rich statistical shape information embodied in ASM with the globally optimal delineation capability of the GC method. We propose a new GC cost function, which effectively integrates the specific image information with the ASM shape model information. The proposed methods are tested on a clinical abdominal CT data set. The preliminary results show that: (a) it is feasible to explicitly bring prior 3D statistical shape information into the GC framework; (b) the 3D IGCASM delineation method improves on ASM and GC and can provide practical operational time on clinical images.

  3. System integration of pattern recognition, adaptive aided, upper limb prostheses

    NASA Technical Reports Server (NTRS)

    Lyman, J.; Freedy, A.; Solomonow, M.

    1975-01-01

    The requirements for successful integration of a computer aided control system for multi degree of freedom artificial arms are discussed. Specifications are established for a system which shares control between a human amputee and an automatic control subsystem. The approach integrates the following subsystems: (1) myoelectric pattern recognition, (2) adaptive computer aiding; (3) local reflex control; (4) prosthetic sensory feedback; and (5) externally energized arm with the functions of prehension, wrist rotation, elbow extension and flexion and humeral rotation.

  4. Automatically Detecting Likely Edits in Clinical Notes Created Using Automatic Speech Recognition

    PubMed Central

    Lybarger, Kevin; Ostendorf, Mari; Yetisgen, Meliha

    2017-01-01

    The use of automatic speech recognition (ASR) to create clinical notes has the potential to reduce costs associated with note creation for electronic medical records, but at current system accuracy levels, post-editing by practitioners is needed to ensure note quality. Aiming to reduce the time required to edit ASR transcripts, this paper investigates novel methods for automatic detection of edit regions within the transcripts, including both putative ASR errors but also regions that are targets for cleanup or rephrasing. We create detection models using logistic regression and conditional random field models, exploring a variety of text-based features that consider the structure of clinical notes and exploit the medical context. Different medical text resources are used to improve feature extraction. Experimental results on a large corpus of practitioner-edited clinical notes show that 67% of sentence-level edits and 45% of word-level edits can be detected with a false detection rate of 15%. PMID:29854187

  5. Neural networks: Alternatives to conventional techniques for automatic docking

    NASA Technical Reports Server (NTRS)

    Vinz, Bradley L.

    1994-01-01

    Automatic docking of orbiting spacecraft is a crucial operation involving the identification of vehicle orientation as well as complex approach dynamics. The chaser spacecraft must be able to recognize the target spacecraft within a scene and achieve accurate closing maneuvers. In a video-based system, a target scene must be captured and transformed into a pattern of pixels. Successful recognition lies in the interpretation of this pattern. Due to their powerful pattern recognition capabilities, artificial neural networks offer a potential role in interpretation and automatic docking processes. Neural networks can reduce the computational time required by existing image processing and control software. In addition, neural networks are capable of recognizing and adapting to changes in their dynamic environment, enabling enhanced performance, redundancy, and fault tolerance. Most neural networks are robust to failure, capable of continued operation with a slight degradation in performance after minor failures. This paper discusses the particular automatic docking tasks neural networks can perform as viable alternatives to conventional techniques.

  6. Automatic detection and recognition of signs from natural scenes.

    PubMed

    Chen, Xilin; Yang, Jie; Zhang, Jing; Waibel, Alex

    2004-01-01

    In this paper, we present an approach to automatic detection and recognition of signs from natural scenes, and its application to a sign translation task. The proposed approach embeds multiresolution and multiscale edge detection, adaptive searching, color analysis, and affine rectification in a hierarchical framework for sign detection, with different emphases at each phase to handle the text in different sizes, orientations, color distributions and backgrounds. We use affine rectification to recover deformation of the text regions caused by an inappropriate camera view angle. The procedure can significantly improve text detection rate and optical character recognition (OCR) accuracy. Instead of using binary information for OCR, we extract features from an intensity image directly. We propose a local intensity normalization method to effectively handle lighting variations, followed by a Gabor transform to obtain local features, and finally a linear discriminant analysis (LDA) method for feature selection. We have applied the approach in developing a Chinese sign translation system, which can automatically detect and recognize Chinese signs as input from a camera, and translate the recognized text into English.

  7. Integrated approach for automatic target recognition using a network of collaborative sensors.

    PubMed

    Mahalanobis, Abhijit; Van Nevel, Alan

    2006-10-01

    We introduce what is believed to be a novel concept by which several sensors with automatic target recognition (ATR) capability collaborate to recognize objects. Such an approach would be suitable for netted systems in which the sensors and platforms can coordinate to optimize end-to-end performance. We use correlation filtering techniques to facilitate the development of the concept, although other ATR algorithms may be easily substituted. Essentially, a self-configuring geometry of netted platforms is proposed that positions the sensors optimally with respect to each other, and takes into account the interactions among the sensor, the recognition algorithms, and the classes of the objects to be recognized. We show how such a paradigm optimizes overall performance, and illustrate the collaborative ATR scheme for recognizing targets in synthetic aperture radar imagery by using viewing position as a sensor parameter.

  8. Application of automatic threshold in dynamic target recognition with low contrast

    NASA Astrophysics Data System (ADS)

    Miao, Hua; Guo, Xiaoming; Chen, Yu

    2014-11-01

    Hybrid photoelectric joint transform correlator can realize automatic real-time recognition with high precision through the combination of optical devices and electronic devices. When recognizing targets with low contrast using photoelectric joint transform correlator, because of the difference of attitude, brightness and grayscale between target and template, only four to five frames of dynamic targets can be recognized without any processing. CCD camera is used to capture the dynamic target images and the capturing speed of CCD is 25 frames per second. Automatic threshold has many advantages like fast processing speed, effectively shielding noise interference, enhancing diffraction energy of useful information and better reserving outline of target and template, so this method plays a very important role in target recognition with optical correlation method. However, the automatic obtained threshold by program can not achieve the best recognition results for dynamic targets. The reason is that outline information is broken to some extent. Optimal threshold is obtained by manual intervention in most cases. Aiming at the characteristics of dynamic targets, the processing program of improved automatic threshold is finished by multiplying OTSU threshold of target and template by scale coefficient of the processed image, and combining with mathematical morphology. The optimal threshold can be achieved automatically by improved automatic threshold processing for dynamic low contrast target images. The recognition rate of dynamic targets is improved through decreased background noise effect and increased correlation information. A series of dynamic tank images with the speed about 70 km/h are adapted as target images. The 1st frame of this series of tanks can correlate only with the 3rd frame without any processing. Through OTSU threshold, the 80th frame can be recognized. By automatic threshold processing of the joint images, this number can be increased to 89 frames. Experimental results show that the improved automatic threshold processing has special application value for the recognition of dynamic target with low contrast.

  9. Advanced miniature processing handware for ATR applications

    NASA Technical Reports Server (NTRS)

    Chao, Tien-Hsin (Inventor); Daud, Taher (Inventor); Thakoor, Anikumar (Inventor)

    2003-01-01

    A Hybrid Optoelectronic Neural Object Recognition System (HONORS), is disclosed, comprising two major building blocks: (1) an advanced grayscale optical correlator (OC) and (2) a massively parallel three-dimensional neural-processor. The optical correlator, with its inherent advantages in parallel processing and shift invariance, is used for target of interest (TOI) detection and segmentation. The three-dimensional neural-processor, with its robust neural learning capability, is used for target classification and identification. The hybrid optoelectronic neural object recognition system, with its powerful combination of optical processing and neural networks, enables real-time, large frame, automatic target recognition (ATR).

  10. Giro form reading machine

    NASA Astrophysics Data System (ADS)

    Minh Ha, Thien; Niggeler, Dieter; Bunke, Horst; Clarinval, Jose

    1995-08-01

    Although giro forms are used by many people in daily life for money remittance in Switzerland, the processing of these forms at banks and post offices is only partly automated. We describe an ongoing project for building an automatic system that is able to recognize various items printed or written on a giro form. The system comprises three main components, namely, an automatic form feeder, a camera system, and a computer. These components are connected in such a way that the system is able to process a bunch of forms without any human interactions. We present two real applications of our system in the field of payment services, which require the reading of both machine printed and handwritten information that may appear on a giro form. One particular feature of giro forms is their flexible layout, i.e., information items are located differently from one form to another, thus requiring an additional analysis step to localize them before recognition. A commercial optical character recognition software package is used for recognition of machine-printed information, whereas handwritten information is read by our own algorithms, the details of which are presented. The system is implemented by using a client/server architecture providing a high degree of flexibility to change. Preliminary results are reported supporting our claim that the system is usable in practice.

  11. Unconstrained face detection and recognition based on RGB-D camera for the visually impaired

    NASA Astrophysics Data System (ADS)

    Zhao, Xiangdong; Wang, Kaiwei; Yang, Kailun; Hu, Weijian

    2017-02-01

    It is highly important for visually impaired people (VIP) to be aware of human beings around themselves, so correctly recognizing people in VIP assisting apparatus provide great convenience. However, in classical face recognition technology, faces used in training and prediction procedures are usually frontal, and the procedures of acquiring face images require subjects to get close to the camera so that frontal face and illumination guaranteed. Meanwhile, labels of faces are defined manually rather than automatically. Most of the time, labels belonging to different classes need to be input one by one. It prevents assisting application for VIP with these constraints in practice. In this article, a face recognition system under unconstrained environment is proposed. Specifically, it doesn't require frontal pose or uniform illumination as required by previous algorithms. The attributes of this work lie in three aspects. First, a real time frontal-face synthesizing enhancement is implemented, and frontal faces help to increase recognition rate, which is proved with experiment results. Secondly, RGB-D camera plays a significant role in our system, from which both color and depth information are utilized to achieve real time face tracking which not only raises the detection rate but also gives an access to label faces automatically. Finally, we propose to use neural networks to train a face recognition system, and Principal Component Analysis (PCA) is applied to pre-refine the input data. This system is expected to provide convenient help for VIP to get familiar with others, and make an access for them to recognize people when the system is trained enough.

  12. Automatic anatomy recognition via multiobject oriented active shape models.

    PubMed

    Chen, Xinjian; Udupa, Jayaram K; Alavi, Abass; Torigian, Drew A

    2010-12-01

    This paper studies the feasibility of developing an automatic anatomy recognition (AAR) system in clinical radiology and demonstrates its operation on clinical 2D images. The anatomy recognition method described here consists of two main components: (a) multiobject generalization of OASM and (b) object recognition strategies. The OASM algorithm is generalized to multiple objects by including a model for each object and assigning a cost structure specific to each object in the spirit of live wire. The delineation of multiobject boundaries is done in MOASM via a three level dynamic programming algorithm, wherein the first level is at pixel level which aims to find optimal oriented boundary segments between successive landmarks, the second level is at landmark level which aims to find optimal location for the landmarks, and the third level is at the object level which aims to find optimal arrangement of object boundaries over all objects. The object recognition strategy attempts to find that pose vector (consisting of translation, rotation, and scale component) for the multiobject model that yields the smallest total boundary cost for all objects. The delineation and recognition accuracies were evaluated separately utilizing routine clinical chest CT, abdominal CT, and foot MRI data sets. The delineation accuracy was evaluated in terms of true and false positive volume fractions (TPVF and FPVF). The recognition accuracy was assessed (1) in terms of the size of the space of the pose vectors for the model assembly that yielded high delineation accuracy, (2) as a function of the number of objects and objects' distribution and size in the model, (3) in terms of the interdependence between delineation and recognition, and (4) in terms of the closeness of the optimum recognition result to the global optimum. When multiple objects are included in the model, the delineation accuracy in terms of TPVF can be improved to 97%-98% with a low FPVF of 0.1%-0.2%. Typically, a recognition accuracy of > or = 90% yielded a TPVF > or = 95% and FPVF < or = 0.5%. Over the three data sets and over all tested objects, in 97% of the cases, the optimal solutions found by the proposed method constituted the true global optimum. The experimental results showed the feasibility and efficacy of the proposed automatic anatomy recognition system. Increasing the number of objects in the model can significantly improve both recognition and delineation accuracy. More spread out arrangement of objects in the model can lead to improved recognition and delineation accuracy. Including larger objects in the model also improved recognition and delineation. The proposed method almost always finds globally optimum solutions.

  13. [Creating language model of the forensic medicine domain for developing a autopsy recording system by automatic speech recognition].

    PubMed

    Niijima, H; Ito, N; Ogino, S; Takatori, T; Iwase, H; Kobayashi, M

    2000-11-01

    For the purpose of practical use of speech recognition technology for recording of forensic autopsy, a language model of the speech recording system, specialized for the forensic autopsy, was developed. The language model for the forensic autopsy by applying 3-gram model was created, and an acoustic model for Japanese speech recognition by Hidden Markov Model in addition to the above were utilized to customize the speech recognition engine for forensic autopsy. A forensic vocabulary set of over 10,000 words was compiled and some 300,000 sentence patterns were made to create the forensic language model, then properly mixing with a general language model to attain high exactitude. When tried by dictating autopsy findings, this speech recognition system was proved to be about 95% of recognition rate that seems to have reached to the practical usability in view of speech recognition software, though there remains rooms for improving its hardware and application-layer software.

  14. Tree-structured sensor fusion architecture for distributed sensor networks

    NASA Astrophysics Data System (ADS)

    Iyengar, S. Sitharama; Kashyap, Rangasami L.; Madan, Rabinder N.; Thomas, Daryl D.

    1990-10-01

    An assessment of numerous activities in the field of multisensor target recognition reveals several trends and conditions which are cause for concern. .These concerns are analyzed in terms of their potential impact on the ultimate employment of automatic target recognition in military systems. Suggestions for additional investigation and guidance for current activities are presented with respect to some of the identified concerns.

  15. Development of an automated ultrasonic testing system

    NASA Astrophysics Data System (ADS)

    Shuxiang, Jiao; Wong, Brian Stephen

    2005-04-01

    Non-Destructive Testing is necessary in areas where defects in structures emerge over time due to wear and tear and structural integrity is necessary to maintain its usability. However, manual testing results in many limitations: high training cost, long training procedure, and worse, the inconsistent test results. A prime objective of this project is to develop an automatic Non-Destructive testing system for a shaft of the wheel axle of a railway carriage. Various methods, such as the neural network, pattern recognition methods and knowledge-based system are used for the artificial intelligence problem. In this paper, a statistical pattern recognition approach, Classification Tree is applied. Before feature selection, a thorough study on the ultrasonic signals produced was carried out. Based on the analysis of the ultrasonic signals, three signal processing methods were developed to enhance the ultrasonic signals: Cross-Correlation, Zero-Phase filter and Averaging. The target of this step is to reduce the noise and make the signal character more distinguishable. Four features: 1. The Auto Regressive Model Coefficients. 2. Standard Deviation. 3. Pearson Correlation 4. Dispersion Uniformity Degree are selected. And then a Classification Tree is created and applied to recognize the peak positions and amplitudes. Searching local maximum is carried out before feature computing. This procedure reduces much computation time in the real-time testing. Based on this algorithm, a software package called SOFRA was developed to recognize the peaks, calibrate automatically and test a simulated shaft automatically. The automatic calibration procedure and the automatic shaft testing procedure are developed.

  16. [Design and implementation of mobile terminal data acquisition for Chinese materia medica resources survey].

    PubMed

    Qi, Yuan-Hua; Wang, Hui; Zhang, Xiao-Bo; Jin, Yan; Ge, Xiao-Guang; Jing, Zhi-Xian; Wang, Ling; Zhao, Yu-Ping; Guo, Lan-Ping; Huang, Lu-Qi

    2017-11-01

    In this paper, a data acquisition system based on mobile terminal combining GPS, offset correction, automatic speech recognition and database networking technology was designed implemented with the function of locating the latitude and elevation information fast, taking conveniently various types of Chinese herbal plant photos, photos, samples habitat photos and so on. The mobile system realizes automatic association with Chinese medicine source information, through the voice recognition function it records the information of plant characteristics and environmental characteristics, and record relevant plant specimen information. The data processing platform based on Chinese medicine resources survey data reporting client can effectively assists in indoor data processing, derives the mobile terminal data to computer terminal. The established data acquisition system provides strong technical support for the fourth national survey of the Chinese materia medica resources (CMMR). Copyright© by the Chinese Pharmaceutical Association.

  17. The Development of the Speaker Independent ARM Continuous Speech Recognition System

    DTIC Science & Technology

    1992-01-01

    spokeTi airborne reconnaissance reports u-ing a speech recognition system based on phoneme-level hidden Markov models (HMMs). Previous versions of the ARM...will involve automatic selection from multiple model sets, corresponding to different speaker types, and that the most rudimen- tary partition of a...The vocabulary size for the ARM task is 497 words. These words are related to the phoneme-level symbols corresponding to the models in the model set

  18. Real-time speech gisting for ATC applications

    NASA Astrophysics Data System (ADS)

    Dunkelberger, Kirk A.

    1995-06-01

    Command and control within the ATC environment remains primarily voice-based. Hence, automatic real time, speaker independent, continuous speech recognition (CSR) has many obvious applications and implied benefits to the ATC community: automated target tagging, aircraft compliance monitoring, controller training, automatic alarm disabling, display management, and many others. However, while current state-of-the-art CSR systems provide upwards of 98% word accuracy in laboratory environments, recent low-intrusion experiments in the ATCT environments demonstrated less than 70% word accuracy in spite of significant investments in recognizer tuning. Acoustic channel irregularities and controller/pilot grammar verities impact current CSR algorithms at their weakest points. It will be shown herein, however, that real time context- and environment-sensitive gisting can provide key command phrase recognition rates of greater than 95% using the same low-intrusion approach. The combination of real time inexact syntactic pattern recognition techniques and a tight integration of CSR, gisting, and ATC database accessor system components is the key to these high phase recognition rates. A system concept for real time gisting in the ATC context is presented herein. After establishing an application context, discussion presents a minimal CSR technology context then focuses on the gisting mechanism, desirable interfaces into the ATCT database environment, and data and control flow within the prototype system. Results of recent tests for a subset of the functionality are presented together with suggestions for further research.

  19. Monitoring caustic injuries from emergency department databases using automatic keyword recognition software.

    PubMed

    Vignally, P; Fondi, G; Taggi, F; Pitidis, A

    2011-03-31

    In Italy the European Union Injury Database reports the involvement of chemical products in 0.9% of home and leisure accidents. The Emergency Department registry on domestic accidents in Italy and the Poison Control Centres record that 90% of cases of exposure to toxic substances occur in the home. It is not rare for the effects of chemical agents to be observed in hospitals, with a high potential risk of damage - the rate of this cause of hospital admission is double the domestic injury average. The aim of this study was to monitor the effects of injuries caused by caustic agents in Italy using automatic free-text recognition in Emergency Department medical databases. We created a Stata software program to automatically identify caustic or corrosive injury cases using an agent-specific list of keywords. We focused attention on the procedure's sensitivity and specificity. Ten hospitals in six regions of Italy participated in the study. The program identified 112 cases of injury by caustic or corrosive agents. Checking the cases by quality controls (based on manual reading of ED reports), we assessed 99 cases as true positive, i.e. 88.4% of the patients were automatically recognized by the software as being affected by caustic substances (99% CI: 80.6%- 96.2%), that is to say 0.59% (99% CI: 0.45%-0.76%) of the whole sample of home injuries, a value almost three times as high as that expected (p < 0.0001) from European codified information. False positives were 11.6% of the recognized cases (99% CI: 5.1%- 21.5%). Our automatic procedure for caustic agent identification proved to have excellent product recognition capacity with an acceptable level of excess sensitivity. Contrary to our a priori hypothesis, the automatic recognition system provided a level of identification of agents possessing caustic effects that was significantly much greater than was predictable on the basis of the values from current codifications reported in the European Database.

  20. Automatic recognition of falls in gait-slip training: Harness load cell based criteria.

    PubMed

    Yang, Feng; Pai, Yi-Chung

    2011-08-11

    Over-head-harness systems, equipped with load cell sensors, are essential to the participants' safety and to the outcome assessment in perturbation training. The purpose of this study was to first develop an automatic outcome recognition criterion among young adults for gait-slip training and then verify such criterion among older adults. Each of 39 young and 71 older subjects, all protected by safety harness, experienced 8 unannounced, repeated slips, while walking on a 7m walkway. Each trial was monitored with a motion capture system, bilateral ground reaction force (GRF), harness force, and video recording. The fall trials were first unambiguously indentified with careful visual inspection of all video records. The recoveries without balance loss (in which subjects' trailing foot landed anteriorly to the slipping foot) were also first fully recognized from motion and GRF analyses. These analyses then set the gold standard for the outcome recognition with load cell measurements. Logistic regression analyses based on young subjects' data revealed that the peak load cell force was the best predictor of falls (with 100% accuracy) at the threshold of 30% body weight. On the other hand, the peak moving average force of load cell across 1s period, was the best predictor (with 100% accuracy) separating recoveries with backward balance loss (in which the recovery step landed posterior to slipping foot) from harness assistance at the threshold of 4.5% body weight. These threshold values were fully verified using the data from older adults (100% accuracy in recognizing falls). Because of the increasing popularity in the perturbation training coupling with the protective over-head-harness system, this new criterion could have far reaching implications in automatic outcome recognition during the movement therapy. Copyright © 2011 Elsevier Ltd. All rights reserved.

  1. AUTOMATIC RECOGNITION OF FALLS IN GAIT-SLIP: A HARNESS LOAD CELL BASED CRITERION

    PubMed Central

    Yang, Feng; Pai, Yi-Chung

    2012-01-01

    Over-head-harness systems, equipped with load cell sensors, are essential to the participants’ safety and to the outcome assessment in perturbation training. The purpose of this study was to first develop an automatic outcome recognition criterion among young adults for gait-slip training and then verify such criterion among older adults. Each of 39 young and 71 older subjects, all protected by safety harness, experienced 8 unannounced, repeated slips, while walking on a 7-m walkway. Each trial was monitored with a motion capture system, bilateral ground reaction force (GRF), harness force and video recording. The fall trials were first unambiguously indentified with careful visual inspection of all video records. The recoveries without balance loss (in which subjects’ trailing foot landed anteriorly to the slipping foot) were also first fully recognized from motion and GRF analyses. These analyses then set the gold standard for the outcome recognition with load cell measurements. Logistic regression analyses based on young subjects’ data revealed that peak load cell force was the best predictor of falls (with 100% accuracy) at the threshold of 30% body weight. On the other hand, the peak moving average force of load cell across 1-s period, was the best predictor (with 100% accuracy) separating recoveries with backward balance loss (in which the recovery step landed posterior to slipping foot) from harness assistance at the threshold of 4.5% body weight. These threshold values were fully verified using the data from older adults (100% accuracy in recognizing falls). Because of the increasing popularity in the perturbation training coupling with the protective over-head-harness system, this new criterion could have far reaching implications in automatic outcome recognition during the movement therapy. PMID:21696744

  2. Female voice communications in high level aircraft cockpit noises--part II: vocoder and automatic speech recognition systems.

    PubMed

    Nixon, C; Anderson, T; Morris, L; McCavitt, A; McKinley, R; Yeager, D; McDaniel, M

    1998-11-01

    The intelligibility of female and male speech is equivalent under most ordinary living conditions. However, due to small differences between their acoustic speech signals, called speech spectra, one can be more or less intelligible than the other in certain situations such as high levels of noise. Anecdotal information, supported by some empirical observations, suggests that some of the high intensity noise spectra of military aircraft cockpits may degrade the intelligibility of female speech more than that of male speech. In an applied research study, the intelligibility of female and male speech was measured in several high level aircraft cockpit noise conditions experienced in military aviation. In Part I, (Nixon CW, et al. Aviat Space Environ Med 1998; 69:675-83) female speech intelligibility measured in the spectra and levels of aircraft cockpit noises and with noise-canceling microphones was lower than that of the male speech in all conditions. However, the differences were small and only those at some of the highest noise levels were significant. Although speech intelligibility of both genders was acceptable during normal cruise noises, improvements are required in most of the highest levels of noise created during maximum aircraft operating conditions. These results are discussed in a Part I technical report. This Part II report examines the intelligibility in the same aircraft cockpit noises of vocoded female and male speech and the accuracy with which female and male speech in some of the cockpit noises were understood by automatic speech recognition systems. The intelligibility of vocoded female speech was generally the same as that of vocoded male speech. No significant differences were measured between the recognition accuracy of male and female speech by the automatic speech recognition systems. The intelligibility of female and male speech was equivalent for these conditions.

  3. Intelligent form removal with character stroke preservation

    NASA Astrophysics Data System (ADS)

    Garris, Michael D.

    1996-03-01

    A new technique for intelligent form removal has been developed along with a new method for evaluating its impact on optical character recognition (OCR). All the dominant lines in the image are automatically detected using the Hough line transform and intelligently erased while simultaneously preserving overlapping character strokes by computing line width statistics and keying off of certain visual cues. This new method of form removal operates on loosely defined zones with no image deskewing. Any field in which the writer is provided a horizontal line to enter a response can be processed by this method. Several examples of processed fields are provided, including a comparison of results between the new method and a commercially available forms removal package. Even if this new form removal method did not improve character recognition accuracy, it is still a significant improvement to the technology because the requirement of a priori knowledge of the form's geometric details has been greatly reduced. This relaxes the recognition system's dependence on rigid form design, printing, and reproduction by automatically detecting and removing some of the physical structures (lines) on the form. Using the National Institute of Standards and Technology (NIST) public domain form-based handprint recognition system, the technique was tested on a large number of fields containing randomly ordered handprinted lowercase alphabets, as these letters (especially those with descenders) frequently touch and extend through the line along which they are written. Preserving character strokes improves overall lowercase recognition performance by 3%, which is a net improvement, but a single performance number like this doesn't communicate how the recognition process was really influenced. There is expected to be trade- offs with the introduction of any new technique into a complex recognition system. To understand both the improvements and the trade-offs, a new analysis was designed to compare the statistical distributions of individual confusion pairs between two systems. As OCR technology continues to improve, sophisticated analyses like this are necessary to reduce the errors remaining in complex recognition problems.

  4. Accurate, fast, and secure biometric fingerprint recognition system utilizing sensor fusion of fingerprint patterns

    NASA Astrophysics Data System (ADS)

    El-Saba, Aed; Alsharif, Salim; Jagapathi, Rajendarreddy

    2011-04-01

    Fingerprint recognition is one of the first techniques used for automatically identifying people and today it is still one of the most popular and effective biometric techniques. With this increase in fingerprint biometric uses, issues related to accuracy, security and processing time are major challenges facing the fingerprint recognition systems. Previous work has shown that polarization enhancementencoding of fingerprint patterns increase the accuracy and security of fingerprint systems without burdening the processing time. This is mainly due to the fact that polarization enhancementencoding is inherently a hardware process and does not have detrimental time delay effect on the overall process. Unpolarized images, however, posses a high visual contrast and when fused (without digital enhancement) properly with polarized ones, is shown to increase the recognition accuracy and security of the biometric system without any significant processing time delay.

  5. Method for automatic detection of wheezing in lung sounds.

    PubMed

    Riella, R J; Nohama, P; Maia, J M

    2009-07-01

    The present report describes the development of a technique for automatic wheezing recognition in digitally recorded lung sounds. This method is based on the extraction and processing of spectral information from the respiratory cycle and the use of these data for user feedback and automatic recognition. The respiratory cycle is first pre-processed, in order to normalize its spectral information, and its spectrogram is then computed. After this procedure, the spectrogram image is processed by a two-dimensional convolution filter and a half-threshold in order to increase the contrast and isolate its highest amplitude components, respectively. Thus, in order to generate more compressed data to automatic recognition, the spectral projection from the processed spectrogram is computed and stored as an array. The higher magnitude values of the array and its respective spectral values are then located and used as inputs to a multi-layer perceptron artificial neural network, which results an automatic indication about the presence of wheezes. For validation of the methodology, lung sounds recorded from three different repositories were used. The results show that the proposed technique achieves 84.82% accuracy in the detection of wheezing for an isolated respiratory cycle and 92.86% accuracy for the detection of wheezes when detection is carried out using groups of respiratory cycles obtained from the same person. Also, the system presents the original recorded sound and the post-processed spectrogram image for the user to draw his own conclusions from the data.

  6. A multilingual gold-standard corpus for biomedical concept recognition: the Mantra GSC.

    PubMed

    Kors, Jan A; Clematide, Simon; Akhondi, Saber A; van Mulligen, Erik M; Rebholz-Schuhmann, Dietrich

    2015-09-01

    To create a multilingual gold-standard corpus for biomedical concept recognition. We selected text units from different parallel corpora (Medline abstract titles, drug labels, biomedical patent claims) in English, French, German, Spanish, and Dutch. Three annotators per language independently annotated the biomedical concepts, based on a subset of the Unified Medical Language System and covering a wide range of semantic groups. To reduce the annotation workload, automatically generated preannotations were provided. Individual annotations were automatically harmonized and then adjudicated, and cross-language consistency checks were carried out to arrive at the final annotations. The number of final annotations was 5530. Inter-annotator agreement scores indicate good agreement (median F-score 0.79), and are similar to those between individual annotators and the gold standard. The automatically generated harmonized annotation set for each language performed equally well as the best annotator for that language. The use of automatic preannotations, harmonized annotations, and parallel corpora helped to keep the manual annotation efforts manageable. The inter-annotator agreement scores provide a reference standard for gauging the performance of automatic annotation techniques. To our knowledge, this is the first gold-standard corpus for biomedical concept recognition in languages other than English. Other distinguishing features are the wide variety of semantic groups that are being covered, and the diversity of text genres that were annotated. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.

  7. Model and algorithmic framework for detection and correction of cognitive errors.

    PubMed

    Feki, Mohamed Ali; Biswas, Jit; Tolstikov, Andrei

    2009-01-01

    This paper outlines an approach that we are taking for elder-care applications in the smart home, involving cognitive errors and their compensation. Our approach involves high level modeling of daily activities of the elderly by breaking down these activities into smaller units, which can then be automatically recognized at a low level by collections of sensors placed in the homes of the elderly. This separation allows us to employ plan recognition algorithms and systems at a high level, while developing stand-alone activity recognition algorithms and systems at a low level. It also allows the mixing and matching of multi-modality sensors of various kinds that go to support the same high level requirement. Currently our plan recognition algorithms are still at a conceptual stage, whereas a number of low level activity recognition algorithms and systems have been developed. Herein we present our model for plan recognition, providing a brief survey of the background literature. We also present some concrete results that we have achieved for activity recognition, emphasizing how these results are incorporated into the overall plan recognition system.

  8. Development of Portable Automatic Number Plate Recognition System on Android Mobile Phone

    NASA Astrophysics Data System (ADS)

    Mutholib, Abdul; Gunawan, Teddy S.; Chebil, Jalel; Kartiwi, Mira

    2013-12-01

    The Automatic Number Plate Recognition (ANPR) System has performed as the main role in various access control and security, such as: tracking of stolen vehicles, traffic violations (speed trap) and parking management system. In this paper, the portable ANPR implemented on android mobile phone is presented. The main challenges in mobile application are including higher coding efficiency, reduced computational complexity, and improved flexibility. Significance efforts are being explored to find suitable and adaptive algorithm for implementation of ANPR on mobile phone. ANPR system for mobile phone need to be optimize due to its limited CPU and memory resources, its ability for geo-tagging image captured using GPS coordinates and its ability to access online database to store the vehicle's information. In this paper, the design of portable ANPR on android mobile phone will be described as follows. First, the graphical user interface (GUI) for capturing image using built-in camera was developed to acquire vehicle plate number in Malaysia. Second, the preprocessing of raw image was done using contrast enhancement. Next, character segmentation using fixed pitch and an optical character recognition (OCR) using neural network were utilized to extract texts and numbers. Both character segmentation and OCR were using Tesseract library from Google Inc. The proposed portable ANPR algorithm was implemented and simulated using Android SDK on a computer. Based on the experimental results, the proposed system can effectively recognize the license plate number at 90.86%. The required processing time to recognize a license plate is only 2 seconds on average. The result is consider good in comparison with the results obtained from previous system that was processed in a desktop PC with the range of result from 91.59% to 98% recognition rate and 0.284 second to 1.5 seconds recognition time.

  9. Dietary Assessment on a Mobile Phone Using Image Processing and Pattern Recognition Techniques: Algorithm Design and System Prototyping

    PubMed Central

    Probst, Yasmine; Nguyen, Duc Thanh; Tran, Minh Khoi; Li, Wanqing

    2015-01-01

    Dietary assessment, while traditionally based on pen-and-paper, is rapidly moving towards automatic approaches. This study describes an Australian automatic food record method and its prototype for dietary assessment via the use of a mobile phone and techniques of image processing and pattern recognition. Common visual features including scale invariant feature transformation (SIFT), local binary patterns (LBP), and colour are used for describing food images. The popular bag-of-words (BoW) model is employed for recognizing the images taken by a mobile phone for dietary assessment. Technical details are provided together with discussions on the issues and future work. PMID:26225994

  10. Anatomical entity mention recognition at literature scale

    PubMed Central

    Pyysalo, Sampo; Ananiadou, Sophia

    2014-01-01

    Motivation: Anatomical entities ranging from subcellular structures to organ systems are central to biomedical science, and mentions of these entities are essential to understanding the scientific literature. Despite extensive efforts to automatically analyze various aspects of biomedical text, there have been only few studies focusing on anatomical entities, and no dedicated methods for learning to automatically recognize anatomical entity mentions in free-form text have been introduced. Results: We present AnatomyTagger, a machine learning-based system for anatomical entity mention recognition. The system incorporates a broad array of approaches proposed to benefit tagging, including the use of Unified Medical Language System (UMLS)- and Open Biomedical Ontologies (OBO)-based lexical resources, word representations induced from unlabeled text, statistical truecasing and non-local features. We train and evaluate the system on a newly introduced corpus that substantially extends on previously available resources, and apply the resulting tagger to automatically annotate the entire open access scientific domain literature. The resulting analyses have been applied to extend services provided by the Europe PubMed Central literature database. Availability and implementation: All tools and resources introduced in this work are available from http://nactem.ac.uk/anatomytagger. Contact: sophia.ananiadou@manchester.ac.uk Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:24162468

  11. Intelligent Automatic Right-Left Sign Lamp Based on Brain Signal Recognition System

    NASA Astrophysics Data System (ADS)

    Winda, A.; Sofyan; Sthevany; Vincent, R. S.

    2017-12-01

    Comfort as a part of the human factor, plays important roles in nowadays advanced automotive technology. Many of the current technologies go in the direction of automotive driver assistance features. However, many of the driver assistance features still require physical movement by human to enable the features. In this work, the proposed method is used in order to make certain feature to be functioning without any physical movement, instead human just need to think about it in their mind. In this work, brain signal is recorded and processed in order to be used as input to the recognition system. Right-Left sign lamp based on the brain signal recognition system can potentially replace the button or switch of the specific device in order to make the lamp work. The system then will decide whether the signal is ‘Right’ or ‘Left’. The decision of the Right-Left side of brain signal recognition will be sent to a processing board in order to activate the automotive relay, which will be used to activate the sign lamp. Furthermore, the intelligent system approach is used to develop authorized model based on the brain signal. Particularly Support Vector Machines (SVMs)-based classification system is used in the proposed system to recognize the Left-Right of the brain signal. Experimental results confirm the effectiveness of the proposed intelligent Automatic brain signal-based Right-Left sign lamp access control system. The signal is processed by Linear Prediction Coefficient (LPC) and Support Vector Machines (SVMs), and the resulting experiment shows the training and testing accuracy of 100% and 80%, respectively.

  12. Blind equalization and automatic modulation classification based on subspace for subcarrier MPSK optical communications

    NASA Astrophysics Data System (ADS)

    Chen, Dan; Guo, Lin-yuan; Wang, Chen-hao; Ke, Xi-zheng

    2017-07-01

    Equalization can compensate channel distortion caused by channel multipath effects, and effectively improve convergent of modulation constellation diagram in optical wireless system. In this paper, the subspace blind equalization algorithm is used to preprocess M-ary phase shift keying (MPSK) subcarrier modulation signal in receiver. Mountain clustering is adopted to get the clustering centers of MPSK modulation constellation diagram, and the modulation order is automatically identified through the k-nearest neighbor (KNN) classifier. The experiment has been done under four different weather conditions. Experimental results show that the convergent of constellation diagram is improved effectively after using the subspace blind equalization algorithm, which means that the accuracy of modulation recognition is increased. The correct recognition rate of 16PSK can be up to 85% in any kind of weather condition which is mentioned in paper. Meanwhile, the correct recognition rate is the highest in cloudy and the lowest in heavy rain condition.

  13. Modelling Errors in Automatic Speech Recognition for Dysarthric Speakers

    NASA Astrophysics Data System (ADS)

    Caballero Morales, Santiago Omar; Cox, Stephen J.

    2009-12-01

    Dysarthria is a motor speech disorder characterized by weakness, paralysis, or poor coordination of the muscles responsible for speech. Although automatic speech recognition (ASR) systems have been developed for disordered speech, factors such as low intelligibility and limited phonemic repertoire decrease speech recognition accuracy, making conventional speaker adaptation algorithms perform poorly on dysarthric speakers. In this work, rather than adapting the acoustic models, we model the errors made by the speaker and attempt to correct them. For this task, two techniques have been developed: (1) a set of "metamodels" that incorporate a model of the speaker's phonetic confusion matrix into the ASR process; (2) a cascade of weighted finite-state transducers at the confusion matrix, word, and language levels. Both techniques attempt to correct the errors made at the phonetic level and make use of a language model to find the best estimate of the correct word sequence. Our experiments show that both techniques outperform standard adaptation techniques.

  14. Speech Processing and Recognition (SPaRe)

    DTIC Science & Technology

    2011-01-01

    results in the areas of automatic speech recognition (ASR), speech processing, machine translation (MT), natural language processing ( NLP ), and...Processing ( NLP ), Information Retrieval (IR) 16. SECURITY CLASSIFICATION OF: UNCLASSIFED 17. LIMITATION OF ABSTRACT 18. NUMBER OF PAGES 19a. NAME...Figure 9, the IOC was only expected to provide document submission and search; automatic speech recognition (ASR) for English, Spanish, Arabic , and

  15. Foreign Language Analysis and Recognition (FLARE) Progress

    DTIC Science & Technology

    2015-02-01

    Copies may be obtained from the Defense Technical Information Center (DTIC) (http://www.dtic.mil). AFRL- RH -WP-TR-2015-0007 HAS BEEN REVIEWED AND IS... retrieval (IR). 15. SUBJECT TERMS Automatic speech recognition (ASR), information retrieval (IR). 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF...to the Haystack Multilingual Multimedia Information Extraction and Retrieval (MMIER) system that was initially developed under a prior work unit

  16. Voice technology and BBN

    NASA Technical Reports Server (NTRS)

    Wolf, Jared J.

    1977-01-01

    The following research was discussed: (1) speech signal processing; (2) automatic speech recognition; (3) continuous speech understanding; (4) speaker recognition; (5) speech compression; (6) subjective and objective evaluation of speech communication system; (7) measurement of the intelligibility and quality of speech when degraded by noise or other masking stimuli; (8) speech synthesis; (9) instructional aids for second-language learning and for training of the deaf; and (10) investigation of speech correlates of psychological stress. Experimental psychology, control systems, and human factors engineering, which are often relevant to the proper design and operation of speech systems are described.

  17. A VidEo-Based Intelligent Recognition and Decision System for the Phacoemulsification Cataract Surgery.

    PubMed

    Tian, Shu; Yin, Xu-Cheng; Wang, Zhi-Bin; Zhou, Fang; Hao, Hong-Wei

    2015-01-01

    The phacoemulsification surgery is one of the most advanced surgeries to treat cataract. However, the conventional surgeries are always with low automatic level of operation and over reliance on the ability of surgeons. Alternatively, one imaginative scene is to use video processing and pattern recognition technologies to automatically detect the cataract grade and intelligently control the release of the ultrasonic energy while operating. Unlike cataract grading in the diagnosis system with static images, complicated background, unexpected noise, and varied information are always introduced in dynamic videos of the surgery. Here we develop a Video-Based Intelligent Recognitionand Decision (VeBIRD) system, which breaks new ground by providing a generic framework for automatically tracking the operation process and classifying the cataract grade in microscope videos of the phacoemulsification cataract surgery. VeBIRD comprises a robust eye (iris) detector with randomized Hough transform to precisely locate the eye in the noise background, an effective probe tracker with Tracking-Learning-Detection to thereafter track the operation probe in the dynamic process, and an intelligent decider with discriminative learning to finally recognize the cataract grade in the complicated video. Experiments with a variety of real microscope videos of phacoemulsification verify VeBIRD's effectiveness.

  18. A VidEo-Based Intelligent Recognition and Decision System for the Phacoemulsification Cataract Surgery

    PubMed Central

    Yin, Xu-Cheng; Wang, Zhi-Bin; Zhou, Fang; Hao, Hong-Wei

    2015-01-01

    The phacoemulsification surgery is one of the most advanced surgeries to treat cataract. However, the conventional surgeries are always with low automatic level of operation and over reliance on the ability of surgeons. Alternatively, one imaginative scene is to use video processing and pattern recognition technologies to automatically detect the cataract grade and intelligently control the release of the ultrasonic energy while operating. Unlike cataract grading in the diagnosis system with static images, complicated background, unexpected noise, and varied information are always introduced in dynamic videos of the surgery. Here we develop a Video-Based Intelligent Recognitionand Decision (VeBIRD) system, which breaks new ground by providing a generic framework for automatically tracking the operation process and classifying the cataract grade in microscope videos of the phacoemulsification cataract surgery. VeBIRD comprises a robust eye (iris) detector with randomized Hough transform to precisely locate the eye in the noise background, an effective probe tracker with Tracking-Learning-Detection to thereafter track the operation probe in the dynamic process, and an intelligent decider with discriminative learning to finally recognize the cataract grade in the complicated video. Experiments with a variety of real microscope videos of phacoemulsification verify VeBIRD's effectiveness. PMID:26693249

  19. Target recognition based on convolutional neural network

    NASA Astrophysics Data System (ADS)

    Wang, Liqiang; Wang, Xin; Xi, Fubiao; Dong, Jian

    2017-11-01

    One of the important part of object target recognition is the feature extraction, which can be classified into feature extraction and automatic feature extraction. The traditional neural network is one of the automatic feature extraction methods, while it causes high possibility of over-fitting due to the global connection. The deep learning algorithm used in this paper is a hierarchical automatic feature extraction method, trained with the layer-by-layer convolutional neural network (CNN), which can extract the features from lower layers to higher layers. The features are more discriminative and it is beneficial to the object target recognition.

  20. A novel thermal face recognition approach using face pattern words

    NASA Astrophysics Data System (ADS)

    Zheng, Yufeng

    2010-04-01

    A reliable thermal face recognition system can enhance the national security applications such as prevention against terrorism, surveillance, monitoring and tracking, especially at nighttime. The system can be applied at airports, customs or high-alert facilities (e.g., nuclear power plant) for 24 hours a day. In this paper, we propose a novel face recognition approach utilizing thermal (long wave infrared) face images that can automatically identify a subject at both daytime and nighttime. With a properly acquired thermal image (as a query image) in monitoring zone, the following processes will be employed: normalization and denoising, face detection, face alignment, face masking, Gabor wavelet transform, face pattern words (FPWs) creation, face identification by similarity measure (Hamming distance). If eyeglasses are present on a subject's face, an eyeglasses mask will be automatically extracted from the querying face image, and then masked with all comparing FPWs (no more transforms). A high identification rate (97.44% with Top-1 match) has been achieved upon our preliminary face dataset (of 39 subjects) from the proposed approach regardless operating time and glasses-wearing condition.e

  1. Image simulation for automatic license plate recognition

    NASA Astrophysics Data System (ADS)

    Bala, Raja; Zhao, Yonghui; Burry, Aaron; Kozitsky, Vladimir; Fillion, Claude; Saunders, Craig; Rodríguez-Serrano, José

    2012-01-01

    Automatic license plate recognition (ALPR) is an important capability for traffic surveillance applications, including toll monitoring and detection of different types of traffic violations. ALPR is a multi-stage process comprising plate localization, character segmentation, optical character recognition (OCR), and identification of originating jurisdiction (i.e. state or province). Training of an ALPR system for a new jurisdiction typically involves gathering vast amounts of license plate images and associated ground truth data, followed by iterative tuning and optimization of the ALPR algorithms. The substantial time and effort required to train and optimize the ALPR system can result in excessive operational cost and overhead. In this paper we propose a framework to create an artificial set of license plate images for accelerated training and optimization of ALPR algorithms. The framework comprises two steps: the synthesis of license plate images according to the design and layout for a jurisdiction of interest; and the modeling of imaging transformations and distortions typically encountered in the image capture process. Distortion parameters are estimated by measurements of real plate images. The simulation methodology is successfully demonstrated for training of OCR.

  2. Automatic recognition of postural allocations.

    PubMed

    Sazonov, Edward; Krishnamurthy, Vidya; Makeyev, Oleksandr; Browning, Ray; Schutz, Yves; Hill, James

    2007-01-01

    A significant part of daily energy expenditure may be attributed to non-exercise activity thermogenesis and exercise activity thermogenesis. Automatic recognition of postural allocations such as standing or sitting can be used in behavioral modification programs aimed at minimizing static postures. In this paper we propose a shoe-based device and related pattern recognition methodology for recognition of postural allocations. Inexpensive technology allows implementation of this methodology as a part of footwear. The experimental results suggest high efficiency and reliability of the proposed approach.

  3. Pc-based car license plate reading

    NASA Astrophysics Data System (ADS)

    Tanabe, Katsuyoshi; Marubayashi, Eisaku; Kawashima, Harumi; Nakanishi, Tadashi; Shio, Akio

    1994-03-01

    A PC-based car license plate recognition system has been developed. The system recognizes Chinese characters and Japanese phonetic hiragana characters as well as six digits on Japanese license plates. The system consists of a CCD camera, vehicle sensors, a strobe unit, a monitoring center, and an i486-based PC. The PC includes in its extension slots: a vehicle detector board, a strobe emitter board, and an image grabber board. When a passing vehicle is detected by the vehicle sensors, the strobe emits a pulse of light. The light pulse is synchronized with the time the vehicle image is frozen on an image grabber board. The recognition process is composed of three steps: image thresholding, character region extraction, and matching-based character recognition. The recognition software can handle obscured characters. Experimental results for hundreds of outdoor images showed high recognition performance within relatively short performance times. The results confirmed that the system is applicable to a wide variety of applications such as automatic vehicle identification and travel time measurement.

  4. Digital signal processing algorithms for automatic voice recognition

    NASA Technical Reports Server (NTRS)

    Botros, Nazeih M.

    1987-01-01

    The current digital signal analysis algorithms are investigated that are implemented in automatic voice recognition algorithms. Automatic voice recognition means, the capability of a computer to recognize and interact with verbal commands. The digital signal is focused on, rather than the linguistic, analysis of speech signal. Several digital signal processing algorithms are available for voice recognition. Some of these algorithms are: Linear Predictive Coding (LPC), Short-time Fourier Analysis, and Cepstrum Analysis. Among these algorithms, the LPC is the most widely used. This algorithm has short execution time and do not require large memory storage. However, it has several limitations due to the assumptions used to develop it. The other 2 algorithms are frequency domain algorithms with not many assumptions, but they are not widely implemented or investigated. However, with the recent advances in the digital technology, namely signal processors, these 2 frequency domain algorithms may be investigated in order to implement them in voice recognition. This research is concerned with real time, microprocessor based recognition algorithms.

  5. Dynamic Gesture Recognition with a Terahertz Radar Based on Range Profile Sequences and Doppler Signatures

    PubMed Central

    Pi, Yiming

    2017-01-01

    The frequency of terahertz radar ranges from 0.1 THz to 10 THz, which is higher than that of microwaves. Multi-modal signals, including high-resolution range profile (HRRP) and Doppler signatures, can be acquired by the terahertz radar system. These two kinds of information are commonly used in automatic target recognition; however, dynamic gesture recognition is rarely discussed in the terahertz regime. In this paper, a dynamic gesture recognition system using a terahertz radar is proposed, based on multi-modal signals. The HRRP sequences and Doppler signatures were first achieved from the radar echoes. Considering the electromagnetic scattering characteristics, a feature extraction model is designed using location parameter estimation of scattering centers. Dynamic Time Warping (DTW) extended to multi-modal signals is used to accomplish the classifications. Ten types of gesture signals, collected from a terahertz radar, are applied to validate the analysis and the recognition system. The results of the experiment indicate that the recognition rate reaches more than 91%. This research verifies the potential applications of dynamic gesture recognition using a terahertz radar. PMID:29267249

  6. Dynamic Gesture Recognition with a Terahertz Radar Based on Range Profile Sequences and Doppler Signatures.

    PubMed

    Zhou, Zhi; Cao, Zongjie; Pi, Yiming

    2017-12-21

    The frequency of terahertz radar ranges from 0.1 THz to 10 THz, which is higher than that of microwaves. Multi-modal signals, including high-resolution range profile (HRRP) and Doppler signatures, can be acquired by the terahertz radar system. These two kinds of information are commonly used in automatic target recognition; however, dynamic gesture recognition is rarely discussed in the terahertz regime. In this paper, a dynamic gesture recognition system using a terahertz radar is proposed, based on multi-modal signals. The HRRP sequences and Doppler signatures were first achieved from the radar echoes. Considering the electromagnetic scattering characteristics, a feature extraction model is designed using location parameter estimation of scattering centers. Dynamic Time Warping (DTW) extended to multi-modal signals is used to accomplish the classifications. Ten types of gesture signals, collected from a terahertz radar, are applied to validate the analysis and the recognition system. The results of the experiment indicate that the recognition rate reaches more than 91%. This research verifies the potential applications of dynamic gesture recognition using a terahertz radar.

  7. Segmentation and Recognition of Continuous Human Activity

    DTIC Science & Technology

    2001-01-01

    This paper presents a methodology for automatic segmentation and recognition of continuous human activity . We segment a continuous human activity into...commencement or termination. We use single action sequences for the training data set. The test sequences, on the other hand, are continuous sequences of human ... activity that consist of three or more actions in succession. The system has been tested on continuous activity sequences containing actions such as

  8. Automatic recognition of severity level for diagnosis of diabetic retinopathy using deep visual features.

    PubMed

    Abbas, Qaisar; Fondon, Irene; Sarmiento, Auxiliadora; Jiménez, Soledad; Alemany, Pedro

    2017-11-01

    Diabetic retinopathy (DR) is leading cause of blindness among diabetic patients. Recognition of severity level is required by ophthalmologists to early detect and diagnose the DR. However, it is a challenging task for both medical experts and computer-aided diagnosis systems due to requiring extensive domain expert knowledge. In this article, a novel automatic recognition system for the five severity level of diabetic retinopathy (SLDR) is developed without performing any pre- and post-processing steps on retinal fundus images through learning of deep visual features (DVFs). These DVF features are extracted from each image by using color dense in scale-invariant and gradient location-orientation histogram techniques. To learn these DVF features, a semi-supervised multilayer deep-learning algorithm is utilized along with a new compressed layer and fine-tuning steps. This SLDR system was evaluated and compared with state-of-the-art techniques using the measures of sensitivity (SE), specificity (SP) and area under the receiving operating curves (AUC). On 750 fundus images (150 per category), the SE of 92.18%, SP of 94.50% and AUC of 0.924 values were obtained on average. These results demonstrate that the SLDR system is appropriate for early detection of DR and provide an effective treatment for prediction type of diabetes.

  9. Spoken Grammar Practice and Feedback in an ASR-Based CALL System

    ERIC Educational Resources Information Center

    de Vries, Bart Penning; Cucchiarini, Catia; Bodnar, Stephen; Strik, Helmer; van Hout, Roeland

    2015-01-01

    Speaking practice is important for learners of a second language. Computer assisted language learning (CALL) systems can provide attractive opportunities for speaking practice when combined with automatic speech recognition (ASR) technology. In this paper, we present a CALL system that offers spoken practice of word order, an important aspect of…

  10. Voice Interactive Analysis System Study. Final Report, August 28, 1978 through March 23, 1979.

    ERIC Educational Resources Information Center

    Harry, D. P.; And Others

    The Voice Interactive Analysis System study continued research and development of the LISTEN real-time, minicomputer based connected speech recognition system, within NAVTRAEQUIPCEN'S program of developing automatic speech technology in support of training. An attempt was made to identify the most effective features detected by the TTI-500 model…

  11. An e-Learning System with MR for Experiments Involving Circuit Construction to Control a Robot

    ERIC Educational Resources Information Center

    Takemura, Atsushi

    2016-01-01

    This paper proposes a novel e-Learning system for technological experiments involving electronic circuit-construction and controlling robot motion that are necessary in the field of technology. The proposed system performs automated recognition of circuit images transmitted from individual learners and automatically supplies the learner with…

  12. Automatic recognition of topic-classified relations between prostate cancer and genes using MEDLINE abstracts

    PubMed Central

    Chun, Hong-Woo; Tsuruoka, Yoshimasa; Kim, Jin-Dong; Shiba, Rie; Nagata, Naoki; Hishiki, Teruyoshi; Tsujii, Jun'ichi

    2006-01-01

    Background Automatic recognition of relations between a specific disease term and its relevant genes or protein terms is an important practice of bioinformatics. Considering the utility of the results of this approach, we identified prostate cancer and gene terms with the ID tags of public biomedical databases. Moreover, considering that genetics experts will use our results, we classified them based on six topics that can be used to analyze the type of prostate cancers, genes, and their relations. Methods We developed a maximum entropy-based named entity recognizer and a relation recognizer and applied them to a corpus-based approach. We collected prostate cancer-related abstracts from MEDLINE, and constructed an annotated corpus of gene and prostate cancer relations based on six topics by biologists. We used it to train the maximum entropy-based named entity recognizer and relation recognizer. Results Topic-classified relation recognition achieved 92.1% precision for the relation (an increase of 11.0% from that obtained in a baseline experiment). For all topics, the precision was between 67.6 and 88.1%. Conclusion A series of experimental results revealed two important findings: a carefully designed relation recognition system using named entity recognition can improve the performance of relation recognition, and topic-classified relation recognition can be effectively addressed through a corpus-based approach using manual annotation and machine learning techniques. PMID:17134477

  13. Automatic recognition of topic-classified relations between prostate cancer and genes using MEDLINE abstracts.

    PubMed

    Chun, Hong-Woo; Tsuruoka, Yoshimasa; Kim, Jin-Dong; Shiba, Rie; Nagata, Naoki; Hishiki, Teruyoshi; Tsujii, Jun'ichi

    2006-11-24

    Automatic recognition of relations between a specific disease term and its relevant genes or protein terms is an important practice of bioinformatics. Considering the utility of the results of this approach, we identified prostate cancer and gene terms with the ID tags of public biomedical databases. Moreover, considering that genetics experts will use our results, we classified them based on six topics that can be used to analyze the type of prostate cancers, genes, and their relations. We developed a maximum entropy-based named entity recognizer and a relation recognizer and applied them to a corpus-based approach. We collected prostate cancer-related abstracts from MEDLINE, and constructed an annotated corpus of gene and prostate cancer relations based on six topics by biologists. We used it to train the maximum entropy-based named entity recognizer and relation recognizer. Topic-classified relation recognition achieved 92.1% precision for the relation (an increase of 11.0% from that obtained in a baseline experiment). For all topics, the precision was between 67.6 and 88.1%. A series of experimental results revealed two important findings: a carefully designed relation recognition system using named entity recognition can improve the performance of relation recognition, and topic-classified relation recognition can be effectively addressed through a corpus-based approach using manual annotation and machine learning techniques.

  14. Shape and texture fused recognition of flying targets

    NASA Astrophysics Data System (ADS)

    Kovács, Levente; Utasi, Ákos; Kovács, Andrea; Szirányi, Tamás

    2011-06-01

    This paper presents visual detection and recognition of flying targets (e.g. planes, missiles) based on automatically extracted shape and object texture information, for application areas like alerting, recognition and tracking. Targets are extracted based on robust background modeling and a novel contour extraction approach, and object recognition is done by comparisons to shape and texture based query results on a previously gathered real life object dataset. Application areas involve passive defense scenarios, including automatic object detection and tracking with cheap commodity hardware components (CPU, camera and GPS).

  15. A Complete OCR System for Tamil Magazine Documents

    NASA Astrophysics Data System (ADS)

    Kokku, Aparna; Chakravarthy, Srinivasa

    We present a complete optical character recognition (OCR) system for Tamil magazines/documents. All the standard elements of OCR process like de-skewing, preprocessing, segmentation, character recognition, and reconstruction are implemented. Experience with OCR problems teaches that for most subtasks of OCR, there is no single technique that gives perfect results for every type of document image. We exploit the ability of neural networks to learn from experience in solving the problems of segmentation and character recognition. Text segmentation of Tamil newsprint poses a new challenge owing to its italic-like font type; problems that arise in recognition of touching and close characters are discussed. Character recognition efficiency varied from 94 to 97% for this type of font. The grouping of blocks into logical units and the determination of reading order within each logical unit helped us in reconstructing automatically the document image in an editable format.

  16. The fast iris image clarity evaluation based on Tenengrad and ROI selection

    NASA Astrophysics Data System (ADS)

    Gao, Shuqin; Han, Min; Cheng, Xu

    2018-04-01

    In iris recognition system, the clarity of iris image is an important factor that influences recognition effect. In the process of recognition, the blurred image may possibly be rejected by the automatic iris recognition system, which will lead to the failure of identification. Therefore it is necessary to evaluate the iris image definition before recognition. Considered the existing evaluation methods on iris image definition, we proposed a fast algorithm to evaluate the definition of iris image in this paper. In our algorithm, firstly ROI (Region of Interest) is extracted based on the reference point which is determined by using the feature of the light spots within the pupil, then Tenengrad operator is used to evaluate the iris image's definition. Experiment results show that, the iris image definition algorithm proposed in this paper could accurately distinguish the iris images of different clarity, and the algorithm has the merit of low computational complexity and more effectiveness.

  17. An automatic speech recognition system with speaker-independent identification support

    NASA Astrophysics Data System (ADS)

    Caranica, Alexandru; Burileanu, Corneliu

    2015-02-01

    The novelty of this work relies on the application of an open source research software toolkit (CMU Sphinx) to train, build and evaluate a speech recognition system, with speaker-independent support, for voice-controlled hardware applications. Moreover, we propose to use the trained acoustic model to successfully decode offline voice commands on embedded hardware, such as an ARMv6 low-cost SoC, Raspberry PI. This type of single-board computer, mainly used for educational and research activities, can serve as a proof-of-concept software and hardware stack for low cost voice automation systems.

  18. Strategies for distant speech recognitionin reverberant environments

    NASA Astrophysics Data System (ADS)

    Delcroix, Marc; Yoshioka, Takuya; Ogawa, Atsunori; Kubo, Yotaro; Fujimoto, Masakiyo; Ito, Nobutaka; Kinoshita, Keisuke; Espi, Miquel; Araki, Shoko; Hori, Takaaki; Nakatani, Tomohiro

    2015-12-01

    Reverberation and noise are known to severely affect the automatic speech recognition (ASR) performance of speech recorded by distant microphones. Therefore, we must deal with reverberation if we are to realize high-performance hands-free speech recognition. In this paper, we review a recognition system that we developed at our laboratory to deal with reverberant speech. The system consists of a speech enhancement (SE) front-end that employs long-term linear prediction-based dereverberation followed by noise reduction. We combine our SE front-end with an ASR back-end that uses neural networks for acoustic and language modeling. The proposed system achieved top scores on the ASR task of the REVERB challenge. This paper describes the different technologies used in our system and presents detailed experimental results that justify our implementation choices and may provide hints for designing distant ASR systems.

  19. Constraints in distortion-invariant target recognition system simulation

    NASA Astrophysics Data System (ADS)

    Iftekharuddin, Khan M.; Razzaque, Md A.

    2000-11-01

    Automatic target recognition (ATR) is a mature but active research area. In an earlier paper, we proposed a novel ATR approach for recognition of targets varying in fine details, rotation, and translation using a Learning Vector Quantization (LVQ) Neural Network (NN). The proposed approach performed segmentation of multiple objects and the identification of the objects using LVQNN. In this current paper, we extend the previous approach for recognition of targets varying in rotation, translation, scale, and combination of all three distortions. We obtain the analytical results of the system level design to show that the approach performs well with some constraints. The first constraint determines the size of the input images and input filters. The second constraint shows the limits on amount of rotation, translation, and scale of input objects. We present the simulation verification of the constraints using DARPA's Moving and Stationary Target Recognition (MSTAR) images with different depression and pose angles. The simulation results using MSTAR images verify the analytical constraints of the system level design.

  20. Facial Emotions Recognition using Gabor Transform and Facial Animation Parameters with Neural Networks

    NASA Astrophysics Data System (ADS)

    Harit, Aditya; Joshi, J. C., Col; Gupta, K. K.

    2018-03-01

    The paper proposed an automatic facial emotion recognition algorithm which comprises of two main components: feature extraction and expression recognition. The algorithm uses a Gabor filter bank on fiducial points to find the facial expression features. The resulting magnitudes of Gabor transforms, along with 14 chosen FAPs (Facial Animation Parameters), compose the feature space. There are two stages: the training phase and the recognition phase. Firstly, for the present 6 different emotions, the system classifies all training expressions in 6 different classes (one for each emotion) in the training stage. In the recognition phase, it recognizes the emotion by applying the Gabor bank to a face image, then finds the fiducial points, and then feeds it to the trained neural architecture.

  1. Hidden Markov models in automatic speech recognition

    NASA Astrophysics Data System (ADS)

    Wrzoskowicz, Adam

    1993-11-01

    This article describes a method for constructing an automatic speech recognition system based on hidden Markov models (HMMs). The author discusses the basic concepts of HMM theory and the application of these models to the analysis and recognition of speech signals. The author provides algorithms which make it possible to train the ASR system and recognize signals on the basis of distinct stochastic models of selected speech sound classes. The author describes the specific components of the system and the procedures used to model and recognize speech. The author discusses problems associated with the choice of optimal signal detection and parameterization characteristics and their effect on the performance of the system. The author presents different options for the choice of speech signal segments and their consequences for the ASR process. The author gives special attention to the use of lexical, syntactic, and semantic information for the purpose of improving the quality and efficiency of the system. The author also describes an ASR system developed by the Speech Acoustics Laboratory of the IBPT PAS. The author discusses the results of experiments on the effect of noise on the performance of the ASR system and describes methods of constructing HMM's designed to operate in a noisy environment. The author also describes a language for human-robot communications which was defined as a complex multilevel network from an HMM model of speech sounds geared towards Polish inflections. The author also added mandatory lexical and syntactic rules to the system for its communications vocabulary.

  2. Robust and Effective Component-based Banknote Recognition for the Blind

    PubMed Central

    Hasanuzzaman, Faiz M.; Yang, Xiaodong; Tian, YingLi

    2012-01-01

    We develop a novel camera-based computer vision technology to automatically recognize banknotes for assisting visually impaired people. Our banknote recognition system is robust and effective with the following features: 1) high accuracy: high true recognition rate and low false recognition rate, 2) robustness: handles a variety of currency designs and bills in various conditions, 3) high efficiency: recognizes banknotes quickly, and 4) ease of use: helps blind users to aim the target for image capture. To make the system robust to a variety of conditions including occlusion, rotation, scaling, cluttered background, illumination change, viewpoint variation, and worn or wrinkled bills, we propose a component-based framework by using Speeded Up Robust Features (SURF). Furthermore, we employ the spatial relationship of matched SURF features to detect if there is a bill in the camera view. This process largely alleviates false recognition and can guide the user to correctly aim at the bill to be recognized. The robustness and generalizability of the proposed system is evaluated on a dataset including both positive images (with U.S. banknotes) and negative images (no U.S. banknotes) collected under a variety of conditions. The proposed algorithm, achieves 100% true recognition rate and 0% false recognition rate. Our banknote recognition system is also tested by blind users. PMID:22661884

  3. Automatic detection and recognition of traffic signs in stereo images based on features and probabilistic neural networks

    NASA Astrophysics Data System (ADS)

    Sheng, Yehua; Zhang, Ka; Ye, Chun; Liang, Cheng; Li, Jian

    2008-04-01

    Considering the problem of automatic traffic sign detection and recognition in stereo images captured under motion conditions, a new algorithm for traffic sign detection and recognition based on features and probabilistic neural networks (PNN) is proposed in this paper. Firstly, global statistical color features of left image are computed based on statistics theory. Then for red, yellow and blue traffic signs, left image is segmented to three binary images by self-adaptive color segmentation method. Secondly, gray-value projection and shape analysis are used to confirm traffic sign regions in left image. Then stereo image matching is used to locate the homonymy traffic signs in right image. Thirdly, self-adaptive image segmentation is used to extract binary inner core shapes of detected traffic signs. One-dimensional feature vectors of inner core shapes are computed by central projection transformation. Fourthly, these vectors are input to the trained probabilistic neural networks for traffic sign recognition. Lastly, recognition results in left image are compared with recognition results in right image. If results in stereo images are identical, these results are confirmed as final recognition results. The new algorithm is applied to 220 real images of natural scenes taken by the vehicle-borne mobile photogrammetry system in Nanjing at different time. Experimental results show a detection and recognition rate of over 92%. So the algorithm is not only simple, but also reliable and high-speed on real traffic sign detection and recognition. Furthermore, it can obtain geometrical information of traffic signs at the same time of recognizing their types.

  4. Intonation and dialog context as constraints for speech recognition.

    PubMed

    Taylor, P; King, S; Isard, S; Wright, H

    1998-01-01

    This paper describes a way of using intonation and dialog context to improve the performance of an automatic speech recognition (ASR) system. Our experiments were run on the DCIEM Maptask corpus, a corpus of spontaneous task-oriented dialog speech. This corpus has been tagged according to a dialog analysis scheme that assigns each utterance to one of 12 "move types," such as "acknowledge," "query-yes/no" or "instruct." Most ASR systems use a bigram language model to constrain the possible sequences of words that might be recognized. Here we use a separate bigram language model for each move type. We show that when the "correct" move-specific language model is used for each utterance in the test set, the word error rate of the recognizer drops. Of course when the recognizer is run on previously unseen data, it cannot know in advance what move type the speaker has just produced. To determine the move type we use an intonation model combined with a dialog model that puts constraints on possible sequences of move types, as well as the speech recognizer likelihoods for the different move-specific models. In the full recognition system, the combination of automatic move type recognition with the move specific language models reduces the overall word error rate by a small but significant amount when compared with a baseline system that does not take intonation or dialog acts into account. Interestingly, the word error improvement is restricted to "initiating" move types, where word recognition is important. In "response" move types, where the important information is conveyed by the move type itself--for example, positive versus negative response--there is no word error improvement, but recognition of the response types themselves is good. The paper discusses the intonation model, the language models, and the dialog model in detail and describes the architecture in which they are combined.

  5. Automatic translation among spoken languages

    NASA Technical Reports Server (NTRS)

    Walter, Sharon M.; Costigan, Kelly

    1994-01-01

    The Machine Aided Voice Translation (MAVT) system was developed in response to the shortage of experienced military field interrogators with both foreign language proficiency and interrogation skills. Combining speech recognition, machine translation, and speech generation technologies, the MAVT accepts an interrogator's spoken English question and translates it into spoken Spanish. The spoken Spanish response of the potential informant can then be translated into spoken English. Potential military and civilian applications for automatic spoken language translation technology are discussed in this paper.

  6. Image acquisition system for traffic monitoring applications

    NASA Astrophysics Data System (ADS)

    Auty, Glen; Corke, Peter I.; Dunn, Paul; Jensen, Murray; Macintyre, Ian B.; Mills, Dennis C.; Nguyen, Hao; Simons, Ben

    1995-03-01

    An imaging system for monitoring traffic on multilane highways is discussed. The system, named Safe-T-Cam, is capable of operating 24 hours per day in all but extreme weather conditions and can capture still images of vehicles traveling up to 160 km/hr. Systems operating at different remote locations are networked to allow transmission of images and data to a control center. A remote site facility comprises a vehicle detection and classification module (VCDM), an image acquisition module (IAM) and a license plate recognition module (LPRM). The remote site is connected to the central site by an ISDN communications network. The remote site system is discussed in this paper. The VCDM consists of a video camera, a specialized exposure control unit to maintain consistent image characteristics, and a 'real-time' image processing system that processes 50 images per second. The VCDM can detect and classify vehicles (e.g. cars from trucks). The vehicle class is used to determine what data should be recorded. The VCDM uses a vehicle tracking technique to allow optimum triggering of the high resolution camera of the IAM. The IAM camera combines the features necessary to operate consistently in the harsh environment encountered when imaging a vehicle 'head-on' in both day and night conditions. The image clarity obtained is ideally suited for automatic location and recognition of the vehicle license plate. This paper discusses the camera geometry, sensor characteristics and the image processing methods which permit consistent vehicle segmentation from a cluttered background allowing object oriented pattern recognition to be used for vehicle classification. The image capture of high resolution images and the image characteristics required for the LPRMs automatic reading of vehicle license plates, is also discussed. The results of field tests presented demonstrate that the vision based Safe-T-Cam system, currently installed on open highways, is capable of producing automatic classification of vehicle class and recording of vehicle numberplates with a success rate around 90 percent in a period of 24 hours.

  7. Retina vascular network recognition

    NASA Astrophysics Data System (ADS)

    Tascini, Guido; Passerini, Giorgio; Puliti, Paolo; Zingaretti, Primo

    1993-09-01

    The analysis of morphological and structural modifications of the retina vascular network is an interesting investigation method in the study of diabetes and hypertension. Normally this analysis is carried out by qualitative evaluations, according to standardized criteria, though medical research attaches great importance to quantitative analysis of vessel color, shape and dimensions. The paper describes a system which automatically segments and recognizes the ocular fundus circulation and micro circulation network, and extracts a set of features related to morphometric aspects of vessels. For this class of images the classical segmentation methods seem weak. We propose a computer vision system in which segmentation and recognition phases are strictly connected. The system is hierarchically organized in four modules. Firstly the Image Enhancement Module (IEM) operates a set of custom image enhancements to remove blur and to prepare data for subsequent segmentation and recognition processes. Secondly the Papilla Border Analysis Module (PBAM) automatically recognizes number, position and local diameter of blood vessels departing from optical papilla. Then the Vessel Tracking Module (VTM) analyses vessels comparing the results of body and edge tracking and detects branches and crossings. Finally the Feature Extraction Module evaluates PBAM and VTM output data and extracts some numerical indexes. Used algorithms appear to be robust and have been successfully tested on various ocular fundus images.

  8. The software for automatic creation of the formal grammars used by speech recognition, computer vision, editable text conversion systems, and some new functions

    NASA Astrophysics Data System (ADS)

    Kardava, Irakli; Tadyszak, Krzysztof; Gulua, Nana; Jurga, Stefan

    2017-02-01

    For more flexibility of environmental perception by artificial intelligence it is needed to exist the supporting software modules, which will be able to automate the creation of specific language syntax and to make a further analysis for relevant decisions based on semantic functions. According of our proposed approach, of which implementation it is possible to create the couples of formal rules of given sentences (in case of natural languages) or statements (in case of special languages) by helping of computer vision, speech recognition or editable text conversion system for further automatic improvement. In other words, we have developed an approach, by which it can be achieved to significantly improve the training process automation of artificial intelligence, which as a result will give us a higher level of self-developing skills independently from us (from users). At the base of our approach we have developed a software demo version, which includes the algorithm and software code for the entire above mentioned component's implementation (computer vision, speech recognition and editable text conversion system). The program has the ability to work in a multi - stream mode and simultaneously create a syntax based on receiving information from several sources.

  9. AN AUTOMATIC DEVICE FOR READING TYPOGRAPHICAL TEXTS,

    DTIC Science & Technology

    permissible. The system represents an attempt to apply the methods of machines designed for typescript reading to machines reading printed texts...Some characteristics by which typescript and typographical material differ are presented. The basic aspects of the recognition algorithm are given. A

  10. Artificial fingerprint recognition by using optical coherence tomography with autocorrelation analysis.

    PubMed

    Cheng, Yezeng; Larin, Kirill V

    2006-12-20

    Fingerprint recognition is one of the most widely used methods of biometrics. This method relies on the surface topography of a finger and, thus, is potentially vulnerable for spoofing by artificial dummies with embedded fingerprints. In this study, we applied the optical coherence tomography (OCT) technique to distinguish artificial materials commonly used for spoofing fingerprint scanning systems from the real skin. Several artificial fingerprint dummies made from household cement and liquid silicone rubber were prepared and tested using a commercial fingerprint reader and an OCT system. While the artificial fingerprints easily spoofed the commercial fingerprint reader, OCT images revealed the presence of them at all times. We also demonstrated that an autocorrelation analysis of the OCT images could be potentially used in automatic recognition systems.

  11. Artificial fingerprint recognition by using optical coherence tomography with autocorrelation analysis

    NASA Astrophysics Data System (ADS)

    Cheng, Yezeng; Larin, Kirill V.

    2006-12-01

    Fingerprint recognition is one of the most widely used methods of biometrics. This method relies on the surface topography of a finger and, thus, is potentially vulnerable for spoofing by artificial dummies with embedded fingerprints. In this study, we applied the optical coherence tomography (OCT) technique to distinguish artificial materials commonly used for spoofing fingerprint scanning systems from the real skin. Several artificial fingerprint dummies made from household cement and liquid silicone rubber were prepared and tested using a commercial fingerprint reader and an OCT system. While the artificial fingerprints easily spoofed the commercial fingerprint reader, OCT images revealed the presence of them at all times. We also demonstrated that an autocorrelation analysis of the OCT images could be potentially used in automatic recognition systems.

  12. Application of pattern recognition techniques to crime analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bender, C.F.; Cox, L.A. Jr.; Chappell, G.A.

    1976-08-15

    The initial goal was to evaluate the capabilities of current pattern recognition techniques when applied to existing computerized crime data. Performance was to be evaluated both in terms of the system's capability to predict crimes and to optimize police manpower allocation. A relation was sought to predict the crime's susceptibility to solution, based on knowledge of the crime type, location, time, etc. The preliminary results of this work are discussed. They indicate that automatic crime analysis involving pattern recognition techniques is feasible, and that efforts to determine optimum variables and techniques are warranted. 47 figures (RWR)

  13. DESIGN OF A PATTERN RECOGNITION DIGITAL COMPUTER WITH APPLICATION TO THE AUTOMATIC SCANNING OF BUBBLE CHAMBER NEGATIVES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McCormick, B.H.; Narasimhan, R.

    1963-01-01

    The overall computer system contains three main parts: an input device, a pattern recognition unit (PRU), and a control computer. The bubble chamber picture is divided into a grid of st run. Concent 1-mm squares on the film. It is then processed in parallel in a two-dimensional array of 1024 identical processing modules (stalactites) of the PRU. The array can function as a two- dimensional shift register in which results of successive shifting operations can be accumulated. The pattern recognition process is generally controlled by a conventional arithmetic computer. (A.G.W.)

  14. Seamless Tracing of Human Behavior Using Complementary Wearable and House-Embedded Sensors

    PubMed Central

    Augustyniak, Piotr; Smoleń, Magdalena; Mikrut, Zbigniew; Kańtoch, Eliasz

    2014-01-01

    This paper presents a multimodal system for seamless surveillance of elderly people in their living environment. The system uses simultaneously a wearable sensor network for each individual and premise-embedded sensors specific for each environment. The paper demonstrates the benefits of using complementary information from two types of mobility sensors: visual flow-based image analysis and an accelerometer-based wearable network. The paper provides results for indoor recognition of several elementary poses and outdoor recognition of complex movements. Instead of complete system description, particular attention was drawn to a polar histogram-based method of visual pose recognition, complementary use and synchronization of the data from wearable and premise-embedded networks and an automatic danger detection algorithm driven by two premise- and subject-related databases. The novelty of our approach also consists in feeding the databases with real-life recordings from the subject, and in using the dynamic time-warping algorithm for measurements of distance between actions represented as elementary poses in behavioral records. The main results of testing our method include: 95.5% accuracy of elementary pose recognition by the video system, 96.7% accuracy of elementary pose recognition by the accelerometer-based system, 98.9% accuracy of elementary pose recognition by the combined accelerometer and video-based system, and 80% accuracy of complex outdoor activity recognition by the accelerometer-based wearable system. PMID:24787640

  15. Effectiveness of Feedback for Enhancing English Pronunciation in an ASR-Based CALL System

    ERIC Educational Resources Information Center

    Wang, Y.-H.; Young, S. S.-C.

    2015-01-01

    This paper presents a study on implementing the ASR-based CALL (computer-assisted language learning based upon automatic speech recognition) system embedded with both formative and summative feedback approaches and using implicit and explicit strategies to enhance adult and young learners' English pronunciation. Two groups of learners including 18…

  16. Developing and Evaluating an Oral Skills Training Website Supported by Automatic Speech Recognition Technology

    ERIC Educational Resources Information Center

    Chen, Howard Hao-Jan

    2011-01-01

    Oral communication ability has become increasingly important to many EFL students. Several commercial software programs based on automatic speech recognition (ASR) technologies are available but their prices are not affordable for many students. This paper will demonstrate how the Microsoft Speech Application Software Development Kit (SASDK), a…

  17. Automatic Speech Recognition: Reliability and Pedagogical Implications for Teaching Pronunciation

    ERIC Educational Resources Information Center

    Kim, In-Seok

    2006-01-01

    This study examines the reliability of automatic speech recognition (ASR) software used to teach English pronunciation, focusing on one particular piece of software, "FluSpeak, as a typical example." Thirty-six Korean English as a Foreign Language (EFL) college students participated in an experiment in which they listened to 15 sentences…

  18. Automatic Speech Recognition Technology as an Effective Means for Teaching Pronunciation

    ERIC Educational Resources Information Center

    Elimat, Amal Khalil; AbuSeileek, Ali Farhan

    2014-01-01

    This study aimed to explore the effect of using automatic speech recognition technology (ASR) on the third grade EFL students' performance in pronunciation, whether teaching pronunciation through ASR is better than regular instruction, and the most effective teaching technique (individual work, pair work, or group work) in teaching pronunciation…

  19. Automatization and Orthographic Development in Second Language Visual Word Recognition

    ERIC Educational Resources Information Center

    Kida, Shusaku

    2016-01-01

    The present study investigated second language (L2) learners' acquisition of automatic word recognition and the development of L2 orthographic representation in the mental lexicon. Participants in the study were Japanese university students enrolled in a compulsory course involving a weekly 30-minute sustained silent reading (SSR) activity with…

  20. BeSocratic: An Intelligent Tutoring System for the Recognition, Evaluation, and Analysis of Free-Form Student Input

    ERIC Educational Resources Information Center

    Bryfczynski, Samuel Paul

    2012-01-01

    This dissertation describes a novel intelligent tutoring system, BeSocratic, which aims to help fill the gap between simple multiple-choice systems and free-response systems. BeSocratic focuses on targeting questions that are free-form in nature yet defined to the point which allows for automatic evaluation and analysis. The system includes a set…

  1. Stochastic Modeling as a Means of Automatic Speech Recognition

    DTIC Science & Technology

    1975-04-01

    companng ihc features of different speech recognition systems, attention is often focused on thc control structures and the methods o’ communication...with no need to use secondary storage . Note that we go from a group of separate knowledge sources to an integrated network representation in...exhaust the available lime or storage . - - - . . 1- .-.-.. mmm^~ i — ■ ■ ’ ■ C haplcr I - IN I ROÜliCl ION Page 13 On the other hand

  2. Road Signs Detection and Recognition Utilizing Images and 3d Point Cloud Acquired by Mobile Mapping System

    NASA Astrophysics Data System (ADS)

    Li, Y. H.; Shinohara, T.; Satoh, T.; Tachibana, K.

    2016-06-01

    High-definition and highly accurate road maps are necessary for the realization of automated driving, and road signs are among the most important element in the road map. Therefore, a technique is necessary which can acquire information about all kinds of road signs automatically and efficiently. Due to the continuous technical advancement of Mobile Mapping System (MMS), it has become possible to acquire large number of images and 3d point cloud efficiently with highly precise position information. In this paper, we present an automatic road sign detection and recognition approach utilizing both images and 3D point cloud acquired by MMS. The proposed approach consists of three stages: 1) detection of road signs from images based on their color and shape features using object based image analysis method, 2) filtering out of over detected candidates utilizing size and position information estimated from 3D point cloud, region of candidates and camera information, and 3) road sign recognition using template matching method after shape normalization. The effectiveness of proposed approach was evaluated by testing dataset, acquired from more than 180 km of different types of roads in Japan. The results show a very high success in detection and recognition of road signs, even under the challenging conditions such as discoloration, deformation and in spite of partial occlusions.

  3. Improved automatic adjustment of density and contrast in FCR system using neural network

    NASA Astrophysics Data System (ADS)

    Takeo, Hideya; Nakajima, Nobuyoshi; Ishida, Masamitsu; Kato, Hisatoyo

    1994-05-01

    FCR system has an automatic adjustment of image density and contrast by analyzing the histogram of image data in the radiation field. Advanced image recognition methods proposed in this paper can improve the automatic adjustment performance, in which neural network technology is used. There are two methods. Both methods are basically used 3-layer neural network with back propagation. The image data are directly input to the input-layer in one method and the histogram data is input in the other method. The former is effective to the imaging menu such as shoulder joint in which the position of interest region occupied on the histogram changes by difference of positioning and the latter is effective to the imaging menu such as chest-pediatrics in which the histogram shape changes by difference of positioning. We experimentally confirm the validity of these methods (about the automatic adjustment performance) as compared with the conventional histogram analysis methods.

  4. Automatic ground control point recognition with parallel associative memory

    NASA Technical Reports Server (NTRS)

    Al-Tahir, Raid; Toth, Charles K.; Schenck, Anton F.

    1990-01-01

    The basic principle of the associative memory is to match the unknown input pattern against a stored training set, and responding with the 'closest match' and the corresponding label. Generally, an associative memory system requires two preparatory steps: selecting attributes of the pattern class, and training the system by associating patterns with labels. Experimental results gained from using Parallel Associative Memory are presented. The primary concern is an automatic search for ground control points in aerial photographs. Synthetic patterns are tested followed by real data. The results are encouraging as a relatively high level of correct matches is reached.

  5. Counter-propagation network with variable degree variable step size LMS for single switch typing recognition.

    PubMed

    Yang, Cheng-Huei; Luo, Ching-Hsing; Yang, Cheng-Hong; Chuang, Li-Yeh

    2004-01-01

    Morse code is now being harnessed for use in rehabilitation applications of augmentative-alternative communication and assistive technology, including mobility, environmental control and adapted worksite access. In this paper, Morse code is selected as a communication adaptive device for disabled persons who suffer from muscle atrophy, cerebral palsy or other severe handicaps. A stable typing rate is strictly required for Morse code to be effective as a communication tool. This restriction is a major hindrance. Therefore, a switch adaptive automatic recognition method with a high recognition rate is needed. The proposed system combines counter-propagation networks with a variable degree variable step size LMS algorithm. It is divided into five stages: space recognition, tone recognition, learning process, adaptive processing, and character recognition. Statistical analyses demonstrated that the proposed method elicited a better recognition rate in comparison to alternative methods in the literature.

  6. Automatic de-identification of French clinical records: comparison of rule-based and machine-learning approaches.

    PubMed

    Grouin, Cyril; Zweigenbaum, Pierre

    2013-01-01

    In this paper, we present a comparison of two approaches to automatically de-identify medical records written in French: a rule-based system and a machine-learning based system using a conditional random fields (CRF) formalism. Both systems have been designed to process nine identifiers in a corpus of medical records in cardiology. We performed two evaluations: first, on 62 documents in cardiology, and on 10 documents in foetopathology - produced by optical character recognition (OCR) - to evaluate the robustness of our systems. We achieved a 0.843 (rule-based) and 0.883 (machine-learning) exact match overall F-measure in cardiology. While the rule-based system allowed us to achieve good results on nominative (first and last names) and numerical data (dates, phone numbers, and zip codes), the machine-learning approach performed best on more complex categories (postal addresses, hospital names, medical devices, and towns). On the foetopathology corpus, although our systems have not been designed for this corpus and despite OCR character recognition errors, we obtained promising results: a 0.681 (rule-based) and 0.638 (machine-learning) exact-match overall F-measure. This demonstrates that existing tools can be applied to process new documents of lower quality.

  7. Iris recognition and what is next? Iris diagnosis: a new challenging topic for machine vision from image acquisition to image interpretation

    NASA Astrophysics Data System (ADS)

    Perner, Petra

    2017-03-01

    Molecular image-based techniques are widely used in medicine to detect specific diseases. Look diagnosis is an important issue but also the analysis of the eye plays an important role in order to detect specific diseases. These topics are important topics in medicine and the standardization of these topics by an automatic system can be a new challenging field for machine vision. Compared to iris recognition has the iris diagnosis much more higher demands for the image acquisition and interpretation of the iris. One understands by iris diagnosis (Iridology) the investigation and analysis of the colored part of the eye, the iris, to discover factors, which play an important role for the prevention and treatment of illnesses, but also for the preservation of an optimum health. An automatic system would pave the way for a much wider use of the iris diagnosis for the diagnosis of illnesses and for the purpose of individual health protection. With this paper, we describe our work towards an automatic iris diagnosis system. We describe the image acquisition and the problems with it. Different ways are explained for image acquisition and image preprocessing. We describe the image analysis method for the detection of the iris. The meta-model for image interpretation is given. Based on this model we show the many tasks for image analysis that range from different image-object feature analysis, spatial image analysis to color image analysis. Our first results for the recognition of the iris are given. We describe how detecting the pupil and not wanted lamp spots. We explain how to recognize orange blue spots in the iris and match them against the topological map of the iris. Finally, we give an outlook for further work.

  8. Computer vision system: a tool for evaluating the quality of wheat in a grain tank

    NASA Astrophysics Data System (ADS)

    Minkin, Uryi Igorevish; Panchenko, Aleksei Vladimirovich; Shkanaev, Aleksandr Yurievich; Konovalenko, Ivan Andreevich; Putintsev, Dmitry Nikolaevich; Sadekov, Rinat Nailevish

    2018-04-01

    The paper describes a technology that allows for automatizing the process of evaluating the grain quality in a grain tank of a combine harvester. Special recognition algorithm analyzes photographic images taken by the camera, and that provides automatic estimates of the total mass fraction of broken grains and the presence of non-grains. The paper also presents the operating details of the tank prototype as well as it defines the accuracy of the algorithms designed.

  9. The Effect of Automatic Speech Recognition Eyespeak Software on Iraqi Students' English Pronunciation: A Pilot Study

    ERIC Educational Resources Information Center

    Sidgi, Lina Fathi Sidig; Shaari, Ahmad Jelani

    2017-01-01

    The use of technology, such as computer-assisted language learning (CALL), is used in teaching and learning in the foreign language classrooms where it is most needed. One promising emerging technology that supports language learning is automatic speech recognition (ASR). Integrating such technology, especially in the instruction of pronunciation…

  10. Efficacy of a Classroom Integrated Intervention of Phonological Awareness and Word Recognition in "Double-Deficit Children" Learning a Regular Orthography

    ERIC Educational Resources Information Center

    Mayer, Andreas; Motsch, Hans-Joachim

    2015-01-01

    This study analysed the effects of a classroom intervention focusing on phonological awareness and/or automatized word recognition in children with a deficit in the domains of phonological awareness and rapid automatized naming ("double deficit"). According to the double-deficit hypothesis (Wolf & Bowers, 1999), these children belong…

  11. Using Automatic Speech Recognition Technology with Elicited Oral Response Testing

    ERIC Educational Resources Information Center

    Cox, Troy L.; Davies, Randall S.

    2012-01-01

    This study examined the use of automatic speech recognition (ASR) scored elicited oral response (EOR) tests to assess the speaking ability of English language learners. It also examined the relationship between ASR-scored EOR and other language proficiency measures and the ability of the ASR to rate speakers without bias to gender or native…

  12. Auditory models for speech analysis

    NASA Astrophysics Data System (ADS)

    Maybury, Mark T.

    This paper reviews the psychophysical basis for auditory models and discusses their application to automatic speech recognition. First an overview of the human auditory system is presented, followed by a review of current knowledge gleaned from neurological and psychoacoustic experimentation. Next, a general framework describes established peripheral auditory models which are based on well-understood properties of the peripheral auditory system. This is followed by a discussion of current enhancements to that models to include nonlinearities and synchrony information as well as other higher auditory functions. Finally, the initial performance of auditory models in the task of speech recognition is examined and additional applications are mentioned.

  13. Semi-automatic mapping of cultural heritage from airborne laser scanning using deep learning

    NASA Astrophysics Data System (ADS)

    Due Trier, Øivind; Salberg, Arnt-Børre; Holger Pilø, Lars; Tonning, Christer; Marius Johansen, Hans; Aarsten, Dagrun

    2016-04-01

    This paper proposes to use deep learning to improve semi-automatic mapping of cultural heritage from airborne laser scanning (ALS) data. Automatic detection methods, based on traditional pattern recognition, have been applied in a number of cultural heritage mapping projects in Norway for the past five years. Automatic detection of pits and heaps have been combined with visual interpretation of the ALS data for the mapping of deer hunting systems, iron production sites, grave mounds and charcoal kilns. However, the performance of the automatic detection methods varies substantially between ALS datasets. For the mapping of deer hunting systems on flat gravel and sand sediment deposits, the automatic detection results were almost perfect. However, some false detections appeared in the terrain outside of the sediment deposits. These could be explained by other pit-like landscape features, like parts of river courses, spaces between boulders, and modern terrain modifications. However, these were easy to spot during visual interpretation, and the number of missed individual pitfall traps was still low. For the mapping of grave mounds, the automatic method produced a large number of false detections, reducing the usefulness of the semi-automatic approach. The mound structure is a very common natural terrain feature, and the grave mounds are less distinct in shape than the pitfall traps. Still, applying automatic mound detection on an entire municipality did lead to a new discovery of an Iron Age grave field with more than 15 individual mounds. Automatic mound detection also proved to be useful for a detailed re-mapping of Norway's largest Iron Age grave yard, which contains almost 1000 individual graves. Combined pit and mound detection has been applied to the mapping of more than 1000 charcoal kilns that were used by an iron work 350-200 years ago. The majority of charcoal kilns were indirectly detected as either pits on the circumference, a central mound, or both. However, kilns with a flat interior and a shallow ditch along the circumference were often missed by the automatic detection method. The successfulness of automatic detection seems to depend on two factors: (1) the density of ALS ground hits on the cultural heritage structures being sought, and (2) to what extent these structures stand out from natural terrain structures. The first factor may, to some extent, be improved by using a higher number of ALS pulses per square meter. The second factor is difficult to change, and also highlights another challenge: how to make a general automatic method that is applicable in all types of terrain within a country. The mixed experience with traditional pattern recognition for semi-automatic mapping of cultural heritage led us to consider deep learning as an alternative approach. The main principle is that a general feature detector has been trained on a large image database. The feature detector is then tailored to a specific task by using a modest number of images of true and false examples of the features being sought. Results of using deep learning are compared with previous results using traditional pattern recognition.

  14. Face Recognition From One Example View.

    DTIC Science & Technology

    1995-09-01

    Proceedings, International Workshop on Automatic Face- and Gesture-Recognition, pages 248{253, Zurich, 1995. [32] Yael Moses, Shimon Ullman, and Shimon...recognition. Journal of Cognitive Neuroscience, 3(1):71{86, 1991. [49] Shimon Ullman and Ronen Basri. Recognition by linear combinations of models

  15. Automatic Target Recognition Based on Cross-Plot

    PubMed Central

    Wong, Kelvin Kian Loong; Abbott, Derek

    2011-01-01

    Automatic target recognition that relies on rapid feature extraction of real-time target from photo-realistic imaging will enable efficient identification of target patterns. To achieve this objective, Cross-plots of binary patterns are explored as potential signatures for the observed target by high-speed capture of the crucial spatial features using minimal computational resources. Target recognition was implemented based on the proposed pattern recognition concept and tested rigorously for its precision and recall performance. We conclude that Cross-plotting is able to produce a digital fingerprint of a target that correlates efficiently and effectively to signatures of patterns having its identity in a target repository. PMID:21980508

  16. Towards Contactless Silent Speech Recognition Based on Detection of Active and Visible Articulators Using IR-UWB Radar

    PubMed Central

    Shin, Young Hoon; Seo, Jiwon

    2016-01-01

    People with hearing or speaking disabilities are deprived of the benefits of conventional speech recognition technology because it is based on acoustic signals. Recent research has focused on silent speech recognition systems that are based on the motions of a speaker’s vocal tract and articulators. Because most silent speech recognition systems use contact sensors that are very inconvenient to users or optical systems that are susceptible to environmental interference, a contactless and robust solution is hence required. Toward this objective, this paper presents a series of signal processing algorithms for a contactless silent speech recognition system using an impulse radio ultra-wide band (IR-UWB) radar. The IR-UWB radar is used to remotely and wirelessly detect motions of the lips and jaw. In order to extract the necessary features of lip and jaw motions from the received radar signals, we propose a feature extraction algorithm. The proposed algorithm noticeably improved speech recognition performance compared to the existing algorithm during our word recognition test with five speakers. We also propose a speech activity detection algorithm to automatically select speech segments from continuous input signals. Thus, speech recognition processing is performed only when speech segments are detected. Our testbed consists of commercial off-the-shelf radar products, and the proposed algorithms are readily applicable without designing specialized radar hardware for silent speech processing. PMID:27801867

  17. Towards Contactless Silent Speech Recognition Based on Detection of Active and Visible Articulators Using IR-UWB Radar.

    PubMed

    Shin, Young Hoon; Seo, Jiwon

    2016-10-29

    People with hearing or speaking disabilities are deprived of the benefits of conventional speech recognition technology because it is based on acoustic signals. Recent research has focused on silent speech recognition systems that are based on the motions of a speaker's vocal tract and articulators. Because most silent speech recognition systems use contact sensors that are very inconvenient to users or optical systems that are susceptible to environmental interference, a contactless and robust solution is hence required. Toward this objective, this paper presents a series of signal processing algorithms for a contactless silent speech recognition system using an impulse radio ultra-wide band (IR-UWB) radar. The IR-UWB radar is used to remotely and wirelessly detect motions of the lips and jaw. In order to extract the necessary features of lip and jaw motions from the received radar signals, we propose a feature extraction algorithm. The proposed algorithm noticeably improved speech recognition performance compared to the existing algorithm during our word recognition test with five speakers. We also propose a speech activity detection algorithm to automatically select speech segments from continuous input signals. Thus, speech recognition processing is performed only when speech segments are detected. Our testbed consists of commercial off-the-shelf radar products, and the proposed algorithms are readily applicable without designing specialized radar hardware for silent speech processing.

  18. Presentation video retrieval using automatically recovered slide and spoken text

    NASA Astrophysics Data System (ADS)

    Cooper, Matthew

    2013-03-01

    Video is becoming a prevalent medium for e-learning. Lecture videos contain text information in both the presentation slides and lecturer's speech. This paper examines the relative utility of automatically recovered text from these sources for lecture video retrieval. To extract the visual information, we automatically detect slides within the videos and apply optical character recognition to obtain their text. Automatic speech recognition is used similarly to extract spoken text from the recorded audio. We perform controlled experiments with manually created ground truth for both the slide and spoken text from more than 60 hours of lecture video. We compare the automatically extracted slide and spoken text in terms of accuracy relative to ground truth, overlap with one another, and utility for video retrieval. Results reveal that automatically recovered slide text and spoken text contain different content with varying error profiles. Experiments demonstrate that automatically extracted slide text enables higher precision video retrieval than automatically recovered spoken text.

  19. Image-based automatic recognition of larvae

    NASA Astrophysics Data System (ADS)

    Sang, Ru; Yu, Guiying; Fan, Weijun; Guo, Tiantai

    2010-08-01

    As the main objects, imagoes have been researched in quarantine pest recognition in these days. However, pests in their larval stage are latent, and the larvae spread abroad much easily with the circulation of agricultural and forest products. It is presented in this paper that, as the new research objects, larvae are recognized by means of machine vision, image processing and pattern recognition. More visional information is reserved and the recognition rate is improved as color image segmentation is applied to images of larvae. Along with the characteristics of affine invariance, perspective invariance and brightness invariance, scale invariant feature transform (SIFT) is adopted for the feature extraction. The neural network algorithm is utilized for pattern recognition, and the automatic identification of larvae images is successfully achieved with satisfactory results.

  20. Offline Arabic handwriting recognition: a survey.

    PubMed

    Lorigo, Liana M; Govindaraju, Venu

    2006-05-01

    The automatic recognition of text on scanned images has enabled many applications such as searching for words in large volumes of documents, automatic sorting of postal mail, and convenient editing of previously printed documents. The domain of handwriting in the Arabic script presents unique technical challenges and has been addressed more recently than other domains. Many different methods have been proposed and applied to various types of images. This paper provides a comprehensive review of these methods. It is the first survey to focus on Arabic handwriting recognition and the first Arabic character recognition survey to provide recognition rates and descriptions of test data for the approaches discussed. It includes background on the field, discussion of the methods, and future research directions.

  1. Automatic Recognition of Road Signs

    NASA Astrophysics Data System (ADS)

    Inoue, Yasuo; Kohashi, Yuuichirou; Ishikawa, Naoto; Nakajima, Masato

    2002-11-01

    The increase in traffic accidents is becoming a serious social problem with the recent rapid traffic increase. In many cases, the driver"s carelessness is the primary factor of traffic accidents, and the driver assistance system is demanded for supporting driver"s safety. In this research, we propose the new method of automatic detection and recognition of road signs by image processing. The purpose of this research is to prevent accidents caused by driver"s carelessness, and call attention to a driver when the driver violates traffic a regulation. In this research, high accuracy and the efficient sign detecting method are realized by removing unnecessary information except for a road sign from an image, and detect a road sign using shape features. At first, the color information that is not used in road signs is removed from an image. Next, edges except for circular and triangle ones are removed to choose sign shape. In the recognition process, normalized cross correlation operation is carried out to the two-dimensional differentiation pattern of a sign, and the accurate and efficient method for detecting the road sign is realized. Moreover, the real-time operation in a software base was realized by holding down calculation cost, maintaining highly precise sign detection and recognition. Specifically, it becomes specifically possible to process by 0.1 sec(s)/frame using a general-purpose PC (CPU: Pentium4 1.7GHz). As a result of in-vehicle experimentation, our system could process on real time and has confirmed that detection and recognition of a sign could be performed correctly.

  2. Gimli: open source and high-performance biomedical name recognition

    PubMed Central

    2013-01-01

    Background Automatic recognition of biomedical names is an essential task in biomedical information extraction, presenting several complex and unsolved challenges. In recent years, various solutions have been implemented to tackle this problem. However, limitations regarding system characteristics, customization and usability still hinder their wider application outside text mining research. Results We present Gimli, an open-source, state-of-the-art tool for automatic recognition of biomedical names. Gimli includes an extended set of implemented and user-selectable features, such as orthographic, morphological, linguistic-based, conjunctions and dictionary-based. A simple and fast method to combine different trained models is also provided. Gimli achieves an F-measure of 87.17% on GENETAG and 72.23% on JNLPBA corpus, significantly outperforming existing open-source solutions. Conclusions Gimli is an off-the-shelf, ready to use tool for named-entity recognition, providing trained and optimized models for recognition of biomedical entities from scientific text. It can be used as a command line tool, offering full functionality, including training of new models and customization of the feature set and model parameters through a configuration file. Advanced users can integrate Gimli in their text mining workflows through the provided library, and extend or adapt its functionalities. Based on the underlying system characteristics and functionality, both for final users and developers, and on the reported performance results, we believe that Gimli is a state-of-the-art solution for biomedical NER, contributing to faster and better research in the field. Gimli is freely available at http://bioinformatics.ua.pt/gimli. PMID:23413997

  3. A Vocal-Based Analytical Method for Goose Behaviour Recognition

    PubMed Central

    Steen, Kim Arild; Therkildsen, Ole Roland; Karstoft, Henrik; Green, Ole

    2012-01-01

    Since human-wildlife conflicts are increasing, the development of cost-effective methods for reducing damage or conflict levels is important in wildlife management. A wide range of devices to detect and deter animals causing conflict are used for this purpose, although their effectiveness is often highly variable, due to habituation to disruptive or disturbing stimuli. Automated recognition of behaviours could form a critical component of a system capable of altering the disruptive stimuli to avoid this. In this paper we present a novel method to automatically recognise goose behaviour based on vocalisations from flocks of free-living barnacle geese (Branta leucopsis). The geese were observed and recorded in a natural environment, using a shielded shotgun microphone. The classification used Support Vector Machines (SVMs), which had been trained with labeled data. Greenwood Function Cepstral Coefficients (GFCC) were used as features for the pattern recognition algorithm, as they can be adjusted to the hearing capabilities of different species. Three behaviours are classified based in this approach, and the method achieves a good recognition of foraging behaviour (86–97% sensitivity, 89–98% precision) and a reasonable recognition of flushing (79–86%, 66–80%) and landing behaviour(73–91%, 79–92%). The Support Vector Machine has proven to be a robust classifier for this kind of classification, as generality and non-linear capabilities are important. We conclude that vocalisations can be used to automatically detect behaviour of conflict wildlife species, and as such, may be used as an integrated part of a wildlife management system. PMID:22737037

  4. The Game Embedded CALL System to Facilitate English Vocabulary Acquisition and Pronunciation

    ERIC Educational Resources Information Center

    Young, Shelley Shwu-Ching; Wang, Yi-Hsuan

    2014-01-01

    The aim of this study is to make a new attempt to explore the potential of integrating game strategies with automatic speech recognition technologies to provide learners with individual opportunities for English pronunciation learning. The study developed the Game Embedded CALL (GeCALL) system with two activities for on-line speaking practice. For…

  5. Driver face recognition as a security and safety feature

    NASA Astrophysics Data System (ADS)

    Vetter, Volker; Giefing, Gerd-Juergen; Mai, Rudolf; Weisser, Hubert

    1995-09-01

    We present a driver face recognition system for comfortable access control and individual settings of automobiles. The primary goals are the prevention of car thefts and heavy accidents caused by unauthorized use (joy-riders), as well as the increase of safety through optimal settings, e.g. of the mirrors and the seat position. The person sitting on the driver's seat is observed automatically by a small video camera in the dashboard. All he has to do is to behave cooperatively, i.e. to look into the camera. A classification system validates his access. Only after a positive identification, the car can be used and the driver-specific environment (e.g. seat position, mirrors, etc.) may be set up to ensure the driver's comfort and safety. The driver identification system has been integrated in a Volkswagen research car. Recognition results are presented.

  6. Multiview fusion for activity recognition using deep neural networks

    NASA Astrophysics Data System (ADS)

    Kavi, Rahul; Kulathumani, Vinod; Rohit, Fnu; Kecojevic, Vlad

    2016-07-01

    Convolutional neural networks (ConvNets) coupled with long short term memory (LSTM) networks have been recently shown to be effective for video classification as they combine the automatic feature extraction capabilities of a neural network with additional memory in the temporal domain. This paper shows how multiview fusion can be applied to such a ConvNet LSTM architecture. Two different fusion techniques are presented. The system is first evaluated in the context of a driver activity recognition system using data collected in a multicamera driving simulator. These results show significant improvement in accuracy with multiview fusion and also show that deep learning performs better than a traditional approach using spatiotemporal features even without requiring any background subtraction. The system is also validated on another publicly available multiview action recognition dataset that has 12 action classes and 8 camera views.

  7. Implementation of a high-speed face recognition system that uses an optical parallel correlator.

    PubMed

    Watanabe, Eriko; Kodate, Kashiko

    2005-02-10

    We implement a fully automatic fast face recognition system by using a 1000 frame/s optical parallel correlator designed and assembled by us. The operational speed for the 1:N (i.e., matching one image against N, where N refers to the number of images in the database) identification experiment (4000 face images) amounts to less than 1.5 s, including the preprocessing and postprocessing times. The binary real-only matched filter is devised for the sake of face recognition, and the system is optimized by the false-rejection rate (FRR) and the false-acceptance rate (FAR), according to 300 samples selected by the biometrics guideline. From trial 1:N identification experiments with the optical parallel correlator, we acquired low error rates of 2.6% FRR and 1.3% FAR. Facial images of people wearing thin glasses or heavy makeup that rendered identification difficult were identified with this system.

  8. A Dynamic Time Warping Approach to Real-Time Activity Recognition for Food Preparation

    NASA Astrophysics Data System (ADS)

    Pham, Cuong; Plötz, Thomas; Olivier, Patrick

    We present a dynamic time warping based activity recognition system for the analysis of low-level food preparation activities. Accelerometers embedded into kitchen utensils provide continuous sensor data streams while people are using them for cooking. The recognition framework analyzes frames of contiguous sensor readings in real-time with low latency. It thereby adapts to the idiosyncrasies of utensil use by automatically maintaining a template database. We demonstrate the effectiveness of the classification approach by a number of real-world practical experiments on a publically available dataset. The adaptive system shows superior performance compared to a static recognizer. Furthermore, we demonstrate the generalization capabilities of the system by gradually reducing the amount of training samples. The system achieves excellent classification results even if only a small number of training samples is available, which is especially relevant for real-world scenarios.

  9. Looking inside the Ocean: Toward an Autonomous Imaging System for Monitoring Gelatinous Zooplankton

    PubMed Central

    Corgnati, Lorenzo; Marini, Simone; Mazzei, Luca; Ottaviani, Ennio; Aliani, Stefano; Conversi, Alessandra; Griffa, Annalisa

    2016-01-01

    Marine plankton abundance and dynamics in the open and interior ocean is still an unknown field. The knowledge of gelatinous zooplankton distribution is especially challenging, because this type of plankton has a very fragile structure and cannot be directly sampled using traditional net based techniques. To overcome this shortcoming, Computer Vision techniques can be successfully used for the automatic monitoring of this group.This paper presents the GUARD1 imaging system, a low-cost stand-alone instrument for underwater image acquisition and recognition of gelatinous zooplankton, and discusses the performance of three different methodologies, Tikhonov Regularization, Support Vector Machines and Genetic Programming, that have been compared in order to select the one to be run onboard the system for the automatic recognition of gelatinous zooplankton. The performance comparison results highlight the high accuracy of the three methods in gelatinous zooplankton identification, showing their good capability in robustly selecting relevant features. In particular, Genetic Programming technique achieves the same performances of the other two methods by using a smaller set of features, thus being the most efficient in avoiding computationally consuming preprocessing stages, that is a crucial requirement for running on an autonomous imaging system designed for long lasting deployments, like the GUARD1. The Genetic Programming algorithm has been installed onboard the system, that has been operationally tested in a two-months survey in the Ligurian Sea, providing satisfactory results in terms of monitoring and recognition performances. PMID:27983638

  10. Survey on RGB, 3D, Thermal, and Multimodal Approaches for Facial Expression Recognition: History, Trends, and Affect-Related Applications.

    PubMed

    Corneanu, Ciprian Adrian; Simon, Marc Oliu; Cohn, Jeffrey F; Guerrero, Sergio Escalera

    2016-08-01

    Facial expressions are an important way through which humans interact socially. Building a system capable of automatically recognizing facial expressions from images and video has been an intense field of study in recent years. Interpreting such expressions remains challenging and much research is needed about the way they relate to human affect. This paper presents a general overview of automatic RGB, 3D, thermal and multimodal facial expression analysis. We define a new taxonomy for the field, encompassing all steps from face detection to facial expression recognition, and describe and classify the state of the art methods accordingly. We also present the important datasets and the bench-marking of most influential methods. We conclude with a general discussion about trends, important questions and future lines of research.

  11. Automatic speech recognition research at NASA-Ames Research Center

    NASA Technical Reports Server (NTRS)

    Coler, Clayton R.; Plummer, Robert P.; Huff, Edward M.; Hitchcock, Myron H.

    1977-01-01

    A trainable acoustic pattern recognizer manufactured by Scope Electronics is presented. The voice command system VCS encodes speech by sampling 16 bandpass filters with center frequencies in the range from 200 to 5000 Hz. Variations in speaking rate are compensated for by a compression algorithm that subdivides each utterance into eight subintervals in such a way that the amount of spectral change within each subinterval is the same. The recorded filter values within each subinterval are then reduced to a 15-bit representation, giving a 120-bit encoding for each utterance. The VCS incorporates a simple recognition algorithm that utilizes five training samples of each word in a vocabulary of up to 24 words. The recognition rate of approximately 85 percent correct for untrained speakers and 94 percent correct for trained speakers was not considered adequate for flight systems use. Therefore, the built-in recognition algorithm was disabled, and the VCS was modified to transmit 120-bit encodings to an external computer for recognition.

  12. An analysis of the influence of deep neural network (DNN) topology in bottleneck feature based language recognition.

    PubMed

    Lozano-Diez, Alicia; Zazo, Ruben; Toledano, Doroteo T; Gonzalez-Rodriguez, Joaquin

    2017-01-01

    Language recognition systems based on bottleneck features have recently become the state-of-the-art in this research field, showing its success in the last Language Recognition Evaluation (LRE 2015) organized by NIST (U.S. National Institute of Standards and Technology). This type of system is based on a deep neural network (DNN) trained to discriminate between phonetic units, i.e. trained for the task of automatic speech recognition (ASR). This DNN aims to compress information in one of its layers, known as bottleneck (BN) layer, which is used to obtain a new frame representation of the audio signal. This representation has been proven to be useful for the task of language identification (LID). Thus, bottleneck features are used as input to the language recognition system, instead of a classical parameterization of the signal based on cepstral feature vectors such as MFCCs (Mel Frequency Cepstral Coefficients). Despite the success of this approach in language recognition, there is a lack of studies analyzing in a systematic way how the topology of the DNN influences the performance of bottleneck feature-based language recognition systems. In this work, we try to fill-in this gap, analyzing language recognition results with different topologies for the DNN used to extract the bottleneck features, comparing them and against a reference system based on a more classical cepstral representation of the input signal with a total variability model. This way, we obtain useful knowledge about how the DNN configuration influences bottleneck feature-based language recognition systems performance.

  13. Automatic Speech Acquisition and Recognition for Spacesuit Audio Systems

    NASA Technical Reports Server (NTRS)

    Ye, Sherry

    2015-01-01

    NASA has a widely recognized but unmet need for novel human-machine interface technologies that can facilitate communication during astronaut extravehicular activities (EVAs), when loud noises and strong reverberations inside spacesuits make communication challenging. WeVoice, Inc., has developed a multichannel signal-processing method for speech acquisition in noisy and reverberant environments that enables automatic speech recognition (ASR) technology inside spacesuits. The technology reduces noise by exploiting differences between the statistical nature of signals (i.e., speech) and noise that exists in the spatial and temporal domains. As a result, ASR accuracy can be improved to the level at which crewmembers will find the speech interface useful. System components and features include beam forming/multichannel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, and ASR decoding. Arithmetic complexity models were developed and will help designers of real-time ASR systems select proper tasks when confronted with constraints in computational resources. In Phase I of the project, WeVoice validated the technology. The company further refined the technology in Phase II and developed a prototype for testing and use by suited astronauts.

  14. Grading Multiple Choice Exams with Low-Cost and Portable Computer-Vision Techniques

    NASA Astrophysics Data System (ADS)

    Fisteus, Jesus Arias; Pardo, Abelardo; García, Norberto Fernández

    2013-08-01

    Although technology for automatic grading of multiple choice exams has existed for several decades, it is not yet as widely available or affordable as it should be. The main reasons preventing this adoption are the cost and the complexity of the setup procedures. In this paper, Eyegrade, a system for automatic grading of multiple choice exams is presented. While most current solutions are based on expensive scanners, Eyegrade offers a truly low-cost solution requiring only a regular off-the-shelf webcam. Additionally, Eyegrade performs both mark recognition as well as optical character recognition of handwritten student identification numbers, which avoids the use of bubbles in the answer sheet. When compared with similar webcam-based systems, the user interface in Eyegrade has been designed to provide a more efficient and error-free data collection procedure. The tool has been validated with a set of experiments that show the ease of use (both setup and operation), the reduction in grading time, and an increase in the reliability of the results when compared with conventional, more expensive systems.

  15. Automatic event recognition and anomaly detection with attribute grammar by learning scene semantics

    NASA Astrophysics Data System (ADS)

    Qi, Lin; Yao, Zhenyu; Li, Li; Dong, Junyu

    2007-11-01

    In this paper we present a novel framework for automatic event recognition and abnormal behavior detection with attribute grammar by learning scene semantics. This framework combines learning scene semantics by trajectory analysis and constructing attribute grammar-based event representation. The scene and event information is learned automatically. Abnormal behaviors that disobey scene semantics or event grammars rules are detected. By this method, an approach to understanding video scenes is achieved. Further more, with this prior knowledge, the accuracy of abnormal event detection is increased.

  16. Study to determine potential flight applications and human factors design guidelines for voice recognition and synthesis systems

    NASA Astrophysics Data System (ADS)

    White, R. W.; Parks, D. L.

    1985-07-01

    A study was conducted to determine potential commercial aircraft flight deck applications and implementation guidelines for voice recognition and synthesis. At first, a survey of voice recognition and synthesis technology was undertaken to develop a working knowledge base. Then, numerous potential aircraft and simulator flight deck voice applications were identified and each proposed application was rated on a number of criteria in order to achieve an overall payoff rating. The potential voice recognition applications fell into five general categories: programming, interrogation, data entry, switch and mode selection, and continuous/time-critical action control. The ratings of the first three categories showed the most promise of being beneficial to flight deck operations. Possible applications of voice synthesis systems were categorized as automatic or pilot selectable and many were rated as being potentially beneficial. In addition, voice system implementation guidelines and pertinent performance criteria are proposed. Finally, the findings of this study are compared with those made in a recent NASA study of a 1995 transport concept.

  17. Recognition of Arabic Sign Language Alphabet Using Polynomial Classifiers

    NASA Astrophysics Data System (ADS)

    Assaleh, Khaled; Al-Rousan, M.

    2005-12-01

    Building an accurate automatic sign language recognition system is of great importance in facilitating efficient communication with deaf people. In this paper, we propose the use of polynomial classifiers as a classification engine for the recognition of Arabic sign language (ArSL) alphabet. Polynomial classifiers have several advantages over other classifiers in that they do not require iterative training, and that they are highly computationally scalable with the number of classes. Based on polynomial classifiers, we have built an ArSL system and measured its performance using real ArSL data collected from deaf people. We show that the proposed system provides superior recognition results when compared with previously published results using ANFIS-based classification on the same dataset and feature extraction methodology. The comparison is shown in terms of the number of misclassified test patterns. The reduction in the rate of misclassified patterns was very significant. In particular, we have achieved a 36% reduction of misclassifications on the training data and 57% on the test data.

  18. Study to determine potential flight applications and human factors design guidelines for voice recognition and synthesis systems

    NASA Technical Reports Server (NTRS)

    White, R. W.; Parks, D. L.

    1985-01-01

    A study was conducted to determine potential commercial aircraft flight deck applications and implementation guidelines for voice recognition and synthesis. At first, a survey of voice recognition and synthesis technology was undertaken to develop a working knowledge base. Then, numerous potential aircraft and simulator flight deck voice applications were identified and each proposed application was rated on a number of criteria in order to achieve an overall payoff rating. The potential voice recognition applications fell into five general categories: programming, interrogation, data entry, switch and mode selection, and continuous/time-critical action control. The ratings of the first three categories showed the most promise of being beneficial to flight deck operations. Possible applications of voice synthesis systems were categorized as automatic or pilot selectable and many were rated as being potentially beneficial. In addition, voice system implementation guidelines and pertinent performance criteria are proposed. Finally, the findings of this study are compared with those made in a recent NASA study of a 1995 transport concept.

  19. Gabor filter based fingerprint image enhancement

    NASA Astrophysics Data System (ADS)

    Wang, Jin-Xiang

    2013-03-01

    Fingerprint recognition technology has become the most reliable biometric technology due to its uniqueness and invariance, which has been most convenient and most reliable technique for personal authentication. The development of Automated Fingerprint Identification System is an urgent need for modern information security. Meanwhile, fingerprint preprocessing algorithm of fingerprint recognition technology has played an important part in Automatic Fingerprint Identification System. This article introduces the general steps in the fingerprint recognition technology, namely the image input, preprocessing, feature recognition, and fingerprint image enhancement. As the key to fingerprint identification technology, fingerprint image enhancement affects the accuracy of the system. It focuses on the characteristics of the fingerprint image, Gabor filters algorithm for fingerprint image enhancement, the theoretical basis of Gabor filters, and demonstration of the filter. The enhancement algorithm for fingerprint image is in the windows XP platform with matlab.65 as a development tool for the demonstration. The result shows that the Gabor filter is effective in fingerprint image enhancement technology.

  20. Automatic concept extraction from spoken medical reports.

    PubMed

    Happe, André; Pouliquen, Bruno; Burgun, Anita; Cuggia, Marc; Le Beux, Pierre

    2003-07-01

    The objective of this project is to investigate methods whereby a combination of speech recognition and automated indexing methods substitute for current transcription and indexing practices. We based our study on existing speech recognition software programs and on NOMINDEX, a tool that extracts MeSH concepts from medical text in natural language and that is mainly based on a French medical lexicon and on the UMLS. For each document, the process consists of three steps: (1) dictation and digital audio recording, (2) speech recognition, (3) automatic indexing. The evaluation consisted of a comparison between the set of concepts extracted by NOMINDEX after the speech recognition phase and the set of keywords manually extracted from the initial document. The method was evaluated on a set of 28 patient discharge summaries extracted from the MENELAS corpus in French, corresponding to in-patients admitted for coronarography. The overall precision was 73% and the overall recall was 90%. Indexing errors were mainly due to word sense ambiguity and abbreviations. A specific issue was the fact that the standard French translation of MeSH terms lacks diacritics. A preliminary evaluation of speech recognition tools showed that the rate of accurate recognition was higher than 98%. Only 3% of the indexing errors were generated by inadequate speech recognition. We discuss several areas to focus on to improve this prototype. However, the very low rate of indexing errors due to speech recognition errors highlights the potential benefits of combining speech recognition techniques and automatic indexing.

  1. Automatic Activation of Phonological Code during Visual Word Recognition in Children: A Masked Priming Study in Grades 3 and 5

    ERIC Educational Resources Information Center

    Sauval, Karinne; Perre, Laetitia; Casalis, Séverine

    2017-01-01

    The present study aimed to investigate the development of automatic phonological processes involved in visual word recognition during reading acquisition in French. A visual masked priming lexical decision experiment was carried out with third, fifth graders and adult skilled readers. Three different types of partial overlap between the prime and…

  2. The Use of an Autonomous Pedagogical Agent and Automatic Speech Recognition for Teaching Sight Words to Students with Autism Spectrum Disorder

    ERIC Educational Resources Information Center

    Saadatzi, Mohammad Nasser; Pennington, Robert C.; Welch, Karla C.; Graham, James H.; Scott, Renee E.

    2017-01-01

    In the current study, we examined the effects of an instructional package comprised of an autonomous pedagogical agent, automatic speech recognition, and constant time delay during the instruction of reading sight words aloud to young adults with autism spectrum disorder. We used a concurrent multiple baseline across participants design to…

  3. An Exploration of the Potential of Automatic Speech Recognition to Assist and Enable Receptive Communication in Higher Education

    ERIC Educational Resources Information Center

    Wald, Mike

    2006-01-01

    The potential use of Automatic Speech Recognition to assist receptive communication is explored. The opportunities and challenges that this technology presents students and staff to provide captioning of speech online or in classrooms for deaf or hard of hearing students and assist blind, visually impaired or dyslexic learners to read and search…

  4. Electrooculography-based continuous eye-writing recognition system for efficient assistive communication systems

    PubMed Central

    Shinozaki, Takahiro

    2018-01-01

    Human-computer interface systems whose input is based on eye movements can serve as a means of communication for patients with locked-in syndrome. Eye-writing is one such system; users can input characters by moving their eyes to follow the lines of the strokes corresponding to characters. Although this input method makes it easy for patients to get started because of their familiarity with handwriting, existing eye-writing systems suffer from slow input rates because they require a pause between input characters to simplify the automatic recognition process. In this paper, we propose a continuous eye-writing recognition system that achieves a rapid input rate because it accepts characters eye-written continuously, with no pauses. For recognition purposes, the proposed system first detects eye movements using electrooculography (EOG), and then a hidden Markov model (HMM) is applied to model the EOG signals and recognize the eye-written characters. Additionally, this paper investigates an EOG adaptation that uses a deep neural network (DNN)-based HMM. Experiments with six participants showed an average input speed of 27.9 character/min using Japanese Katakana as the input target characters. A Katakana character-recognition error rate of only 5.0% was achieved using 13.8 minutes of adaptation data. PMID:29425248

  5. Classification of C2C12 cells at differentiation by convolutional neural network of deep learning using phase contrast images.

    PubMed

    Niioka, Hirohiko; Asatani, Satoshi; Yoshimura, Aina; Ohigashi, Hironori; Tagawa, Seiichi; Miyake, Jun

    2018-01-01

    In the field of regenerative medicine, tremendous numbers of cells are necessary for tissue/organ regeneration. Today automatic cell-culturing system has been developed. The next step is constructing a non-invasive method to monitor the conditions of cells automatically. As an image analysis method, convolutional neural network (CNN), one of the deep learning method, is approaching human recognition level. We constructed and applied the CNN algorithm for automatic cellular differentiation recognition of myogenic C2C12 cell line. Phase-contrast images of cultured C2C12 are prepared as input dataset. In differentiation process from myoblasts to myotubes, cellular morphology changes from round shape to elongated tubular shape due to fusion of the cells. CNN abstract the features of the shape of the cells and classify the cells depending on the culturing days from when differentiation is induced. Changes in cellular shape depending on the number of days of culture (Day 0, Day 3, Day 6) are classified with 91.3% accuracy. Image analysis with CNN has a potential to realize regenerative medicine industry.

  6. Automatic recognition of 3D GGO CT imaging signs through the fusion of hybrid resampling and layer-wise fine-tuning CNNs.

    PubMed

    Han, Guanghui; Liu, Xiabi; Zheng, Guangyuan; Wang, Murong; Huang, Shan

    2018-06-06

    Ground-glass opacity (GGO) is a common CT imaging sign on high-resolution CT, which means the lesion is more likely to be malignant compared to common solid lung nodules. The automatic recognition of GGO CT imaging signs is of great importance for early diagnosis and possible cure of lung cancers. The present GGO recognition methods employ traditional low-level features and system performance improves slowly. Considering the high-performance of CNN model in computer vision field, we proposed an automatic recognition method of 3D GGO CT imaging signs through the fusion of hybrid resampling and layer-wise fine-tuning CNN models in this paper. Our hybrid resampling is performed on multi-views and multi-receptive fields, which reduces the risk of missing small or large GGOs by adopting representative sampling panels and processing GGOs with multiple scales simultaneously. The layer-wise fine-tuning strategy has the ability to obtain the optimal fine-tuning model. Multi-CNN models fusion strategy obtains better performance than any single trained model. We evaluated our method on the GGO nodule samples in publicly available LIDC-IDRI dataset of chest CT scans. The experimental results show that our method yields excellent results with 96.64% sensitivity, 71.43% specificity, and 0.83 F1 score. Our method is a promising approach to apply deep learning method to computer-aided analysis of specific CT imaging signs with insufficient labeled images. Graphical abstract We proposed an automatic recognition method of 3D GGO CT imaging signs through the fusion of hybrid resampling and layer-wise fine-tuning CNN models in this paper. Our hybrid resampling reduces the risk of missing small or large GGOs by adopting representative sampling panels and processing GGOs with multiple scales simultaneously. The layer-wise fine-tuning strategy has ability to obtain the optimal fine-tuning model. Our method is a promising approach to apply deep learning method to computer-aided analysis of specific CT imaging signs with insufficient labeled images.

  7. Research Directory for Manpower, Personnel, Training, and Human Factors.

    DTIC Science & Technology

    1991-01-01

    Enhance Automatic Recognition of Speech in Noisy, Highly Stressful Environments Cofod R* Lica Systems Inc 703-359-0996 Smart Contract Preparation...Lab 301-278-2946 Smart Contract Preparation Expediter Frezell T LTCOL Human Engineering Lab 301-278-5998 Impulse Noise Hazard Information Processing R&D

  8. Automatic Recognition and Understanding of the Driving Environment for Driver Feedback

    DOT National Transportation Integrated Search

    2018-01-01

    A smart driving system must consider two key elements to be able to generate recommendations and make driving decisions that are effective and accurate: The environment of the car and the behavior of the driver. Our long-term goal is to develop techn...

  9. Quality based approach for adaptive face recognition

    NASA Astrophysics Data System (ADS)

    Abboud, Ali J.; Sellahewa, Harin; Jassim, Sabah A.

    2009-05-01

    Recent advances in biometric technology have pushed towards more robust and reliable systems. We aim to build systems that have low recognition errors and are less affected by variation in recording conditions. Recognition errors are often attributed to the usage of low quality biometric samples. Hence, there is a need to develop new intelligent techniques and strategies to automatically measure/quantify the quality of biometric image samples and if necessary restore image quality according to the need of the intended application. In this paper, we present no-reference image quality measures in the spatial domain that have impact on face recognition. The first is called symmetrical adaptive local quality index (SALQI) and the second is called middle halve (MH). Also, an adaptive strategy has been developed to select the best way to restore the image quality, called symmetrical adaptive histogram equalization (SAHE). The main benefits of using quality measures for adaptive strategy are: (1) avoidance of excessive unnecessary enhancement procedures that may cause undesired artifacts, and (2) reduced computational complexity which is essential for real time applications. We test the success of the proposed measures and adaptive approach for a wavelet-based face recognition system that uses the nearest neighborhood classifier. We shall demonstrate noticeable improvements in the performance of adaptive face recognition system over the corresponding non-adaptive scheme.

  10. Automatic recognition of seismic intensity based on RS and GIS: a case study in Wenchuan Ms8.0 earthquake of China.

    PubMed

    Zhang, Qiuwen; Zhang, Yan; Yang, Xiaohong; Su, Bin

    2014-01-01

    In recent years, earthquakes have frequently occurred all over the world, which caused huge casualties and economic losses. It is very necessary and urgent to obtain the seismic intensity map timely so as to master the distribution of the disaster and provide supports for quick earthquake relief. Compared with traditional methods of drawing seismic intensity map, which require many investigations in the field of earthquake area or are too dependent on the empirical formulas, spatial information technologies such as Remote Sensing (RS) and Geographical Information System (GIS) can provide fast and economical way to automatically recognize the seismic intensity. With the integrated application of RS and GIS, this paper proposes a RS/GIS-based approach for automatic recognition of seismic intensity, in which RS is used to retrieve and extract the information on damages caused by earthquake, and GIS is applied to manage and display the data of seismic intensity. The case study in Wenchuan Ms8.0 earthquake in China shows that the information on seismic intensity can be automatically extracted from remotely sensed images as quickly as possible after earthquake occurrence, and the Digital Intensity Model (DIM) can be used to visually query and display the distribution of seismic intensity.

  11. TU-C-17A-03: An Integrated Contour Evaluation Software Tool Using Supervised Pattern Recognition for Radiotherapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, H; Tan, J; Kavanaugh, J

    Purpose: Radiotherapy (RT) contours delineated either manually or semiautomatically require verification before clinical usage. Manual evaluation is very time consuming. A new integrated software tool using supervised pattern contour recognition was thus developed to facilitate this process. Methods: The contouring tool was developed using an object-oriented programming language C# and application programming interfaces, e.g. visualization toolkit (VTK). The C# language served as the tool design basis. The Accord.Net scientific computing libraries were utilized for the required statistical data processing and pattern recognition, while the VTK was used to build and render 3-D mesh models from critical RT structures in real-timemore » and 360° visualization. Principal component analysis (PCA) was used for system self-updating geometry variations of normal structures based on physician-approved RT contours as a training dataset. The inhouse design of supervised PCA-based contour recognition method was used for automatically evaluating contour normality/abnormality. The function for reporting the contour evaluation results was implemented by using C# and Windows Form Designer. Results: The software input was RT simulation images and RT structures from commercial clinical treatment planning systems. Several abilities were demonstrated: automatic assessment of RT contours, file loading/saving of various modality medical images and RT contours, and generation/visualization of 3-D images and anatomical models. Moreover, it supported the 360° rendering of the RT structures in a multi-slice view, which allows physicians to visually check and edit abnormally contoured structures. Conclusion: This new software integrates the supervised learning framework with image processing and graphical visualization modules for RT contour verification. This tool has great potential for facilitating treatment planning with the assistance of an automatic contour evaluation module in avoiding unnecessary manual verification for physicians/dosimetrists. In addition, its nature as a compact and stand-alone tool allows for future extensibility to include additional functions for physicians’ clinical needs.« less

  12. Face recognition in the thermal infrared domain

    NASA Astrophysics Data System (ADS)

    Kowalski, M.; Grudzień, A.; Palka, N.; Szustakowski, M.

    2017-10-01

    Biometrics refers to unique human characteristics. Each unique characteristic may be used to label and describe individuals and for automatic recognition of a person based on physiological or behavioural properties. One of the most natural and the most popular biometric trait is a face. The most common research methods on face recognition are based on visible light. State-of-the-art face recognition systems operating in the visible light spectrum achieve very high level of recognition accuracy under controlled environmental conditions. Thermal infrared imagery seems to be a promising alternative or complement to visible range imaging due to its relatively high resistance to illumination changes. A thermal infrared image of the human face presents its unique heat-signature and can be used for recognition. The characteristics of thermal images maintain advantages over visible light images, and can be used to improve algorithms of human face recognition in several aspects. Mid-wavelength or far-wavelength infrared also referred to as thermal infrared seems to be promising alternatives. We present the study on 1:1 recognition in thermal infrared domain. The two approaches we are considering are stand-off face verification of non-moving person as well as stop-less face verification on-the-move. The paper presents methodology of our studies and challenges for face recognition systems in the thermal infrared domain.

  13. Computational Modeling of Emotions and Affect in Social-Cultural Interaction

    DTIC Science & Technology

    2013-10-02

    acoustic and textual information sources. Second, a cross-lingual study was performed that shed light on how human perception and automatic recognition...speech is produced, a speaker’s pitch and intonational pattern, and word usage. Better feature representation and advanced approaches were used to...recognition performance, and improved our understanding of language/cultural impact on human perception of emotion and automatic classification. • Units

  14. Geophysical phenomena classification by artificial neural networks

    NASA Technical Reports Server (NTRS)

    Gough, M. P.; Bruckner, J. R.

    1995-01-01

    Space science information systems involve accessing vast data bases. There is a need for an automatic process by which properties of the whole data set can be assimilated and presented to the user. Where data are in the form of spectrograms, phenomena can be detected by pattern recognition techniques. Presented are the first results obtained by applying unsupervised Artificial Neural Networks (ANN's) to the classification of magnetospheric wave spectra. The networks used here were a simple unsupervised Hamming network run on a PC and a more sophisticated CALM network run on a Sparc workstation. The ANN's were compared in their geophysical data recognition performance. CALM networks offer such qualities as fast learning, superiority in generalizing, the ability to continuously adapt to changes in the pattern set, and the possibility to modularize the network to allow the inter-relation between phenomena and data sets. This work is the first step toward an information system interface being developed at Sussex, the Whole Information System Expert (WISE). Phenomena in the data are automatically identified and provided to the user in the form of a data occurrence morphology, the Whole Information System Data Occurrence Morphology (WISDOM), along with relationships to other parameters and phenomena.

  15. Modeling the Perceptual Learning of Novel Dialect Features

    ERIC Educational Resources Information Center

    Tatman, Rachael

    2017-01-01

    All language use reflects the user's social identity in systematic ways. While humans can easily adapt to this sociolinguistic variation, automatic speech recognition (ASR) systems continue to struggle with it. This dissertation makes three main contributions. The first is to provide evidence that modern state-of-the-art commercial ASR systems…

  16. Polarimetric Imaging System for Automatic Target Detection and Recognition

    DTIC Science & Technology

    2000-03-01

    technique shown in Figure 4(b) can also be used to integrate polarizer arrays with other types of imaging sensors, such as LWIR cameras and uncooled...vertical stripe pattern in this φ image is caused by nonuniformities in the particular polarizer array used. 2. CIRCULAR POLARIZATION IMAGING USING

  17. Semi-automated identification of leopard frogs

    USGS Publications Warehouse

    Petrovska-Delacrétaz, Dijana; Edwards, Aaron; Chiasson, John; Chollet, Gérard; Pilliod, David S.

    2014-01-01

    Principal component analysis is used to implement a semi-automatic recognition system to identify recaptured northern leopard frogs (Lithobates pipiens). Results of both open set and closed set experiments are given. The presented algorithm is shown to provide accurate identification of 209 individual leopard frogs from a total set of 1386 images.

  18. Towards Automatic Threat Recognition

    DTIC Science & Technology

    2006-12-01

    York: Bantam. Forschungsinstitut für Kommunikation, Informationsverarbeitung und Ergonomie FGAN Informationstechnik und Führungssysteme KIE Towards...Informationsverarbeitung und Ergonomie FGAN Informationstechnik und Führungssysteme KIE Content Preliminaries about Information Fusion The System Ontology Unification...as Processing Principle Back to the Example Conclusion and Outlook Forschungsinstitut für Kommunikation, Informationsverarbeitung und Ergonomie FGAN

  19. Use of Computer Speech Technologies To Enhance Learning.

    ERIC Educational Resources Information Center

    Ferrell, Joe

    1999-01-01

    Discusses the design of an innovative learning system that uses new technologies for the man-machine interface, incorporating a combination of Automatic Speech Recognition (ASR) and Text To Speech (TTS) synthesis. Highlights include using speech technologies to mimic the attributes of the ideal tutor and design features. (AEF)

  20. Dance recognition system using lower body movement.

    PubMed

    Simpson, Travis T; Wiesner, Susan L; Bennett, Bradford C

    2014-02-01

    The current means of locating specific movements in film necessitate hours of viewing, making the task of conducting research into movement characteristics and patterns tedious and difficult. This is particularly problematic for the research and analysis of complex movement systems such as sports and dance. While some systems have been developed to manually annotate film, to date no automated way of identifying complex, full body movement exists. With pattern recognition technology and knowledge of joint locations, automatically describing filmed movement using computer software is possible. This study used various forms of lower body kinematic analysis to identify codified dance movements. We created an algorithm that compares an unknown move with a specified start and stop against known dance moves. Our recognition method consists of classification and template correlation using a database of model moves. This system was optimized to include nearly 90 dance and Tai Chi Chuan movements, producing accurate name identification in over 97% of trials. In addition, the program had the capability to provide a kinematic description of either matched or unmatched moves obtained from classification recognition.

  1. Speech recognition systems on the Cell Broadband Engine

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, Y; Jones, H; Vaidya, S

    In this paper we describe our design, implementation, and first results of a prototype connected-phoneme-based speech recognition system on the Cell Broadband Engine{trademark} (Cell/B.E.). Automatic speech recognition decodes speech samples into plain text (other representations are possible) and must process samples at real-time rates. Fortunately, the computational tasks involved in this pipeline are highly data-parallel and can receive significant hardware acceleration from vector-streaming architectures such as the Cell/B.E. Identifying and exploiting these parallelism opportunities is challenging, but also critical to improving system performance. We observed, from our initial performance timings, that a single Cell/B.E. processor can recognize speech from thousandsmore » of simultaneous voice channels in real time--a channel density that is orders-of-magnitude greater than the capacity of existing software speech recognizers based on CPUs (central processing units). This result emphasizes the potential for Cell/B.E.-based speech recognition and will likely lead to the future development of production speech systems using Cell/B.E. clusters.« less

  2. System transfer modelling for automatic target recognizer evaluations

    NASA Astrophysics Data System (ADS)

    Clark, Lloyd G.

    1991-11-01

    Image processing to accomplish automatic recognition of military vehicles has promised increased weapons systems effectiveness and reduced timelines for a number of Department of Defense missions. Automatic Target Recognizers (ATR) are often claimed to be able to recognize many different ground vehicles as possible targets in military air-to- surface targeting applications. The targeting scenario conditions include different vehicle poses and histories as well as a variety of imaging geometries, intervening atmospheres, and background environments. Testing these ATR subsystems in most cases has been limited to a handful of the scenario conditions of interest, as is represented by imagery collected with the desired imaging sensor. The question naturally arises as to how robust the performance of the ATR is for all scenario conditions of interest, not just for the set of imagery upon which an algorithm was trained.

  3. Automated recognition and extraction of tabular fields for the indexing of census records

    NASA Astrophysics Data System (ADS)

    Clawson, Robert; Bauer, Kevin; Chidester, Glen; Pohontsch, Milan; Kennard, Douglas; Ryu, Jongha; Barrett, William

    2013-01-01

    We describe a system for indexing of census records in tabular documents with the goal of recognizing the content of each cell, including both headers and handwritten entries. Each document is automatically rectified, registered and scaled to a known template following which lines and fields are detected and delimited as cells in a tabular form. Whole-word or whole-phrase recognition of noisy machine-printed text is performed using a glyph library, providing greatly increased efficiency and accuracy (approaching 100%), while avoiding the problems inherent with traditional OCR approaches. Constrained handwriting recognition results for a single author reach as high as 98% and 94.5% for the Gender field and Birthplace respectively. Multi-author accuracy (currently 82%) can be improved through an increased training set. Active integration of user feedback in the system will accelerate the indexing of records while providing a tightly coupled learning mechanism for system improvement.

  4. A neural approach for improving the measurement capability of an electronic nose

    NASA Astrophysics Data System (ADS)

    Chimenti, M.; DeRossi, D.; Di Francesco, F.; Domenici, C.; Pieri, G.; Pioggia, G.; Salvetti, O.

    2003-06-01

    Electronic noses, instruments for automatic recognition of odours, are typically composed of an array of partially selective sensors, a sampling system, a data acquisition device and a data processing system. For the purpose of evaluating the quality of olive oil, an electronic nose based on an array of conducting polymer sensors capable of discriminating olive oil aromas was developed. The selection of suitable pattern recognition techniques for a particular application can enhance the performance of electronic noses. Therefore, an advanced neural recognition algorithm for improving the measurement capability of the device was designed and implemented. This method combines multivariate statistical analysis and a hierarchical neural-network architecture based on self-organizing maps and error back-propagation. The complete system was tested using samples composed of characteristic olive oil aromatic components in refined olive oil. The results obtained have shown that this approach is effective in grouping aromas into different categories representative of their chemical structure.

  5. The impact of OCR accuracy on automated cancer classification of pathology reports.

    PubMed

    Zuccon, Guido; Nguyen, Anthony N; Bergheim, Anton; Wickman, Sandra; Grayson, Narelle

    2012-01-01

    To evaluate the effects of Optical Character Recognition (OCR) on the automatic cancer classification of pathology reports. Scanned images of pathology reports were converted to electronic free-text using a commercial OCR system. A state-of-the-art cancer classification system, the Medical Text Extraction (MEDTEX) system, was used to automatically classify the OCR reports. Classifications produced by MEDTEX on the OCR versions of the reports were compared with the classification from a human amended version of the OCR reports. The employed OCR system was found to recognise scanned pathology reports with up to 99.12% character accuracy and up to 98.95% word accuracy. Errors in the OCR processing were found to minimally impact on the automatic classification of scanned pathology reports into notifiable groups. However, the impact of OCR errors is not negligible when considering the extraction of cancer notification items, such as primary site, histological type, etc. The automatic cancer classification system used in this work, MEDTEX, has proven to be robust to errors produced by the acquisition of freetext pathology reports from scanned images through OCR software. However, issues emerge when considering the extraction of cancer notification items.

  6. Face Recognition in Humans and Machines

    NASA Astrophysics Data System (ADS)

    O'Toole, Alice; Tistarelli, Massimo

    The study of human face recognition by psychologists and neuroscientists has run parallel to the development of automatic face recognition technologies by computer scientists and engineers. In both cases, there are analogous steps of data acquisition, image processing, and the formation of representations that can support the complex and diverse tasks we accomplish with faces. These processes can be understood and compared in the context of their neural and computational implementations. In this chapter, we present the essential elements of face recognition by humans and machines, taking a perspective that spans psychological, neural, and computational approaches. From the human side, we overview the methods and techniques used in the neurobiology of face recognition, the underlying neural architecture of the system, the role of visual attention, and the nature of the representations that emerges. From the computational side, we discuss face recognition technologies and the strategies they use to overcome challenges to robust operation over viewing parameters. Finally, we conclude the chapter with a look at some recent studies that compare human and machine performances at face recognition.

  7. Image processing and machine learning in the morphological analysis of blood cells.

    PubMed

    Rodellar, J; Alférez, S; Acevedo, A; Molina, A; Merino, A

    2018-05-01

    This review focuses on how image processing and machine learning can be useful for the morphological characterization and automatic recognition of cell images captured from peripheral blood smears. The basics of the 3 core elements (segmentation, quantitative features, and classification) are outlined, and recent literature is discussed. Although red blood cells are a significant part of this context, this study focuses on malignant lymphoid cells and blast cells. There is no doubt that these technologies may help the cytologist to perform efficient, objective, and fast morphological analysis of blood cells. They may also help in the interpretation of some morphological features and may serve as learning and survey tools. Although research is still needed, it is important to define screening strategies to exploit the potential of image-based automatic recognition systems integrated in the daily routine of laboratories along with other analysis methodologies. © 2018 John Wiley & Sons Ltd.

  8. Convolutional neural networks with balanced batches for facial expressions recognition

    NASA Astrophysics Data System (ADS)

    Battini Sönmez, Elena; Cangelosi, Angelo

    2017-03-01

    This paper considers the issue of fully automatic emotion classification on 2D faces. In spite of the great effort done in recent years, traditional machine learning approaches based on hand-crafted feature extraction followed by the classification stage failed to develop a real-time automatic facial expression recognition system. The proposed architecture uses Convolutional Neural Networks (CNN), which are built as a collection of interconnected processing elements to simulate the brain of human beings. The basic idea of CNNs is to learn a hierarchical representation of the input data, which results in a better classification performance. In this work we present a block-based CNN algorithm, which uses noise, as data augmentation technique, and builds batches with a balanced number of samples per class. The proposed architecture is a very simple yet powerful CNN, which can yield state-of-the-art accuracy on the very competitive benchmark algorithm of the Extended Cohn Kanade database.

  9. Real-time Flare Detection in Ground-Based Hα Imaging at Kanzelhöhe Observatory

    NASA Astrophysics Data System (ADS)

    Pötzi, W.; Veronig, A. M.; Riegler, G.; Amerstorfer, U.; Pock, T.; Temmer, M.; Polanec, W.; Baumgartner, D. J.

    2015-03-01

    Kanzelhöhe Observatory (KSO) regularly performs high-cadence full-disk imaging of the solar chromosphere in the Hα and Ca ii K spectral lines as well as in the solar photosphere in white light. In the frame of ESA's (European Space Agency) Space Situational Awareness (SSA) program, a new system for real-time Hα data provision and automatic flare detection was developed at KSO. The data and events detected are published in near real-time at ESA's SSA Space Weather portal (http://swe.ssa.esa.int/web/guest/kso-federated). In this article, we describe the Hα instrument, the image-recognition algorithms we developed, and the implementation into the KSO Hα observing system. We also present the evaluation results of the real-time data provision and flare detection for a period of five months. The Hα data provision worked in 99.96 % of the images, with a mean time lag of four seconds between image recording and online provision. Within the given criteria for the automatic image-recognition system (at least three Hα images are needed for a positive detection), all flares with an area ≥ 50 micro-hemispheres that were located within 60° of the solar center and occurred during the KSO observing times were detected, a number of 87 events in total. The automatically determined flare importance and brightness classes were correct in ˜ 85 %. The mean flare positions in heliographic longitude and latitude were correct to within ˜ 1°. The median of the absolute differences for the flare start and peak times from the automatic detections in comparison with the official NOAA (and KSO) visual flare reports were 3 min (1 min).

  10. Specific acoustic models for spontaneous and dictated style in indonesian speech recognition

    NASA Astrophysics Data System (ADS)

    Vista, C. B.; Satriawan, C. H.; Lestari, D. P.; Widyantoro, D. H.

    2018-03-01

    The performance of an automatic speech recognition system is affected by differences in speech style between the data the model is originally trained upon and incoming speech to be recognized. In this paper, the usage of GMM-HMM acoustic models for specific speech styles is investigated. We develop two systems for the experiments; the first employs a speech style classifier to predict the speech style of incoming speech, either spontaneous or dictated, then decodes this speech using an acoustic model specifically trained for that speech style. The second system uses both acoustic models to recognise incoming speech and decides upon a final result by calculating a confidence score of decoding. Results show that training specific acoustic models for spontaneous and dictated speech styles confers a slight recognition advantage as compared to a baseline model trained on a mixture of spontaneous and dictated training data. In addition, the speech style classifier approach of the first system produced slightly more accurate results than the confidence scoring employed in the second system.

  11. Construction, implementation and testing of an image identification system using computer vision methods for fruit flies with economic importance (Diptera: Tephritidae).

    PubMed

    Wang, Jiang-Ning; Chen, Xiao-Lin; Hou, Xin-Wen; Zhou, Li-Bing; Zhu, Chao-Dong; Ji, Li-Qiang

    2017-07-01

    Many species of Tephritidae are damaging to fruit, which might negatively impact international fruit trade. Automatic or semi-automatic identification of fruit flies are greatly needed for diagnosing causes of damage and quarantine protocols for economically relevant insects. A fruit fly image identification system named AFIS1.0 has been developed using 74 species belonging to six genera, which include the majority of pests in the Tephritidae. The system combines automated image identification and manual verification, balancing operability and accuracy. AFIS1.0 integrates image analysis and expert system into a content-based image retrieval framework. In the the automatic identification module, AFIS1.0 gives candidate identification results. Afterwards users can do manual selection based on comparing unidentified images with a subset of images corresponding to the automatic identification result. The system uses Gabor surface features in automated identification and yielded an overall classification success rate of 87% to the species level by Independent Multi-part Image Automatic Identification Test. The system is useful for users with or without specific expertise on Tephritidae in the task of rapid and effective identification of fruit flies. It makes the application of computer vision technology to fruit fly recognition much closer to production level. © 2016 Society of Chemical Industry. © 2016 Society of Chemical Industry.

  12. Towards automatic musical instrument timbre recognition

    NASA Astrophysics Data System (ADS)

    Park, Tae Hong

    This dissertation is comprised of two parts---focus on issues concerning research and development of an artificial system for automatic musical instrument timbre recognition and musical compositions. The technical part of the essay includes a detailed record of developed and implemented algorithms for feature extraction and pattern recognition. A review of existing literature introducing historical aspects surrounding timbre research, problems associated with a number of timbre definitions, and highlights of selected research activities that have had significant impact in this field are also included. The developed timbre recognition system follows a bottom-up, data-driven model that includes a pre-processing module, feature extraction module, and a RBF/EBF (Radial/Elliptical Basis Function) neural network-based pattern recognition module. 829 monophonic samples from 12 instruments have been chosen from the Peter Siedlaczek library (Best Service) and other samples from the Internet and personal collections. Significant emphasis has been put on feature extraction development and testing to achieve robust and consistent feature vectors that are eventually passed to the neural network module. In order to avoid a garbage-in-garbage-out (GIGO) trap and improve generality, extra care was taken in designing and testing the developed algorithms using various dynamics, different playing techniques, and a variety of pitches for each instrument with inclusion of attack and steady-state portions of a signal. Most of the research and development was conducted in Matlab. The compositional part of the essay includes brief introductions to "A d'Ess Are ," "Aboji," "48 13 N, 16 20 O," and "pH-SQ." A general outline pertaining to the ideas and concepts behind the architectural designs of the pieces including formal structures, time structures, orchestration methods, and pitch structures are also presented.

  13. Impact of translation on named-entity recognition in radiology texts

    PubMed Central

    Pedro, Vasco

    2017-01-01

    Abstract Radiology reports describe the results of radiography procedures and have the potential of being a useful source of information which can bring benefits to health care systems around the world. One way to automatically extract information from the reports is by using Text Mining tools. The problem is that these tools are mostly developed for English and reports are usually written in the native language of the radiologist, which is not necessarily English. This creates an obstacle to the sharing of Radiology information between different communities. This work explores the solution of translating the reports to English before applying the Text Mining tools, probing the question of what translation approach should be used. We created MRRAD (Multilingual Radiology Research Articles Dataset), a parallel corpus of Portuguese research articles related to Radiology and a number of alternative translations (human, automatic and semi-automatic) to English. This is a novel corpus which can be used to move forward the research on this topic. Using MRRAD we studied which kind of automatic or semi-automatic translation approach is more effective on the Named-entity recognition task of finding RadLex terms in the English version of the articles. Considering the terms extracted from human translations as our gold standard, we calculated how similar to this standard were the terms extracted using other translations. We found that a completely automatic translation approach using Google leads to F-scores (between 0.861 and 0.868, depending on the extraction approach) similar to the ones obtained through a more expensive semi-automatic translation approach using Unbabel (between 0.862 and 0.870). To better understand the results we also performed a qualitative analysis of the type of errors found in the automatic and semi-automatic translations. Database URL: https://github.com/lasigeBioTM/MRRAD PMID:29220455

  14. Cross spectral, active and passive approach to face recognition for improved performance

    NASA Astrophysics Data System (ADS)

    Grudzien, A.; Kowalski, M.; Szustakowski, M.

    2017-08-01

    Biometrics is a technique for automatic recognition of a person based on physiological or behavior characteristics. Since the characteristics used are unique, biometrics can create a direct link between a person and identity, based on variety of characteristics. The human face is one of the most important biometric modalities for automatic authentication. The most popular method of face recognition which relies on processing of visual information seems to be imperfect. Thermal infrared imagery may be a promising alternative or complement to visible range imaging due to its several reasons. This paper presents an approach of combining both methods.

  15. A new accurate pill recognition system using imprint information

    NASA Astrophysics Data System (ADS)

    Chen, Zhiyuan; Kamata, Sei-ichiro

    2013-12-01

    Great achievements in modern medicine benefit human beings. Also, it has brought about an explosive growth of pharmaceuticals that current in the market. In daily life, pharmaceuticals sometimes confuse people when they are found unlabeled. In this paper, we propose an automatic pill recognition technique to solve this problem. It functions mainly based on the imprint feature of the pills, which is extracted by proposed MSWT (modified stroke width transform) and described by WSC (weighted shape context). Experiments show that our proposed pill recognition method can reach an accurate rate up to 92.03% within top 5 ranks when trying to classify more than 10 thousand query pill images into around 2000 categories.

  16. Automatic Picking of Foraminifera: Design of the Foraminifera Image Recognition and Sorting Tool (FIRST) Prototype and Results of the Image Classification Scheme

    NASA Astrophysics Data System (ADS)

    de Garidel-Thoron, T.; Marchant, R.; Soto, E.; Gally, Y.; Beaufort, L.; Bolton, C. T.; Bouslama, M.; Licari, L.; Mazur, J. C.; Brutti, J. M.; Norsa, F.

    2017-12-01

    Foraminifera tests are the main proxy carriers for paleoceanographic reconstructions. Both geochemical and taxonomical studies require large numbers of tests to achieve statistical relevance. To date, the extraction of foraminifera from the sediment coarse fraction is still done by hand and thus time-consuming. Moreover, the recognition of morphotypes, ecologically relevant, requires some taxonomical skills not easily taught. The automatic recognition and extraction of foraminifera would largely help paleoceanographers to overcome these issues. Recent advances in automatic image classification using machine learning opens the way to automatic extraction of foraminifera. Here we detail progress on the design of an automatic picking machine as part of the FIRST project. The machine handles 30 pre-sieved samples (100-1000µm), separating them into individual particles (including foraminifera) and imaging each in pseudo-3D. The particles are classified and specimens of interest are sorted either for Individual Foraminifera Analyses (44 per slide) and/or for classical multiple analyses (8 morphological classes per slide, up to 1000 individuals per hole). The classification is based on machine learning using Convolutional Neural Networks (CNNs), similar to the approach used in the coccolithophorid imaging system SYRACO. To prove its feasibility, we built two training image datasets of modern planktonic foraminifera containing approximately 2000 and 5000 images each, corresponding to 15 & 25 morphological classes. Using a CNN with a residual topology (ResNet) we achieve over 95% correct classification for each dataset. We tested the network on 160,000 images from 45 depths of a sediment core from the Pacific ocean, for which we have human counts. The current algorithm is able to reproduce the downcore variability in both Globigerinoides ruber and the fragmentation index (r2 = 0.58 and 0.88 respectively). The FIRST prototype yields some promising results for high-resolution paleoceanographic studies and evolutionary studies.

  17. Target recognition of log-polar ladar range images using moment invariants

    NASA Astrophysics Data System (ADS)

    Xia, Wenze; Han, Shaokun; Cao, Jie; Yu, Haoyong

    2017-01-01

    The ladar range image has received considerable attentions in the automatic target recognition field. However, previous research does not cover target recognition using log-polar ladar range images. Therefore, we construct a target recognition system based on log-polar ladar range images in this paper. In this system combined moment invariants and backpropagation neural network are selected as shape descriptor and shape classifier, respectively. In order to fully analyze the effect of log-polar sampling pattern on recognition result, several comparative experiments based on simulated and real range images are carried out. Eventually, several important conclusions are drawn: (i) if combined moments are computed directly by log-polar range images, translation, rotation and scaling invariant properties of combined moments will be invalid (ii) when object is located in the center of field of view, recognition rate of log-polar range images is less sensitive to the changing of field of view (iii) as object position changes from center to edge of field of view, recognition performance of log-polar range images will decline dramatically (iv) log-polar range images has a better noise robustness than Cartesian range images. Finally, we give a suggestion that it is better to divide field of view into recognition area and searching area in the real application.

  18. Factors that influence the performance of experienced speech recognition users.

    PubMed

    Koester, Heidi Horstmann

    2006-01-01

    Performance on automatic speech recognition (ASR) systems for users with physical disabilities varies widely between individuals. The goal of this study was to discover some key factors that account for that variation. Using data from 23 experienced ASR users with physical disabilities, the effect of 20 different independent variables on recognition accuracy and text entry rate with ASR was measured using bivariate and multivariate analyses. The results show that use of appropriate correction strategies had the strongest influence on user performance with ASR. The amount of time the user spent on his or her computer, the user's manual typing speed, and the speed with which the ASR system recognized speech were all positively associated with better performance. The amount or perceived adequacy of ASR training did not have a significant impact on performance for this user group.

  19. Quest Hierarchy for Hyperspectral Face Recognition

    DTIC Science & Technology

    2011-03-01

    numerous face recognition algorithms available, several very good literature surveys are available that include Abate [29], Samal [110], Kong [18], Zou...Perception, Japan (January 1994). [110] Samal , Ashok and P. Iyengar, Automatic Recognition and Analysis of Human Faces and Facial Expressions: A Survey

  20. The Mechanical Recognition of Speech: Prospects for Use in the Teaching of Languages.

    ERIC Educational Resources Information Center

    Pulliam, Robert

    1970-01-01

    This paper begins with a brief account of the development of automatic speech recogniton (ASR) and then proceeds to an examination of ASR systems typical of the kind now in operation. It is stressed that such systems, although highly developed, do not recognize speech in the same sense as the human being does, and that they can not deal with a…

  1. Automatic Target Recognition Classification System Evaluation Methodology

    DTIC Science & Technology

    2002-09-01

    Testing Set of Two-Class XOR Data (250 Samples)......................................... 2-59 2.27 Decision Analysis Process Flow Chart...ROC curve meta - analysis , which is the estimation of the true ROC curve of a given diagnostic system through ROC analysis across many studies or...technique can be very effective in sensitivity analysis ; trying to determine which data points have the most effect on the solution, and in

  2. The Suitability of Cloud-Based Speech Recognition Engines for Language Learning

    ERIC Educational Resources Information Center

    Daniels, Paul; Iwago, Koji

    2017-01-01

    As online automatic speech recognition (ASR) engines become more accurate and more widely implemented with call software, it becomes important to evaluate the effectiveness and the accuracy of these recognition engines using authentic speech samples. This study investigates two of the most prominent cloud-based speech recognition engines--Apple's…

  3. Model-based vision system for automatic recognition of structures in dental radiographs

    NASA Astrophysics Data System (ADS)

    Acharya, Raj S.; Samarabandu, Jagath K.; Hausmann, E.; Allen, K. A.

    1991-07-01

    X-ray diagnosis of destructive periodontal disease requires assessing serial radiographs by an expert to determine the change in the distance between cemento-enamel junction (CEJ) and the bone crest. To achieve this without the subjectivity of a human expert, a knowledge based system is proposed to automatically locate the two landmarks which are the CEJ and the level of alveolar crest at its junction with the periodontal ligament space. This work is a part of an ongoing project to automatically measure the distance between CEJ and the bone crest along a line parallel to the axis of the tooth. The approach presented in this paper is based on identifying a prominent feature such as the tooth boundary using local edge detection and edge thresholding to establish a reference and then using model knowledge to process sub-regions in locating the landmarks. Segmentation techniques invoked around these regions consists of a neural-network like hierarchical refinement scheme together with local gradient extraction, multilevel thresholding and ridge tracking. Recognition accuracy is further improved by first locating the easily identifiable parts of the bone surface and the interface between the enamel and the dentine and then extending these boundaries towards the periodontal ligament space and the tooth boundary respectively. The system is realized as a collection of tools (or knowledge sources) for pre-processing, segmentation, primary and secondary feature detection and a control structure based on the blackboard model to coordinate the activities of these tools.

  4. A new fast scanning system for the measurement of large angle tracks in nuclear emulsions

    NASA Astrophysics Data System (ADS)

    Alexandrov, A.; Buonaura, A.; Consiglio, L.; D'Ambrosio, N.; De Lellis, G.; Di Crescenzo, A.; Di Marco, N.; Galati, G.; Lauria, A.; Montesi, M. C.; Pupilli, F.; Shchedrina, T.; Tioukov, V.; Vladymyrov, M.

    2015-11-01

    Nuclear emulsions have been widely used in particle physics to identify new particles through the observation of their decays thanks to their unique spatial resolution. Nevertheless, before the advent of automatic scanning systems, the emulsion analysis was very demanding in terms of well trained manpower. Due to this reason, they were gradually replaced by electronic detectors, until the '90s, when automatic microscopes started to be developed in Japan and in Europe. Automatic scanning was essential to conceive large scale emulsion-based neutrino experiments like CHORUS, DONUT and OPERA. Standard scanning systems have been initially designed to recognize tracks within a limited angular acceptance (θ lesssim 30°) where θ is the track angle with respect to a line perpendicular to the emulsion plane. In this paper we describe the implementation of a novel fast automatic scanning system aimed at extending the track recognition to the full angular range and improving the present scanning speed. Indeed, nuclear emulsions do not have any intrinsic limit to detect particle direction. Such improvement opens new perspectives to use nuclear emulsions in several fields in addition to large scale neutrino experiments, like muon radiography, medical applications and dark matter directional detection.

  5. Automatic Speech Recognition Predicts Speech Intelligibility and Comprehension for Listeners With Simulated Age-Related Hearing Loss.

    PubMed

    Fontan, Lionel; Ferrané, Isabelle; Farinas, Jérôme; Pinquier, Julien; Tardieu, Julien; Magnen, Cynthia; Gaillard, Pascal; Aumont, Xavier; Füllgrabe, Christian

    2017-09-18

    The purpose of this article is to assess speech processing for listeners with simulated age-related hearing loss (ARHL) and to investigate whether the observed performance can be replicated using an automatic speech recognition (ASR) system. The long-term goal of this research is to develop a system that will assist audiologists/hearing-aid dispensers in the fine-tuning of hearing aids. Sixty young participants with normal hearing listened to speech materials mimicking the perceptual consequences of ARHL at different levels of severity. Two intelligibility tests (repetition of words and sentences) and 1 comprehension test (responding to oral commands by moving virtual objects) were administered. Several language models were developed and used by the ASR system in order to fit human performances. Strong significant positive correlations were observed between human and ASR scores, with coefficients up to .99. However, the spectral smearing used to simulate losses in frequency selectivity caused larger declines in ASR performance than in human performance. Both intelligibility and comprehension scores for listeners with simulated ARHL are highly correlated with the performances of an ASR-based system. In the future, it needs to be determined if the ASR system is similarly successful in predicting speech processing in noise and by older people with ARHL.

  6. Can a CNN recognize Catalan diet?

    NASA Astrophysics Data System (ADS)

    Herruzo, P.; Bolaños, M.; Radeva, P.

    2016-10-01

    Nowadays, we can find several diseases related to the unhealthy diet habits of the population, such as diabetes, obesity, anemia, bulimia and anorexia. In many cases, these diseases are related to the food consumption of people. Mediterranean diet is scientifically known as a healthy diet that helps to prevent many metabolic diseases. In particular, our work focuses on the recognition of Mediterranean food and dishes. The development of this methodology would allow to analise the daily habits of users with wearable cameras, within the topic of lifelogging. By using automatic mechanisms we could build an objective tool for the analysis of the patient's behavior, allowing specialists to discover unhealthy food patterns and understand the user's lifestyle. With the aim to automatically recognize a complete diet, we introduce a challenging multi-labeled dataset related to Mediter-ranean diet called FoodCAT. The first type of label provided consists of 115 food classes with an average of 400 images per dish, and the second one consists of 12 food categories with an average of 3800 pictures per class. This dataset will serve as a basis for the development of automatic diet recognition. In this context, deep learning and more specifically, Convolutional Neural Networks (CNNs), currently are state-of-the-art methods for automatic food recognition. In our work, we compare several architectures for image classification, with the purpose of diet recognition. Applying the best model for recognising food categories, we achieve a top-1 accuracy of 72.29%, and top-5 of 97.07%. In a complete diet recognition of dishes from Mediterranean diet, enlarged with the Food-101 dataset for international dishes recognition, we achieve a top-1 accuracy of 68.07%, and top-5 of 89.53%, for a total of 115+101 food classes.

  7. Embodied Memory Judgments: A Case of Motor Fluency

    ERIC Educational Resources Information Center

    Yang, Shu-Ju; Gallo, David A.; Beilock, Sian L.

    2009-01-01

    It is well known that perceptual and conceptual fluency can influence episodic memory judgments. Here, the authors asked whether fluency arising from the motor system also impacts recognition memory. Past research has shown that the perception of letters automatically activates motor programs of typing actions in skilled typists. In this study,…

  8. Data Intensive Systems (DIS) Benchmark Performance Summary

    DTIC Science & Technology

    2003-08-01

    models assumed by today’s conventional architectures. Such applications include model- based Automatic Target Recognition (ATR), synthetic aperture...radar (SAR) codes, large scale dynamic databases/battlefield integration, dynamic sensor- based processing, high-speed cryptanalysis, high speed...distributed interactive and data intensive simulations, data-oriented problems characterized by pointer- based and other highly irregular data structures

  9. Validation of Automated Scoring of Oral Reading

    ERIC Educational Resources Information Center

    Balogh, Jennifer; Bernstein, Jared; Cheng, Jian; Van Moere, Alistair; Townshend, Brent; Suzuki, Masanori

    2012-01-01

    A two-part experiment is presented that validates a new measurement tool for scoring oral reading ability. Data collected by the U.S. government in a large-scale literacy assessment of adults were analyzed by a system called VersaReader that uses automatic speech recognition and speech processing technologies to score oral reading fluency. In the…

  10. The Influence of Anticipation of Word Misrecognition on the Likelihood of Stuttering

    ERIC Educational Resources Information Center

    Brocklehurst, Paul H.; Lickley, Robin J.; Corley, Martin

    2012-01-01

    This study investigates whether the experience of stuttering can result from the speaker's anticipation of his words being misrecognized. Twelve adults who stutter (AWS) repeated single words into what appeared to be an automatic speech-recognition system. Following each iteration of each word, participants provided a self-rating of whether they…

  11. Automatic vigilance for negative words in lexical decision and naming: comment on Larsen, Mercer, and Balota (2006).

    PubMed

    Estes, Zachary; Adelman, James S

    2008-08-01

    An automatic vigilance hypothesis states that humans preferentially attend to negative stimuli, and this attention to negative valence disrupts the processing of other stimulus properties. Thus, negative words typically elicit slower color naming, word naming, and lexical decisions than neutral or positive words. Larsen, Mercer, and Balota analyzed the stimuli from 32 published studies, and they found that word valence was confounded with several lexical factors known to affect word recognition. Indeed, with these lexical factors covaried out, Larsen et al. found no evidence of automatic vigilance. The authors report a more sensitive analysis of 1011 words. Results revealed a small but reliable valence effect, such that negative words (e.g., "shark") elicit slower lexical decisions and naming than positive words (e.g., "beach"). Moreover, the relation between valence and recognition was categorical rather than linear; the extremity of a word's valence did not affect its recognition. This valence effect was not attributable to word length, frequency, orthographic neighborhood size, contextual diversity, first phoneme, or arousal. Thus, the present analysis provides the most powerful demonstration of automatic vigilance to date.

  12. Neural speech recognition: continuous phoneme decoding using spatiotemporal representations of human cortical activity

    NASA Astrophysics Data System (ADS)

    Moses, David A.; Mesgarani, Nima; Leonard, Matthew K.; Chang, Edward F.

    2016-10-01

    Objective. The superior temporal gyrus (STG) and neighboring brain regions play a key role in human language processing. Previous studies have attempted to reconstruct speech information from brain activity in the STG, but few of them incorporate the probabilistic framework and engineering methodology used in modern speech recognition systems. In this work, we describe the initial efforts toward the design of a neural speech recognition (NSR) system that performs continuous phoneme recognition on English stimuli with arbitrary vocabulary sizes using the high gamma band power of local field potentials in the STG and neighboring cortical areas obtained via electrocorticography. Approach. The system implements a Viterbi decoder that incorporates phoneme likelihood estimates from a linear discriminant analysis model and transition probabilities from an n-gram phonemic language model. Grid searches were used in an attempt to determine optimal parameterizations of the feature vectors and Viterbi decoder. Main results. The performance of the system was significantly improved by using spatiotemporal representations of the neural activity (as opposed to purely spatial representations) and by including language modeling and Viterbi decoding in the NSR system. Significance. These results emphasize the importance of modeling the temporal dynamics of neural responses when analyzing their variations with respect to varying stimuli and demonstrate that speech recognition techniques can be successfully leveraged when decoding speech from neural signals. Guided by the results detailed in this work, further development of the NSR system could have applications in the fields of automatic speech recognition and neural prosthetics.

  13. Analysis of straw row in the image to control the trajectory of the agricultural combine harvester

    NASA Astrophysics Data System (ADS)

    Shkanaev, Aleksandr Yurievich; Polevoy, Dmitry Valerevich; Panchenko, Aleksei Vladimirovich; Krokhina, Darya Alekseevna; Nailevish, Sadekov Rinat

    2018-04-01

    The paper proposes a solution to the automatic operation of the combine harvester along the straw rows by means of the images from the camera, installed in the cab of the harvester. The U-Net is used to recognize straw rows in the image. The edges of the row are approximated in the segmented image by the curved lines and further converted into the harvester coordinate system for the automatic operating system. The "new" network architecture and approaches to the row approximation has improved the quality of the recognition task and the processing speed of the frames up to 96% and 7.5 fps, respectively. Keywords: Grain harvester,

  14. Maximum mutual information estimation of a simplified hidden MRF for offline handwritten Chinese character recognition

    NASA Astrophysics Data System (ADS)

    Xiong, Yan; Reichenbach, Stephen E.

    1999-01-01

    Understanding of hand-written Chinese characters is at such a primitive stage that models include some assumptions about hand-written Chinese characters that are simply false. So Maximum Likelihood Estimation (MLE) may not be an optimal method for hand-written Chinese characters recognition. This concern motivates the research effort to consider alternative criteria. Maximum Mutual Information Estimation (MMIE) is an alternative method for parameter estimation that does not derive its rationale from presumed model correctness, but instead examines the pattern-modeling problem in automatic recognition system from an information- theoretic point of view. The objective of MMIE is to find a set of parameters in such that the resultant model allows the system to derive from the observed data as much information as possible about the class. We consider MMIE for recognition of hand-written Chinese characters using on a simplified hidden Markov Random Field. MMIE provides improved performance improvement over MLE in this application.

  15. Ambient agents: embedded agents for remote control and monitoring using the PANGEA platform.

    PubMed

    Villarrubia, Gabriel; De Paz, Juan F; Bajo, Javier; Corchado, Juan M

    2014-07-31

    Ambient intelligence has advanced significantly during the last few years. The incorporation of image processing and artificial intelligence techniques have opened the possibility for such aspects as pattern recognition, thus allowing for a better adaptation of these systems. This study presents a new model of an embedded agent especially designed to be implemented in sensing devices with resource constraints. This new model of an agent is integrated within the PANGEA (Platform for the Automatic Construction of Organiztions of Intelligent Agents) platform, an organizational-based platform, defining a new sensor role in the system and aimed at providing contextual information and interacting with the environment. A case study was developed over the PANGEA platform and designed using different agents and sensors responsible for providing user support at home in the event of incidents or emergencies. The system presented in the case study incorporates agents in Arduino hardware devices with recognition modules and illuminated bands; it also incorporates IP cameras programmed for automatic tracking, which can connect remotely in the event of emergencies. The user wears a bracelet, which contains a simple vibration sensor that can receive notifications about the emergency situation.

  16. Ambient Agents: Embedded Agents for Remote Control and Monitoring Using the PANGEA Platform

    PubMed Central

    Villarrubia, Gabriel; De Paz, Juan F.; Bajo, Javier; Corchado, Juan M.

    2014-01-01

    Ambient intelligence has advanced significantly during the last few years. The incorporation of image processing and artificial intelligence techniques have opened the possibility for such aspects as pattern recognition, thus allowing for a better adaptation of these systems. This study presents a new model of an embedded agent especially designed to be implemented in sensing devices with resource constraints. This new model of an agent is integrated within the PANGEA (Platform for the Automatic Construction of Organiztions of Intelligent Agents) platform, an organizational-based platform, defining a new sensor role in the system and aimed at providing contextual information and interacting with the environment. A case study was developed over the PANGEA platform and designed using different agents and sensors responsible for providing user support at home in the event of incidents or emergencies. The system presented in the case study incorporates agents in Arduino hardware devices with recognition modules and illuminated bands; it also incorporates IP cameras programmed for automatic tracking, which can connect remotely in the event of emergencies. The user wears a bracelet, which contains a simple vibration sensor that can receive notifications about the emergency situation. PMID:25090416

  17. Automatic Recognition of Phonemes Using a Syntactic Processor for Error Correction.

    DTIC Science & Technology

    1980-12-01

    OF PHONEMES USING A SYNTACTIC PROCESSOR FOR ERROR CORRECTION THESIS AFIT/GE/EE/8D-45 Robert B. ’Taylor 2Lt USAF Approved for public release...distribution unlimilted. AbP AFIT/GE/EE/ 80D-45 AUTOMATIC RECOGNITION OF PHONEMES USING A SYNTACTIC PROCESSOR FOR ERROR CORRECTION THESIS Presented to the...Testing ..................... 37 Bayes Decision Rule for Minimum Error ........... 37 Bayes Decision Rule for Minimum Risk ............ 39 Mini Max Test

  18. Assessment of Severe Apnoea through Voice Analysis, Automatic Speech, and Speaker Recognition Techniques

    NASA Astrophysics Data System (ADS)

    Fernández Pozo, Rubén; Blanco Murillo, Jose Luis; Hernández Gómez, Luis; López Gonzalo, Eduardo; Alcázar Ramírez, José; Toledano, Doroteo T.

    2009-12-01

    This study is part of an ongoing collaborative effort between the medical and the signal processing communities to promote research on applying standard Automatic Speech Recognition (ASR) techniques for the automatic diagnosis of patients with severe obstructive sleep apnoea (OSA). Early detection of severe apnoea cases is important so that patients can receive early treatment. Effective ASR-based detection could dramatically cut medical testing time. Working with a carefully designed speech database of healthy and apnoea subjects, we describe an acoustic search for distinctive apnoea voice characteristics. We also study abnormal nasalization in OSA patients by modelling vowels in nasal and nonnasal phonetic contexts using Gaussian Mixture Model (GMM) pattern recognition on speech spectra. Finally, we present experimental findings regarding the discriminative power of GMMs applied to severe apnoea detection. We have achieved an 81% correct classification rate, which is very promising and underpins the interest in this line of inquiry.

  19. Building intelligent communication systems for handicapped aphasiacs.

    PubMed

    Fu, Yu-Fen; Ho, Cheng-Seen

    2010-01-01

    This paper presents an intelligent system allowing handicapped aphasiacs to perform basic communication tasks. It has the following three key features: (1) A 6-sensor data glove measures the finger gestures of a patient in terms of the bending degrees of his fingers. (2) A finger language recognition subsystem recognizes language components from the finger gestures. It employs multiple regression analysis to automatically extract proper finger features so that the recognition model can be fast and correctly constructed by a radial basis function neural network. (3) A coordinate-indexed virtual keyboard allows the users to directly access the letters on the keyboard at a practical speed. The system serves as a viable tool for natural and affordable communication for handicapped aphasiacs through continuous finger language input.

  20. Comparison of eye imaging pattern recognition using neural network

    NASA Astrophysics Data System (ADS)

    Bukhari, W. M.; Syed A., M.; Nasir, M. N. M.; Sulaima, M. F.; Yahaya, M. S.

    2015-05-01

    The beauty of eye recognition system that it is used in automatic identifying and verifies a human weather from digital images or video source. There are various behaviors of the eye such as the color of the iris, size of pupil and shape of the eye. This study represents the analysis, design and implementation of a system for recognition of eye imaging. All the eye images that had been captured from the webcam in RGB format must through several techniques before it can be input for the pattern and recognition processes. The result shows that the final value of weight and bias after complete training 6 eye images for one subject is memorized by the neural network system and be the reference value of the weight and bias for the testing part. The target classifies to 5 different types for 5 subjects. The eye images can recognize the subject based on the target that had been set earlier during the training process. When the values between new eye image and the eye image in the database are almost equal, it is considered the eye image is matched.

  1. Human detection and motion analysis at security points

    NASA Astrophysics Data System (ADS)

    Ozer, I. Burak; Lv, Tiehan; Wolf, Wayne H.

    2003-08-01

    This paper presents a real-time video surveillance system for the recognition of specific human activities. Specifically, the proposed automatic motion analysis is used as an on-line alarm system to detect abnormal situations in a campus environment. A smart multi-camera system developed at Princeton University is extended for use in smart environments in which the camera detects the presence of multiple persons as well as their gestures and their interaction in real-time.

  2. Statistical Evaluation of Biometric Evidence in Forensic Automatic Speaker Recognition

    NASA Astrophysics Data System (ADS)

    Drygajlo, Andrzej

    Forensic speaker recognition is the process of determining if a specific individual (suspected speaker) is the source of a questioned voice recording (trace). This paper aims at presenting forensic automatic speaker recognition (FASR) methods that provide a coherent way of quantifying and presenting recorded voice as biometric evidence. In such methods, the biometric evidence consists of the quantified degree of similarity between speaker-dependent features extracted from the trace and speaker-dependent features extracted from recorded speech of a suspect. The interpretation of recorded voice as evidence in the forensic context presents particular challenges, including within-speaker (within-source) variability and between-speakers (between-sources) variability. Consequently, FASR methods must provide a statistical evaluation which gives the court an indication of the strength of the evidence given the estimated within-source and between-sources variabilities. This paper reports on the first ENFSI evaluation campaign through a fake case, organized by the Netherlands Forensic Institute (NFI), as an example, where an automatic method using the Gaussian mixture models (GMMs) and the Bayesian interpretation (BI) framework were implemented for the forensic speaker recognition task.

  3. Feature extraction for face recognition via Active Shape Model (ASM) and Active Appearance Model (AAM)

    NASA Astrophysics Data System (ADS)

    Iqtait, M.; Mohamad, F. S.; Mamat, M.

    2018-03-01

    Biometric is a pattern recognition system which is used for automatic recognition of persons based on characteristics and features of an individual. Face recognition with high recognition rate is still a challenging task and usually accomplished in three phases consisting of face detection, feature extraction, and expression classification. Precise and strong location of trait point is a complicated and difficult issue in face recognition. Cootes proposed a Multi Resolution Active Shape Models (ASM) algorithm, which could extract specified shape accurately and efficiently. Furthermore, as the improvement of ASM, Active Appearance Models algorithm (AAM) is proposed to extracts both shape and texture of specified object simultaneously. In this paper we give more details about the two algorithms and give the results of experiments, testing their performance on one dataset of faces. We found that the ASM is faster and gains more accurate trait point location than the AAM, but the AAM gains a better match to the texture.

  4. Automated aural classification used for inter-species discrimination of cetaceans.

    PubMed

    Binder, Carolyn M; Hines, Paul C

    2014-04-01

    Passive acoustic methods are in widespread use to detect and classify cetacean species; however, passive acoustic systems often suffer from large false detection rates resulting from numerous transient sources. To reduce the acoustic analyst workload, automatic recognition methods may be implemented in a two-stage process. First, a general automatic detector is implemented that produces many detections to ensure cetacean presence is noted. Then an automatic classifier is used to significantly reduce the number of false detections and classify the cetacean species. This process requires development of a robust classifier capable of performing inter-species classification. Because human analysts can aurally discriminate species, an automated aural classifier that uses perceptual signal features was tested on a cetacean data set. The classifier successfully discriminated between four species of cetaceans-bowhead, humpback, North Atlantic right, and sperm whales-with 85% accuracy. It also performed well (100% accuracy) for discriminating sperm whale clicks from right whale gunshots. An accuracy of 92% and area under the receiver operating characteristic curve of 0.97 were obtained for the relatively challenging bowhead and humpback recognition case. These results demonstrated that the perceptual features employed by the aural classifier provided powerful discrimination cues for inter-species classification of cetaceans.

  5. Body-wide anatomy recognition in PET/CT images

    NASA Astrophysics Data System (ADS)

    Wang, Huiqian; Udupa, Jayaram K.; Odhner, Dewey; Tong, Yubing; Zhao, Liming; Torigian, Drew A.

    2015-03-01

    With the rapid growth of positron emission tomography/computed tomography (PET/CT)-based medical applications, body-wide anatomy recognition on whole-body PET/CT images becomes crucial for quantifying body-wide disease burden. This, however, is a challenging problem and seldom studied due to unclear anatomy reference frame and low spatial resolution of PET images as well as low contrast and spatial resolution of the associated low-dose CT images. We previously developed an automatic anatomy recognition (AAR) system [15] whose applicability was demonstrated on diagnostic computed tomography (CT) and magnetic resonance (MR) images in different body regions on 35 objects. The aim of the present work is to investigate strategies for adapting the previous AAR system to low-dose CT and PET images toward automated body-wide disease quantification. Our adaptation of the previous AAR methodology to PET/CT images in this paper focuses on 16 objects in three body regions - thorax, abdomen, and pelvis - and consists of the following steps: collecting whole-body PET/CT images from existing patient image databases, delineating all objects in these images, modifying the previous hierarchical models built from diagnostic CT images to account for differences in appearance in low-dose CT and PET images, automatically locating objects in these images following object hierarchy, and evaluating performance. Our preliminary evaluations indicate that the performance of the AAR approach on low-dose CT images achieves object localization accuracy within about 2 voxels, which is comparable to the accuracies achieved on diagnostic contrast-enhanced CT images. Object recognition on low-dose CT images from PET/CT examinations without requiring diagnostic contrast-enhanced CT seems feasible.

  6. Automated alignment system for optical wireless communication systems using image recognition.

    PubMed

    Brandl, Paul; Weiss, Alexander; Zimmermann, Horst

    2014-07-01

    In this Letter, we describe the realization of a tracked line-of-sight optical wireless communication system for indoor data distribution. We built a laser-based transmitter with adaptive focus and ray steering by a microelectromechanical systems mirror. To execute the alignment procedure, we used a CMOS image sensor at the transmitter side and developed an algorithm for image recognition to localize the receiver's position. The receiver is based on a self-developed optoelectronic integrated chip with low requirements on the receiver optics to make the system economically attractive. With this system, we were able to set up the communication link automatically without any back channel and to perform error-free (bit error rate <10⁻⁹) data transmission over a distance of 3.5 m with a data rate of 3 Gbit/s.

  7. Morphological self-organizing feature map neural network with applications to automatic target recognition

    NASA Astrophysics Data System (ADS)

    Zhang, Shijun; Jing, Zhongliang; Li, Jianxun

    2005-01-01

    The rotation invariant feature of the target is obtained using the multi-direction feature extraction property of the steerable filter. Combining the morphological operation top-hat transform with the self-organizing feature map neural network, the adaptive topological region is selected. Using the erosion operation, the topological region shrinkage is achieved. The steerable filter based morphological self-organizing feature map neural network is applied to automatic target recognition of binary standard patterns and real-world infrared sequence images. Compared with Hamming network and morphological shared-weight networks respectively, the higher recognition correct rate, robust adaptability, quick training, and better generalization of the proposed method are achieved.

  8. Cherry recognition in natural environment based on the vision of picking robot

    NASA Astrophysics Data System (ADS)

    Zhang, Qirong; Chen, Shanxiong; Yu, Tingzhong; Wang, Yan

    2017-04-01

    In order to realize the automatic recognition of cherry in the natural environment, this paper designed a robot vision system recognition method. The first step of this method is to pre-process the cherry image by median filtering. The second step is to identify the colour of the cherry through the 0.9R-G colour difference formula, and then use the Otsu algorithm for threshold segmentation. The third step is to remove noise by using the area threshold. The fourth step is to remove the holes in the cherry image by morphological closed and open operation. The fifth step is to obtain the centroid and contour of cherry by using the smallest external rectangular and the Hough transform. Through this recognition process, we can successfully identify 96% of the cherry without blocking and adhesion.

  9. Recognition of Emotions in Mexican Spanish Speech: An Approach Based on Acoustic Modelling of Emotion-Specific Vowels

    PubMed Central

    Caballero-Morales, Santiago-Omar

    2013-01-01

    An approach for the recognition of emotions in speech is presented. The target language is Mexican Spanish, and for this purpose a speech database was created. The approach consists in the phoneme acoustic modelling of emotion-specific vowels. For this, a standard phoneme-based Automatic Speech Recognition (ASR) system was built with Hidden Markov Models (HMMs), where different phoneme HMMs were built for the consonants and emotion-specific vowels associated with four emotional states (anger, happiness, neutral, sadness). Then, estimation of the emotional state from a spoken sentence is performed by counting the number of emotion-specific vowels found in the ASR's output for the sentence. With this approach, accuracy of 87–100% was achieved for the recognition of emotional state of Mexican Spanish speech. PMID:23935410

  10. Visual object recognition for automatic micropropagation of plants

    NASA Astrophysics Data System (ADS)

    Brendel, Thorsten; Schwanke, Joerg; Jensch, Peter F.

    1994-11-01

    Micropropagation of plants is done by cutting juvenile plants and placing them into special container-boxes with nutrient-solution where the pieces can grow up and be cut again several times. To produce high amounts of biomass it is necessary to do plant micropropagation by a robotic system. In this paper we describe parts of the vision system that recognizes plants and their particular cutting points. Therefore, it is necessary to extract elements of the plants and relations between these elements (for example root, stem, leaf). Different species vary in their morphological appearance, variation is also immanent in plants of the same species. Therefore, we introduce several morphological classes of plants from that we expect same recognition methods.

  11. Invariant-feature-based adaptive automatic target recognition in obscured 3D point clouds

    NASA Astrophysics Data System (ADS)

    Khuon, Timothy; Kershner, Charles; Mattei, Enrico; Alverio, Arnel; Rand, Robert

    2014-06-01

    Target recognition and classification in a 3D point cloud is a non-trivial process due to the nature of the data collected from a sensor system. The signal can be corrupted by noise from the environment, electronic system, A/D converter, etc. Therefore, an adaptive system with a desired tolerance is required to perform classification and recognition optimally. The feature-based pattern recognition algorithm architecture as described below is particularly devised for solving a single-sensor classification non-parametrically. Feature set is extracted from an input point cloud, normalized, and classifier a neural network classifier. For instance, automatic target recognition in an urban area would require different feature sets from one in a dense foliage area. The figure above (see manuscript) illustrates the architecture of the feature based adaptive signature extraction of 3D point cloud including LIDAR, RADAR, and electro-optical data. This network takes a 3D cluster and classifies it into a specific class. The algorithm is a supervised and adaptive classifier with two modes: the training mode and the performing mode. For the training mode, a number of novel patterns are selected from actual or artificial data. A particular 3D cluster is input to the network as shown above for the decision class output. The network consists of three sequential functional modules. The first module is for feature extraction that extracts the input cluster into a set of singular value features or feature vector. Then the feature vector is input into the feature normalization module to normalize and balance it before being fed to the neural net classifier for the classification. The neural net can be trained by actual or artificial novel data until each trained output reaches the declared output within the defined tolerance. In case new novel data is added after the neural net has been learned, the training is then resumed until the neural net has incrementally learned with the new novel data. The associative memory capability of the neural net enables the incremental learning. The back propagation algorithm or support vector machine can be utilized for the classification and recognition.

  12. Automatic speech recognition (ASR) based approach for speech therapy of aphasic patients: A review

    NASA Astrophysics Data System (ADS)

    Jamal, Norezmi; Shanta, Shahnoor; Mahmud, Farhanahani; Sha'abani, MNAH

    2017-09-01

    This paper reviews the state-of-the-art an automatic speech recognition (ASR) based approach for speech therapy of aphasic patients. Aphasia is a condition in which the affected person suffers from speech and language disorder resulting from a stroke or brain injury. Since there is a growing body of evidence indicating the possibility of improving the symptoms at an early stage, ASR based solutions are increasingly being researched for speech and language therapy. ASR is a technology that transfers human speech into transcript text by matching with the system's library. This is particularly useful in speech rehabilitation therapies as they provide accurate, real-time evaluation for speech input from an individual with speech disorder. ASR based approaches for speech therapy recognize the speech input from the aphasic patient and provide real-time feedback response to their mistakes. However, the accuracy of ASR is dependent on many factors such as, phoneme recognition, speech continuity, speaker and environmental differences as well as our depth of knowledge on human language understanding. Hence, the review examines recent development of ASR technologies and its performance for individuals with speech and language disorders.

  13. Exploiting range imagery: techniques and applications

    NASA Astrophysics Data System (ADS)

    Armbruster, Walter

    2009-07-01

    Practically no applications exist for which automatic processing of 2D intensity imagery can equal human visual perception. This is not the case for range imagery. The paper gives examples of 3D laser radar applications, for which automatic data processing can exceed human visual cognition capabilities and describes basic processing techniques for attaining these results. The examples are drawn from the fields of helicopter obstacle avoidance, object detection in surveillance applications, object recognition at high range, multi-object-tracking, and object re-identification in range image sequences. Processing times and recognition performances are summarized. The techniques used exploit the bijective continuity of the imaging process as well as its independence of object reflectivity, emissivity and illumination. This allows precise formulations of the probability distributions involved in figure-ground segmentation, feature-based object classification and model based object recognition. The probabilistic approach guarantees optimal solutions for single images and enables Bayesian learning in range image sequences. Finally, due to recent results in 3D-surface completion, no prior model libraries are required for recognizing and re-identifying objects of quite general object categories, opening the way to unsupervised learning and fully autonomous cognitive systems.

  14. Understanding Cognitive Development: Automaticity and the Early Years Child

    ERIC Educational Resources Information Center

    Gray, Colette

    2004-01-01

    In recent years a growing body of evidence has implicated deficits in the automaticity of fundamental facts such as word and number recognition in a range of disorders: including attention deficit hyperactivity disorder, dyslexia, apraxia and autism. Variously described as habits, fluency, chunking and over learning, automatic processes are best…

  15. Noise-robust speech recognition through auditory feature detection and spike sequence decoding.

    PubMed

    Schafer, Phillip B; Jin, Dezhe Z

    2014-03-01

    Speech recognition in noisy conditions is a major challenge for computer systems, but the human brain performs it routinely and accurately. Automatic speech recognition (ASR) systems that are inspired by neuroscience can potentially bridge the performance gap between humans and machines. We present a system for noise-robust isolated word recognition that works by decoding sequences of spikes from a population of simulated auditory feature-detecting neurons. Each neuron is trained to respond selectively to a brief spectrotemporal pattern, or feature, drawn from the simulated auditory nerve response to speech. The neural population conveys the time-dependent structure of a sound by its sequence of spikes. We compare two methods for decoding the spike sequences--one using a hidden Markov model-based recognizer, the other using a novel template-based recognition scheme. In the latter case, words are recognized by comparing their spike sequences to template sequences obtained from clean training data, using a similarity measure based on the length of the longest common sub-sequence. Using isolated spoken digits from the AURORA-2 database, we show that our combined system outperforms a state-of-the-art robust speech recognizer at low signal-to-noise ratios. Both the spike-based encoding scheme and the template-based decoding offer gains in noise robustness over traditional speech recognition methods. Our system highlights potential advantages of spike-based acoustic coding and provides a biologically motivated framework for robust ASR development.

  16. Definition and automatic anatomy recognition of lymph node zones in the pelvis on CT images

    NASA Astrophysics Data System (ADS)

    Liu, Yu; Udupa, Jayaram K.; Odhner, Dewey; Tong, Yubing; Guo, Shuxu; Attor, Rosemary; Reinicke, Danica; Torigian, Drew A.

    2016-03-01

    Currently, unlike IALSC-defined thoracic lymph node zones, no explicitly provided definitions for lymph nodes in other body regions are available. Yet, definitions are critical for standardizing the recognition, delineation, quantification, and reporting of lymphadenopathy in other body regions. Continuing from our previous work in the thorax, this paper proposes a standardized definition of the grouping of pelvic lymph nodes into 10 zones. We subsequently employ our earlier Automatic Anatomy Recognition (AAR) framework designed for body-wide organ modeling, recognition, and delineation to actually implement these zonal definitions where the zones are treated as anatomic objects. First, all 10 zones and key anatomic organs used as anchors are manually delineated under expert supervision for constructing fuzzy anatomy models of the assembly of organs together with the zones. Then, optimal hierarchical arrangement of these objects is constructed for the purpose of achieving the best zonal recognition. For actual localization of the objects, two strategies are used -- optimal thresholded search for organs and one-shot method for the zones where the known relationship of the zones to key organs is exploited. Based on 50 computed tomography (CT) image data sets for the pelvic body region and an equal division into training and test subsets, automatic zonal localization within 1-3 voxels is achieved.

  17. Health smart home for elders - a tool for automatic recognition of activities of daily living.

    PubMed

    Le, Xuan Hoa Binh; Di Mascolo, Maria; Gouin, Alexia; Noury, Norbert

    2008-01-01

    Elders live preferently in their own home, but with aging comes the loss of autonomy and associated risks. In order to help them live longer in safe conditions, we need a tool to automatically detect their loss of autonomy by assessing the degree of performance of activities of daily living. This article presents an approach enabling the activities recognition of an elder living alone in a home equipped with noninvasive sensors.

  18. Analysis of Factors Affecting System Performance in the ASpIRE Challenge

    DTIC Science & Technology

    2015-12-13

    performance in the ASpIRE (Automatic Speech recognition In Reverberant Environments) challenge. In particular, overall word error rate (WER) of the solver...systems is analyzed as a function of room, distance between talker and microphone, and microphone type. We also analyze speech activity detection...analysis will inform the design of future challenges and provide insight into the efficacy of current solutions addressing noisy reverberant speech

  19. Image processing tool for automatic feature recognition and quantification

    DOEpatents

    Chen, Xing; Stoddard, Ryan J.

    2017-05-02

    A system for defining structures within an image is described. The system includes reading of an input file, preprocessing the input file while preserving metadata such as scale information and then detecting features of the input file. In one version the detection first uses an edge detector followed by identification of features using a Hough transform. The output of the process is identified elements within the image.

  20. A Cross-Lingual Mobile Medical Communication System Prototype for Foreigners and Subjects with Speech, Hearing, and Mental Disabilities Based on Pictograms

    PubMed Central

    Wołk, Agnieszka; Glinkowski, Wojciech

    2017-01-01

    People with speech, hearing, or mental impairment require special communication assistance, especially for medical purposes. Automatic solutions for speech recognition and voice synthesis from text are poor fits for communication in the medical domain because they are dependent on error-prone statistical models. Systems dependent on manual text input are insufficient. Recently introduced systems for automatic sign language recognition are dependent on statistical models as well as on image and gesture quality. Such systems remain in early development and are based mostly on minimal hand gestures unsuitable for medical purposes. Furthermore, solutions that rely on the Internet cannot be used after disasters that require humanitarian aid. We propose a high-speed, intuitive, Internet-free, voice-free, and text-free tool suited for emergency medical communication. Our solution is a pictogram-based application that provides easy communication for individuals who have speech or hearing impairment or mental health issues that impair communication, as well as foreigners who do not speak the local language. It provides support and clarification in communication by using intuitive icons and interactive symbols that are easy to use on a mobile device. Such pictogram-based communication can be quite effective and ultimately make people's lives happier, easier, and safer. PMID:29230254

  1. A Cross-Lingual Mobile Medical Communication System Prototype for Foreigners and Subjects with Speech, Hearing, and Mental Disabilities Based on Pictograms.

    PubMed

    Wołk, Krzysztof; Wołk, Agnieszka; Glinkowski, Wojciech

    2017-01-01

    People with speech, hearing, or mental impairment require special communication assistance, especially for medical purposes. Automatic solutions for speech recognition and voice synthesis from text are poor fits for communication in the medical domain because they are dependent on error-prone statistical models. Systems dependent on manual text input are insufficient. Recently introduced systems for automatic sign language recognition are dependent on statistical models as well as on image and gesture quality. Such systems remain in early development and are based mostly on minimal hand gestures unsuitable for medical purposes. Furthermore, solutions that rely on the Internet cannot be used after disasters that require humanitarian aid. We propose a high-speed, intuitive, Internet-free, voice-free, and text-free tool suited for emergency medical communication. Our solution is a pictogram-based application that provides easy communication for individuals who have speech or hearing impairment or mental health issues that impair communication, as well as foreigners who do not speak the local language. It provides support and clarification in communication by using intuitive icons and interactive symbols that are easy to use on a mobile device. Such pictogram-based communication can be quite effective and ultimately make people's lives happier, easier, and safer.

  2. High-accuracy and robust face recognition system based on optical parallel correlator using a temporal image sequence

    NASA Astrophysics Data System (ADS)

    Watanabe, Eriko; Ishikawa, Mami; Ohta, Maiko; Kodate, Kashiko

    2005-09-01

    Face recognition is used in a wide range of security systems, such as monitoring credit card use, searching for individuals with street cameras via Internet and maintaining immigration control. There are still many technical subjects under study. For instance, the number of images that can be stored is limited under the current system, and the rate of recognition must be improved to account for photo shots taken at different angles under various conditions. We implemented a fully automatic Fast Face Recognition Optical Correlator (FARCO) system by using a 1000 frame/s optical parallel correlator designed and assembled by us. Operational speed for the 1: N (i.e. matching a pair of images among N, where N refers to the number of images in the database) identification experiment (4000 face images) amounts to less than 1.5 seconds, including the pre/post processing. From trial 1: N identification experiments using FARCO, we acquired low error rates of 2.6% False Reject Rate and 1.3% False Accept Rate. By making the most of the high-speed data-processing capability of this system, much more robustness can be achieved for various recognition conditions when large-category data are registered for a single person. We propose a face recognition algorithm for the FARCO while employing a temporal image sequence of moving images. Applying this algorithm to a natural posture, a two times higher recognition rate scored compared with our conventional system. The system has high potential for future use in a variety of purposes such as search for criminal suspects by use of street and airport video cameras, registration of babies at hospitals or handling of an immeasurable number of images in a database.

  3. Syntax-directed content analysis of videotext: application to a map detection recognition system

    NASA Astrophysics Data System (ADS)

    Aradhye, Hrishikesh; Herson, James A.; Myers, Gregory

    2003-01-01

    Video is an increasingly important and ever-growing source of information to the intelligence and homeland defense analyst. A capability to automatically identify the contents of video imagery would enable the analyst to index relevant foreign and domestic news videos in a convenient and meaningful way. To this end, the proposed system aims to help determine the geographic focus of a news story directly from video imagery by detecting and geographically localizing political maps from news broadcasts, using the results of videotext recognition in lieu of a computationally expensive, scale-independent shape recognizer. Our novel method for the geographic localization of a map is based on the premise that the relative placement of text superimposed on a map roughly corresponds to the geographic coordinates of the locations the text represents. Our scheme extracts and recognizes videotext, and iteratively identifies the geographic area, while allowing for OCR errors and artistic freedom. The fast and reliable recognition of such maps by our system may provide valuable context and supporting evidence for other sources, such as speech recognition transcripts. The concepts of syntax-directed content analysis of videotext presented here can be extended to other content analysis systems.

  4. A novel probabilistic framework for event-based speech recognition

    NASA Astrophysics Data System (ADS)

    Juneja, Amit; Espy-Wilson, Carol

    2003-10-01

    One of the reasons for unsatisfactory performance of the state-of-the-art automatic speech recognition (ASR) systems is the inferior acoustic modeling of low-level acoustic-phonetic information in the speech signal. An acoustic-phonetic approach to ASR, on the other hand, explicitly targets linguistic information in the speech signal, but such a system for continuous speech recognition (CSR) is not known to exist. A probabilistic and statistical framework for CSR based on the idea of the representation of speech sounds by bundles of binary valued articulatory phonetic features is proposed. Multiple probabilistic sequences of linguistically motivated landmarks are obtained using binary classifiers of manner phonetic features-syllabic, sonorant and continuant-and the knowledge-based acoustic parameters (APs) that are acoustic correlates of those features. The landmarks are then used for the extraction of knowledge-based APs for source and place phonetic features and their binary classification. Probabilistic landmark sequences are constrained using manner class language models for isolated or connected word recognition. The proposed method could overcome the disadvantages encountered by the early acoustic-phonetic knowledge-based systems that led the ASR community to switch to systems highly dependent on statistical pattern analysis methods and probabilistic language or grammar models.

  5. Behavioral features recognition and oestrus detection based on fast approximate clustering algorithm in dairy cows

    NASA Astrophysics Data System (ADS)

    Tian, Fuyang; Cao, Dong; Dong, Xiaoning; Zhao, Xinqiang; Li, Fade; Wang, Zhonghua

    2017-06-01

    Behavioral features recognition was an important effect to detect oestrus and sickness in dairy herds and there is a need for heat detection aid. The detection method was based on the measure of the individual behavioural activity, standing time, and temperature of dairy using vibrational sensor and temperature sensor in this paper. The data of behavioural activity index, standing time, lying time and walking time were sent to computer by lower power consumption wireless communication system. The fast approximate K-means algorithm (FAKM) was proposed to deal the data of the sensor for behavioral features recognition. As a result of technical progress in monitoring cows using computers, automatic oestrus detection has become possible.

  6. Effective Prediction of Errors by Non-native Speakers Using Decision Tree for Speech Recognition-Based CALL System

    NASA Astrophysics Data System (ADS)

    Wang, Hongcui; Kawahara, Tatsuya

    CALL (Computer Assisted Language Learning) systems using ASR (Automatic Speech Recognition) for second language learning have received increasing interest recently. However, it still remains a challenge to achieve high speech recognition performance, including accurate detection of erroneous utterances by non-native speakers. Conventionally, possible error patterns, based on linguistic knowledge, are added to the lexicon and language model, or the ASR grammar network. However, this approach easily falls in the trade-off of coverage of errors and the increase of perplexity. To solve the problem, we propose a method based on a decision tree to learn effective prediction of errors made by non-native speakers. An experimental evaluation with a number of foreign students learning Japanese shows that the proposed method can effectively generate an ASR grammar network, given a target sentence, to achieve both better coverage of errors and smaller perplexity, resulting in significant improvement in ASR accuracy.

  7. Autoregressive statistical pattern recognition algorithms for damage detection in civil structures

    NASA Astrophysics Data System (ADS)

    Yao, Ruigen; Pakzad, Shamim N.

    2012-08-01

    Statistical pattern recognition has recently emerged as a promising set of complementary methods to system identification for automatic structural damage assessment. Its essence is to use well-known concepts in statistics for boundary definition of different pattern classes, such as those for damaged and undamaged structures. In this paper, several statistical pattern recognition algorithms using autoregressive models, including statistical control charts and hypothesis testing, are reviewed as potentially competitive damage detection techniques. To enhance the performance of statistical methods, new feature extraction techniques using model spectra and residual autocorrelation, together with resampling-based threshold construction methods, are proposed. Subsequently, simulated acceleration data from a multi degree-of-freedom system is generated to test and compare the efficiency of the existing and proposed algorithms. Data from laboratory experiments conducted on a truss and a large-scale bridge slab model are then used to further validate the damage detection methods and demonstrate the superior performance of proposed algorithms.

  8. A speech-controlled environmental control system for people with severe dysarthria.

    PubMed

    Hawley, Mark S; Enderby, Pam; Green, Phil; Cunningham, Stuart; Brownsell, Simon; Carmichael, James; Parker, Mark; Hatzis, Athanassios; O'Neill, Peter; Palmer, Rebecca

    2007-06-01

    Automatic speech recognition (ASR) can provide a rapid means of controlling electronic assistive technology. Off-the-shelf ASR systems function poorly for users with severe dysarthria because of the increased variability of their articulations. We have developed a limited vocabulary speaker dependent speech recognition application which has greater tolerance to variability of speech, coupled with a computerised training package which assists dysarthric speakers to improve the consistency of their vocalisations and provides more data for recogniser training. These applications, and their implementation as the interface for a speech-controlled environmental control system (ECS), are described. The results of field trials to evaluate the training program and the speech-controlled ECS are presented. The user-training phase increased the recognition rate from 88.5% to 95.4% (p<0.001). Recognition rates were good for people with even the most severe dysarthria in everyday usage in the home (mean word recognition rate 86.9%). Speech-controlled ECS were less accurate (mean task completion accuracy 78.6% versus 94.8%) but were faster to use than switch-scanning systems, even taking into account the need to repeat unsuccessful operations (mean task completion time 7.7s versus 16.9s, p<0.001). It is concluded that a speech-controlled ECS is a viable alternative to switch-scanning systems for some people with severe dysarthria and would lead, in many cases, to more efficient control of the home.

  9. Facial Emotion Recognition: A Survey and Real-World User Experiences in Mixed Reality

    PubMed Central

    Mehta, Dhwani; Siddiqui, Mohammad Faridul Haque

    2018-01-01

    Extensive possibilities of applications have made emotion recognition ineluctable and challenging in the field of computer science. The use of non-verbal cues such as gestures, body movement, and facial expressions convey the feeling and the feedback to the user. This discipline of Human–Computer Interaction places reliance on the algorithmic robustness and the sensitivity of the sensor to ameliorate the recognition. Sensors play a significant role in accurate detection by providing a very high-quality input, hence increasing the efficiency and the reliability of the system. Automatic recognition of human emotions would help in teaching social intelligence in the machines. This paper presents a brief study of the various approaches and the techniques of emotion recognition. The survey covers a succinct review of the databases that are considered as data sets for algorithms detecting the emotions by facial expressions. Later, mixed reality device Microsoft HoloLens (MHL) is introduced for observing emotion recognition in Augmented Reality (AR). A brief introduction of its sensors, their application in emotion recognition and some preliminary results of emotion recognition using MHL are presented. The paper then concludes by comparing results of emotion recognition by the MHL and a regular webcam. PMID:29389845

  10. Facial Emotion Recognition: A Survey and Real-World User Experiences in Mixed Reality.

    PubMed

    Mehta, Dhwani; Siddiqui, Mohammad Faridul Haque; Javaid, Ahmad Y

    2018-02-01

    Extensive possibilities of applications have made emotion recognition ineluctable and challenging in the field of computer science. The use of non-verbal cues such as gestures, body movement, and facial expressions convey the feeling and the feedback to the user. This discipline of Human-Computer Interaction places reliance on the algorithmic robustness and the sensitivity of the sensor to ameliorate the recognition. Sensors play a significant role in accurate detection by providing a very high-quality input, hence increasing the efficiency and the reliability of the system. Automatic recognition of human emotions would help in teaching social intelligence in the machines. This paper presents a brief study of the various approaches and the techniques of emotion recognition. The survey covers a succinct review of the databases that are considered as data sets for algorithms detecting the emotions by facial expressions. Later, mixed reality device Microsoft HoloLens (MHL) is introduced for observing emotion recognition in Augmented Reality (AR). A brief introduction of its sensors, their application in emotion recognition and some preliminary results of emotion recognition using MHL are presented. The paper then concludes by comparing results of emotion recognition by the MHL and a regular webcam.

  11. Research into the Architecture of CAD Based Robot Vision Systems

    DTIC Science & Technology

    1988-02-09

    Vision 󈨚 and "Automatic Generation of Recognition Features for Com- puter Vision," Mudge, Turney and Volz, published in Robotica (1987). All of the...Occluded Parts," (T.N. Mudge, J.L. Turney, and R.A. Volz), Robotica , vol. 5, 1987, pp. 117-127. 5. "Vision Algorithms for Hypercube Machines," (T.N. Mudge

  12. Polyphonic Music Information Retrieval Based on Multi-Label Cascade Classification System

    ERIC Educational Resources Information Center

    Jiang, Wenxin

    2009-01-01

    Recognition and separation of sounds played by various instruments is very useful in labeling audio files with semantic information. This is a non-trivial task requiring sound analysis, but the results can aid automatic indexing and browsing music data when searching for melodies played by user specified instruments. Melody match based on pitch…

  13. A distributed automatic target recognition system using multiple low resolution sensors

    NASA Astrophysics Data System (ADS)

    Yue, Zhanfeng; Lakshmi Narasimha, Pramod; Topiwala, Pankaj

    2008-04-01

    In this paper, we propose a multi-agent system which uses swarming techniques to perform high accuracy Automatic Target Recognition (ATR) in a distributed manner. The proposed system can co-operatively share the information from low-resolution images of different looks and use this information to perform high accuracy ATR. An advanced, multiple-agent Unmanned Aerial Vehicle (UAV) systems-based approach is proposed which integrates the processing capabilities, combines detection reporting with live video exchange, and swarm behavior modalities that dramatically surpass individual sensor system performance levels. We employ real-time block-based motion analysis and compensation scheme for efficient estimation and correction of camera jitter, global motion of the camera/scene and the effects of atmospheric turbulence. Our optimized Partition Weighted Sum (PWS) approach requires only bitshifts and additions, yet achieves a stunning 16X pixel resolution enhancement, which is moreover parallizable. We develop advanced, adaptive particle-filtering based algorithms to robustly track multiple mobile targets by adaptively changing the appearance model of the selected targets. The collaborative ATR system utilizes the homographies between the sensors induced by the ground plane to overlap the local observation with the received images from other UAVs. The motion of the UAVs distorts estimated homography frame to frame. A robust dynamic homography estimation algorithm is proposed to address this, by using the homography decomposition and the ground plane surface estimation.

  14. 3D automatic anatomy segmentation based on iterative graph-cut-ASM.

    PubMed

    Chen, Xinjian; Bagci, Ulas

    2011-08-01

    This paper studies the feasibility of developing an automatic anatomy segmentation (AAS) system in clinical radiology and demonstrates its operation on clinical 3D images. The AAS system, the authors are developing consists of two main parts: object recognition and object delineation. As for recognition, a hierarchical 3D scale-based multiobject method is used for the multiobject recognition task, which incorporates intensity weighted ball-scale (b-scale) information into the active shape model (ASM). For object delineation, an iterative graph-cut-ASM (IGCASM) algorithm is proposed, which effectively combines the rich statistical shape information embodied in ASM with the globally optimal delineation capability of the GC method. The presented IGCASM algorithm is a 3D generalization of the 2D GC-ASM method that they proposed previously in Chen et al. [Proc. SPIE, 7259, 72590C1-72590C-8 (2009)]. The proposed methods are tested on two datasets comprised of images obtained from 20 patients (10 male and 10 female) of clinical abdominal CT scans, and 11 foot magnetic resonance imaging (MRI) scans. The test is for four organs (liver, left and right kidneys, and spleen) segmentation, five foot bones (calcaneus, tibia, cuboid, talus, and navicular). The recognition and delineation accuracies were evaluated separately. The recognition accuracy was evaluated in terms of translation, rotation, and scale (size) error. The delineation accuracy was evaluated in terms of true and false positive volume fractions (TPVF, FPVF). The efficiency of the delineation method was also evaluated on an Intel Pentium IV PC with a 3.4 GHZ CPU machine. The recognition accuracies in terms of translation, rotation, and scale error over all organs are about 8 mm, 10 degrees and 0.03, and over all foot bones are about 3.5709 mm, 0.35 degrees and 0.025, respectively. The accuracy of delineation over all organs for all subjects as expressed in TPVF and FPVF is 93.01% and 0.22%, and all foot bones for all subjects are 93.75% and 0.28%, respectively. While the delineations for the four organs can be accomplished quite rapidly with average of 78 s, the delineations for the five foot bones can be accomplished with average of 70 s. The experimental results showed the feasibility and efficacy of the proposed automatic anatomy segmentation system: (a) the incorporation of shape priors into the GC framework is feasible in 3D as demonstrated previously for 2D images; (b) our results in 3D confirm the accuracy behavior observed in 2D. The hybrid strategy IGCASM seems to be more robust and accurate than ASM and GC individually; and (c) delineations within body regions and foot bones of clinical importance can be accomplished quite rapidly within 1.5 min.

  15. Rapid Word Recognition as a Measure of Word-Level Automaticity and Its Relation to Other Measures of Reading

    ERIC Educational Resources Information Center

    Frye, Elizabeth M.; Gosky, Ross

    2012-01-01

    The present study investigated the relationship between rapid recognition of individual words (Word Recognition Test) and two measures of contextual reading: (1) grade-level Passage Reading Test (IRI passage) and (2) performance on standardized STAR Reading Test. To establish if time of presentation on the word recognition test was a factor in…

  16. Voice reaction times with recognition for Commodore computers

    NASA Technical Reports Server (NTRS)

    Washburn, David A.; Putney, R. Thompson

    1990-01-01

    Hardware and software modifications are presented that allow for collection and recognition by a Commodore computer of spoken responses. Responses are timed with millisecond accuracy and automatically analyzed and scored. Accuracy data for this device from several experiments are presented. Potential applications and suggestions for improving recognition accuracy are also discussed.

  17. Automatic Intention Recognition in Conversation Processing

    ERIC Educational Resources Information Center

    Holtgraves, Thomas

    2008-01-01

    A fundamental assumption of many theories of conversation is that comprehension of a speaker's utterance involves recognition of the speaker's intention in producing that remark. However, the nature of intention recognition is not clear. One approach is to conceptualize a speaker's intention in terms of speech acts [Searle, J. (1969). "Speech…

  18. Image/text automatic indexing and retrieval system using context vector approach

    NASA Astrophysics Data System (ADS)

    Qing, Kent P.; Caid, William R.; Ren, Clara Z.; McCabe, Patrick

    1995-11-01

    Thousands of documents and images are generated daily both on and off line on the information superhighway and other media. Storage technology has improved rapidly to handle these data but indexing this information is becoming very costly. HNC Software Inc. has developed a technology for automatic indexing and retrieval of free text and images. This technique is demonstrated and is based on the concept of `context vectors' which encode a succinct representation of the associated text and features of sub-image. In this paper, we will describe the Automated Librarian System which was designed for free text indexing and the Image Content Addressable Retrieval System (ICARS) which extends the technique from the text domain into the image domain. Both systems have the ability to automatically assign indices for a new document and/or image based on the content similarities in the database. ICARS also has the capability to retrieve images based on similarity of content using index terms, text description, and user-generated images as a query without performing segmentation or object recognition.

  19. Automatic localization of IASLC-defined mediastinal lymph node stations on CT images using fuzzy models

    NASA Astrophysics Data System (ADS)

    Matsumoto, Monica M. S.; Beig, Niha G.; Udupa, Jayaram K.; Archer, Steven; Torigian, Drew A.

    2014-03-01

    Lung cancer is associated with the highest cancer mortality rates among men and women in the United States. The accurate and precise identification of the lymph node stations on computed tomography (CT) images is important for staging disease and potentially for prognosticating outcome in patients with lung cancer, as well as for pretreatment planning and response assessment purposes. To facilitate a standard means of referring to lymph nodes, the International Association for the Study of Lung Cancer (IASLC) has recently proposed a definition of the different lymph node stations and zones in the thorax. However, nodal station identification is typically performed manually by visual assessment in clinical radiology. This approach leaves room for error due to the subjective and potentially ambiguous nature of visual interpretation, and is labor intensive. We present a method of automatically recognizing the mediastinal IASLC-defined lymph node stations by modifying a hierarchical fuzzy modeling approach previously developed for body-wide automatic anatomy recognition (AAR) in medical imagery. Our AAR-lymph node (AAR-LN) system follows the AAR methodology and consists of two steps. In the first step, the various lymph node stations are manually delineated on a set of CT images following the IASLC definitions. These delineations are then used to build a fuzzy hierarchical model of the nodal stations which are considered as 3D objects. In the second step, the stations are automatically located on any given CT image of the thorax by using the hierarchical fuzzy model and object recognition algorithms. Based on 23 data sets used for model building, 22 independent data sets for testing, and 10 lymph node stations, a mean localization accuracy of within 1-6 voxels has been achieved by the AAR-LN system.

  20. A novel automatic method for monitoring Tourette motor tics through a wearable device.

    PubMed

    Bernabei, Michel; Preatoni, Ezio; Mendez, Martin; Piccini, Luca; Porta, Mauro; Andreoni, Giuseppe

    2010-09-15

    The aim of this study was to propose a novel automatic method for quantifying motor-tics caused by the Tourette Syndrome (TS). In this preliminary report, the feasibility of the monitoring process was tested over a series of standard clinical trials in a population of 12 subjects affected by TS. A wearable instrument with an embedded three-axial accelerometer was used to detect and classify motor tics during standing and walking activities. An algorithm was devised to analyze acceleration data by: eliminating noise; detecting peaks connected to pathological events; and classifying intensity and frequency of motor tics into quantitative scores. These indexes were compared with the video-based ones provided by expert clinicians, which were taken as the gold-standard. Sensitivity, specificity, and accuracy of tic detection were estimated, and an agreement analysis was performed through the least square regression and the Bland-Altman test. The tic recognition algorithm showed sensitivity = 80.8% ± 8.5% (mean ± SD), specificity = 75.8% ± 17.3%, and accuracy = 80.5% ± 12.2%. The agreement study showed that automatic detection tended to overestimate the number of tics occurred. Although, it appeared this may be a systematic error due to the different recognition principles of the wearable and video-based systems. Furthermore, there was substantial concurrency with the gold-standard in estimating the severity indexes. The proposed methodology gave promising performances in terms of automatic motor-tics detection and classification in a standard clinical context. The system may provide physicians with a quantitative aid for TS assessment. Further developments will focus on the extension of its application to everyday long-term monitoring out of clinical environments. © 2010 Movement Disorder Society.

  1. SU-D-201-05: On the Automatic Recognition of Patient Safety Hazards in a Radiotherapy Setup Using a Novel 3D Camera System and a Deep Learning Framework

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Santhanam, A; Min, Y; Beron, P

    Purpose: Patient safety hazards such as a wrong patient/site getting treated can lead to catastrophic results. The purpose of this project is to automatically detect potential patient safety hazards during the radiotherapy setup and alert the therapist before the treatment is initiated. Methods: We employed a set of co-located and co-registered 3D cameras placed inside the treatment room. Each camera provided a point-cloud of fraxels (fragment pixels with 3D depth information). Each of the cameras were calibrated using a custom-built calibration target to provide 3D information with less than 2 mm error in the 500 mm neighborhood around the isocenter.more » To identify potential patient safety hazards, the treatment room components and the patient’s body needed to be identified and tracked in real-time. For feature recognition purposes, we used a graph-cut based feature recognition with principal component analysis (PCA) based feature-to-object correlation to segment the objects in real-time. Changes in the object’s position were tracked using the CamShift algorithm. The 3D object information was then stored for each classified object (e.g. gantry, couch). A deep learning framework was then used to analyze all the classified objects in both 2D and 3D and was then used to fine-tune a convolutional network for object recognition. The number of network layers were optimized to identify the tracked objects with >95% accuracy. Results: Our systematic analyses showed that, the system was effectively able to recognize wrong patient setups and wrong patient accessories. The combined usage of 2D camera information (color + depth) enabled a topology-preserving approach to verify patient safety hazards in an automatic manner and even in scenarios where the depth information is partially available. Conclusion: By utilizing the 3D cameras inside the treatment room and a deep learning based image classification, potential patient safety hazards can be effectively avoided.« less

  2. Recognition of surface lithologic and topographic patterns in southwest Colorado with ADP techniques

    NASA Technical Reports Server (NTRS)

    Melhorn, W. N.; Sinnock, S.

    1973-01-01

    Analysis of ERTS-1 multispectral data by automatic pattern recognition procedures is applicable toward grappling with current and future resource stresses by providing a means for refining existing geologic maps. The procedures used in the current analysis already yield encouraging results toward the eventual machine recognition of extensive surface lithologic and topographic patterns. Automatic mapping of a series of hogbacks, strike valleys, and alluvial surfaces along the northwest flank of the San Juan Basin in Colorado can be obtained by minimal man-machine interaction. The determination of causes for separable spectral signatures is dependent upon extensive correlation of micro- and macro field based ground truth observations and aircraft underflight data with the satellite data.

  3. Histogram equalization with Bayesian estimation for noise robust speech recognition.

    PubMed

    Suh, Youngjoo; Kim, Hoirin

    2018-02-01

    The histogram equalization approach is an efficient feature normalization technique for noise robust automatic speech recognition. However, it suffers from performance degradation when some fundamental conditions are not satisfied in the test environment. To remedy these limitations of the original histogram equalization methods, class-based histogram equalization approach has been proposed. Although this approach showed substantial performance improvement under noise environments, it still suffers from performance degradation due to the overfitting problem when test data are insufficient. To address this issue, the proposed histogram equalization technique employs the Bayesian estimation method in the test cumulative distribution function estimation. It was reported in a previous study conducted on the Aurora-4 task that the proposed approach provided substantial performance gains in speech recognition systems based on the acoustic modeling of the Gaussian mixture model-hidden Markov model. In this work, the proposed approach was examined in speech recognition systems with deep neural network-hidden Markov model (DNN-HMM), the current mainstream speech recognition approach where it also showed meaningful performance improvement over the conventional maximum likelihood estimation-based method. The fusion of the proposed features with the mel-frequency cepstral coefficients provided additional performance gains in DNN-HMM systems, which otherwise suffer from performance degradation in the clean test condition.

  4. Pattern recognition for passive polarimetric data using nonparametric classifiers

    NASA Astrophysics Data System (ADS)

    Thilak, Vimal; Saini, Jatinder; Voelz, David G.; Creusere, Charles D.

    2005-08-01

    Passive polarization based imaging is a useful tool in computer vision and pattern recognition. A passive polarization imaging system forms a polarimetric image from the reflection of ambient light that contains useful information for computer vision tasks such as object detection (classification) and recognition. Applications of polarization based pattern recognition include material classification and automatic shape recognition. In this paper, we present two target detection algorithms for images captured by a passive polarimetric imaging system. The proposed detection algorithms are based on Bayesian decision theory. In these approaches, an object can belong to one of any given number classes and classification involves making decisions that minimize the average probability of making incorrect decisions. This minimum is achieved by assigning an object to the class that maximizes the a posteriori probability. Computing a posteriori probabilities requires estimates of class conditional probability density functions (likelihoods) and prior probabilities. A Probabilistic neural network (PNN), which is a nonparametric method that can compute Bayes optimal boundaries, and a -nearest neighbor (KNN) classifier, is used for density estimation and classification. The proposed algorithms are applied to polarimetric image data gathered in the laboratory with a liquid crystal-based system. The experimental results validate the effectiveness of the above algorithms for target detection from polarimetric data.

  5. Computer aided analysis of gait patterns in patients with acute anterior cruciate ligament injury.

    PubMed

    Christian, Josef; Kröll, Josef; Strutzenberger, Gerda; Alexander, Nathalie; Ofner, Michael; Schwameder, Hermann

    2016-03-01

    Gait analysis is a useful tool to evaluate the functional status of patients with anterior cruciate ligament injury. Pattern recognition methods can be used to automatically assess walking patterns and objectively support clinical decisions. This study aimed to test a pattern recognition system for analyzing kinematic gait patterns of recently anterior cruciate ligament injured patients and for evaluating the effects of a therapeutic treatment. Gait kinematics of seven male patients with an acute unilateral anterior cruciate ligament rupture and seven healthy males were recorded. A support vector machine was trained to distinguish the groups. Principal component analysis and recursive feature elimination were used to extract features from 3D marker trajectories. A Classifier Oriented Gait Score was defined as a measure of gait quality. Visualizations were used to allow functional interpretations of characteristic group differences. The injured group was evaluated by the system after a therapeutic treatment. The results were compared against a clinical rating of the patients' gait. Cross validation yielded 100% accuracy. After the treatment the score improved significantly (P<0.01) as well as the clinical rating (P<0.05). The visualizations revealed characteristic kinematic features, which differentiated between the groups. The results show that gait alterations in the early phase after anterior cruciate ligament injury can be detected automatically. The results of the automatic analysis are comparable with the clinical rating and support the validity of the system. The visualizations allow interpretations on discriminatory features and can facilitate the integration of the results into the diagnostic process. Copyright © 2016 Elsevier Ltd. All rights reserved.

  6. Automatic image orientation detection via confidence-based integration of low-level and semantic cues.

    PubMed

    Luo, Jiebo; Boutell, Matthew

    2005-05-01

    Automatic image orientation detection for natural images is a useful, yet challenging research topic. Humans use scene context and semantic object recognition to identify the correct image orientation. However, it is difficult for a computer to perform the task in the same way because current object recognition algorithms are extremely limited in their scope and robustness. As a result, existing orientation detection methods were built upon low-level vision features such as spatial distributions of color and texture. Discrepant detection rates have been reported for these methods in the literature. We have developed a probabilistic approach to image orientation detection via confidence-based integration of low-level and semantic cues within a Bayesian framework. Our current accuracy is 90 percent for unconstrained consumer photos, impressive given the findings of a psychophysical study conducted recently. The proposed framework is an attempt to bridge the gap between computer and human vision systems and is applicable to other problems involving semantic scene content understanding.

  7. Semi-automatic recognition of marine debris on beaches

    NASA Astrophysics Data System (ADS)

    Ge, Zhenpeng; Shi, Huahong; Mei, Xuefei; Dai, Zhijun; Li, Daoji

    2016-05-01

    An increasing amount of anthropogenic marine debris is pervading the earth’s environmental systems, resulting in an enormous threat to living organisms. Additionally, the large amount of marine debris around the world has been investigated mostly through tedious manual methods. Therefore, we propose the use of a new technique, light detection and ranging (LIDAR), for the semi-automatic recognition of marine debris on a beach because of its substantially more efficient role in comparison with other more laborious methods. Our results revealed that LIDAR should be used for the classification of marine debris into plastic, paper, cloth and metal. Additionally, we reconstructed a 3-dimensional model of different types of debris on a beach with a high validity of debris revivification using LIDAR-based individual separation. These findings demonstrate that the availability of this new technique enables detailed observations to be made of debris on a large beach that was previously not possible. It is strongly suggested that LIDAR could be implemented as an appropriate monitoring tool for marine debris by global researchers and governments.

  8. Hierarchical classification of dynamically varying radar pulse repetition interval modulation patterns.

    PubMed

    Kauppi, Jukka-Pekka; Martikainen, Kalle; Ruotsalainen, Ulla

    2010-12-01

    The central purpose of passive signal intercept receivers is to perform automatic categorization of unknown radar signals. Currently, there is an urgent need to develop intelligent classification algorithms for these devices due to emerging complexity of radar waveforms. Especially multifunction radars (MFRs) capable of performing several simultaneous tasks by utilizing complex, dynamically varying scheduled waveforms are a major challenge for automatic pattern classification systems. To assist recognition of complex radar emissions in modern intercept receivers, we have developed a novel method to recognize dynamically varying pulse repetition interval (PRI) modulation patterns emitted by MFRs. We use robust feature extraction and classifier design techniques to assist recognition in unpredictable real-world signal environments. We classify received pulse trains hierarchically which allows unambiguous detection of the subpatterns using a sliding window. Accuracy, robustness and reliability of the technique are demonstrated with extensive simulations using both static and dynamically varying PRI modulation patterns. Copyright © 2010 Elsevier Ltd. All rights reserved.

  9. How should a speech recognizer work?

    PubMed

    Scharenborg, Odette; Norris, Dennis; Bosch, Louis; McQueen, James M

    2005-11-12

    Although researchers studying human speech recognition (HSR) and automatic speech recognition (ASR) share a common interest in how information processing systems (human or machine) recognize spoken language, there is little communication between the two disciplines. We suggest that this lack of communication follows largely from the fact that research in these related fields has focused on the mechanics of how speech can be recognized. In Marr's (1982) terms, emphasis has been on the algorithmic and implementational levels rather than on the computational level. In this article, we provide a computational-level analysis of the task of speech recognition, which reveals the close parallels between research concerned with HSR and ASR. We illustrate this relation by presenting a new computational model of human spoken-word recognition, built using techniques from the field of ASR that, in contrast to current existing models of HSR, recognizes words from real speech input. 2005 Lawrence Erlbaum Associates, Inc.

  10. Fuzzy support vector machines for adaptive Morse code recognition.

    PubMed

    Yang, Cheng-Hong; Jin, Li-Cheng; Chuang, Li-Yeh

    2006-11-01

    Morse code is now being harnessed for use in rehabilitation applications of augmentative-alternative communication and assistive technology, facilitating mobility, environmental control and adapted worksite access. In this paper, Morse code is selected as a communication adaptive device for persons who suffer from muscle atrophy, cerebral palsy or other severe handicaps. A stable typing rate is strictly required for Morse code to be effective as a communication tool. Therefore, an adaptive automatic recognition method with a high recognition rate is needed. The proposed system uses both fuzzy support vector machines and the variable-degree variable-step-size least-mean-square algorithm to achieve these objectives. We apply fuzzy memberships to each point, and provide different contributions to the decision learning function for support vector machines. Statistical analyses demonstrated that the proposed method elicited a higher recognition rate than other algorithms in the literature.

  11. Automatic integration of social information in emotion recognition.

    PubMed

    Mumenthaler, Christian; Sander, David

    2015-04-01

    This study investigated the automaticity of the influence of social inference on emotion recognition. Participants were asked to recognize dynamic facial expressions of emotion (fear or anger in Experiment 1 and blends of fear and surprise or of anger and disgust in Experiment 2) in a target face presented at the center of a screen while a subliminal contextual face appearing in the periphery expressed an emotion (fear or anger) or not (neutral) and either looked at the target face or not. Results of Experiment 1 revealed that recognition of the target emotion of fear was improved when a subliminal angry contextual face gazed toward-rather than away from-the fearful face. We replicated this effect in Experiment 2, in which facial expression blends of fear and surprise were more often and more rapidly categorized as expressing fear when the subliminal contextual face expressed anger and gazed toward-rather than away from-the target face. With the contextual face appearing for 30 ms in total, including only 10 ms of emotion expression, and being immediately masked, our data provide the first evidence that social influence on emotion recognition can occur automatically. (c) 2015 APA, all rights reserved).

  12. Automatic lip reading by using multimodal visual features

    NASA Astrophysics Data System (ADS)

    Takahashi, Shohei; Ohya, Jun

    2013-12-01

    Since long time ago, speech recognition has been researched, though it does not work well in noisy places such as in the car or in the train. In addition, people with hearing-impaired or difficulties in hearing cannot receive benefits from speech recognition. To recognize the speech automatically, visual information is also important. People understand speeches from not only audio information, but also visual information such as temporal changes in the lip shape. A vision based speech recognition method could work well in noisy places, and could be useful also for people with hearing disabilities. In this paper, we propose an automatic lip-reading method for recognizing the speech by using multimodal visual information without using any audio information such as speech recognition. First, the ASM (Active Shape Model) is used to track and detect the face and lip in a video sequence. Second, the shape, optical flow and spatial frequencies of the lip features are extracted from the lip detected by ASM. Next, the extracted multimodal features are ordered chronologically so that Support Vector Machine is performed in order to learn and classify the spoken words. Experiments for classifying several words show promising results of this proposed method.

  13. I Hear You Eat and Speak: Automatic Recognition of Eating Condition and Food Type, Use-Cases, and Impact on ASR Performance

    PubMed Central

    Hantke, Simone; Weninger, Felix; Kurle, Richard; Ringeval, Fabien; Batliner, Anton; Mousa, Amr El-Desoky; Schuller, Björn

    2016-01-01

    We propose a new recognition task in the area of computational paralinguistics: automatic recognition of eating conditions in speech, i. e., whether people are eating while speaking, and what they are eating. To this end, we introduce the audio-visual iHEARu-EAT database featuring 1.6 k utterances of 30 subjects (mean age: 26.1 years, standard deviation: 2.66 years, gender balanced, German speakers), six types of food (Apple, Nectarine, Banana, Haribo Smurfs, Biscuit, and Crisps), and read as well as spontaneous speech, which is made publicly available for research purposes. We start with demonstrating that for automatic speech recognition (ASR), it pays off to know whether speakers are eating or not. We also propose automatic classification both by brute-forcing of low-level acoustic features as well as higher-level features related to intelligibility, obtained from an Automatic Speech Recogniser. Prediction of the eating condition was performed with a Support Vector Machine (SVM) classifier employed in a leave-one-speaker-out evaluation framework. Results show that the binary prediction of eating condition (i. e., eating or not eating) can be easily solved independently of the speaking condition; the obtained average recalls are all above 90%. Low-level acoustic features provide the best performance on spontaneous speech, which reaches up to 62.3% average recall for multi-way classification of the eating condition, i. e., discriminating the six types of food, as well as not eating. The early fusion of features related to intelligibility with the brute-forced acoustic feature set improves the performance on read speech, reaching a 66.4% average recall for the multi-way classification task. Analysing features and classifier errors leads to a suitable ordinal scale for eating conditions, on which automatic regression can be performed with up to 56.2% determination coefficient. PMID:27176486

  14. [Advantages and Application Prospects of Deep Learning in Image Recognition and Bone Age Assessment].

    PubMed

    Hu, T H; Wan, L; Liu, T A; Wang, M W; Chen, T; Wang, Y H

    2017-12-01

    Deep learning and neural network models have been new research directions and hot issues in the fields of machine learning and artificial intelligence in recent years. Deep learning has made a breakthrough in the applications of image and speech recognitions, and also has been extensively used in the fields of face recognition and information retrieval because of its special superiority. Bone X-ray images express different variations in black-white-gray gradations, which have image features of black and white contrasts and level differences. Based on these advantages of deep learning in image recognition, we combine it with the research of bone age assessment to provide basic datum for constructing a forensic automatic system of bone age assessment. This paper reviews the basic concept and network architectures of deep learning, and describes its recent research progress on image recognition in different research fields at home and abroad, and explores its advantages and application prospects in bone age assessment. Copyright© by the Editorial Department of Journal of Forensic Medicine.

  15. Automatic Recognition of Fetal Facial Standard Plane in Ultrasound Image via Fisher Vector.

    PubMed

    Lei, Baiying; Tan, Ee-Leng; Chen, Siping; Zhuo, Liu; Li, Shengli; Ni, Dong; Wang, Tianfu

    2015-01-01

    Acquisition of the standard plane is the prerequisite of biometric measurement and diagnosis during the ultrasound (US) examination. In this paper, a new algorithm is developed for the automatic recognition of the fetal facial standard planes (FFSPs) such as the axial, coronal, and sagittal planes. Specifically, densely sampled root scale invariant feature transform (RootSIFT) features are extracted and then encoded by Fisher vector (FV). The Fisher network with multi-layer design is also developed to extract spatial information to boost the classification performance. Finally, automatic recognition of the FFSPs is implemented by support vector machine (SVM) classifier based on the stochastic dual coordinate ascent (SDCA) algorithm. Experimental results using our dataset demonstrate that the proposed method achieves an accuracy of 93.27% and a mean average precision (mAP) of 99.19% in recognizing different FFSPs. Furthermore, the comparative analyses reveal the superiority of the proposed method based on FV over the traditional methods.

  16. Intelligent Facial Recognition Systems: Technology advancements for security applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Beer, C.L.

    1993-07-01

    Insider problems such as theft and sabotage can occur within the security and surveillance realm of operations when unauthorized people obtain access to sensitive areas. A possible solution to these problems is a means to identify individuals (not just credentials or badges) in a given sensitive area and provide full time personnel accountability. One approach desirable at Department of Energy facilities for access control and/or personnel identification is an Intelligent Facial Recognition System (IFRS) that is non-invasive to personnel. Automatic facial recognition does not require the active participation of the enrolled subjects, unlike most other biological measurement (biometric) systems (e.g.,more » fingerprint, hand geometry, or eye retinal scan systems). It is this feature that makes an IFRS attractive for applications other than access control such as emergency evacuation verification, screening, and personnel tracking. This paper discusses current technology that shows promising results for DOE and other security applications. A survey of research and development in facial recognition identified several companies and universities that were interested and/or involved in the area. A few advanced prototype systems were also identified. Sandia National Laboratories is currently evaluating facial recognition systems that are in the advanced prototype stage. The initial application for the evaluation is access control in a controlled environment with a constant background and with cooperative subjects. Further evaluations will be conducted in a less controlled environment, which may include a cluttered background and subjects that are not looking towards the camera. The outcome of the evaluations will help identify areas of facial recognition systems that need further development and will help to determine the effectiveness of the current systems for security applications.« less

  17. Development of coffee maker service robot using speech and face recognition systems using POMDP

    NASA Astrophysics Data System (ADS)

    Budiharto, Widodo; Meiliana; Santoso Gunawan, Alexander Agung

    2016-07-01

    There are many development of intelligent service robot in order to interact with user naturally. This purpose can be done by embedding speech and face recognition ability on specific tasks to the robot. In this research, we would like to propose Intelligent Coffee Maker Robot which the speech recognition is based on Indonesian language and powered by statistical dialogue systems. This kind of robot can be used in the office, supermarket or restaurant. In our scenario, robot will recognize user's face and then accept commands from the user to do an action, specifically in making a coffee. Based on our previous work, the accuracy for speech recognition is about 86% and face recognition is about 93% in laboratory experiments. The main problem in here is to know the intention of user about how sweetness of the coffee. The intelligent coffee maker robot should conclude the user intention through conversation under unreliable automatic speech in noisy environment. In this paper, this spoken dialog problem is treated as a partially observable Markov decision process (POMDP). We describe how this formulation establish a promising framework by empirical results. The dialog simulations are presented which demonstrate significant quantitative outcome.

  18. [Extraction and recognition of attractors in three-dimensional Lorenz plot].

    PubMed

    Hu, Min; Jang, Chengfan; Wang, Suxia

    2018-02-01

    Lorenz plot (LP) method which gives a global view of long-time electrocardiogram signals, is an efficient simple visualization tool to analyze cardiac arrhythmias, and the morphologies and positions of the extracted attractors may reveal the underlying mechanisms of the onset and termination of arrhythmias. But automatic diagnosis is still impossible because it is lack of the method of extracting attractors by now. We presented here a methodology of attractor extraction and recognition based upon homogeneously statistical properties of the location parameters of scatter points in three dimensional LP (3DLP), which was constructed by three successive RR intervals as X , Y and Z axis in Cartesian coordinate system. Validation experiments were tested in a group of RR-interval time series and tags data with frequent unifocal premature complexes exported from a 24-hour Holter system. The results showed that this method had excellent effective not only on extraction of attractors, but also on automatic recognition of attractors by the location parameters such as the azimuth of the points peak frequency ( A PF ) of eccentric attractors once stereographic projection of 3DLP along the space diagonal. Besides, A PF was still a powerful index of differential diagnosis of atrial and ventricular extrasystole. Additional experiments proved that this method was also available on several other arrhythmias. Moreover, there were extremely relevant relationships between 3DLP and two dimensional LPs which indicate any conventional achievement of LPs could be implanted into 3DLP. It would have a broad application prospect to integrate this method into conventional long-time electrocardiogram monitoring and analysis system.

  19. Real time biometric surveillance with gait recognition

    NASA Astrophysics Data System (ADS)

    Mohapatra, Subasish; Swain, Anisha; Das, Manaswini; Mohanty, Subhadarshini

    2018-04-01

    Bio metric surveillance has become indispensable for every system in the recent years. The contribution of bio metric authentication, identification, and screening purposes are widely used in various domains for preventing unauthorized access. A large amount of data needs to be updated, segregated and safeguarded from malicious software and misuse. Bio metrics is the intrinsic characteristics of each individual. Recently fingerprints, iris, passwords, unique keys, and cards are commonly used for authentication purposes. These methods have various issues related to security and confidentiality. These systems are not yet automated to provide the safety and security. The gait recognition system is the alternative for overcoming the drawbacks of the recent bio metric based authentication systems. Gait recognition is newer as it hasn't been implemented in the real-world scenario so far. This is an un-intrusive system that requires no knowledge or co-operation of the subject. Gait is a unique behavioral characteristic of every human being which is hard to imitate. The walking style of an individual teamed with the orientation of joints in the skeletal structure and inclinations between them imparts the unique characteristic. A person can alter one's own external appearance but not skeletal structure. These are real-time, automatic systems that can even process low-resolution images and video frames. In this paper, we have proposed a gait recognition system and compared the performance with conventional bio metric identification systems.

  20. PCI bus content-addressable-memory (CAM) implementation on FPGA for pattern recognition/image retrieval in a distributed environment

    NASA Astrophysics Data System (ADS)

    Megherbi, Dalila B.; Yan, Yin; Tanmay, Parikh; Khoury, Jed; Woods, C. L.

    2004-11-01

    Recently surveillance and Automatic Target Recognition (ATR) applications are increasing as the cost of computing power needed to process the massive amount of information continues to fall. This computing power has been made possible partly by the latest advances in FPGAs and SOPCs. In particular, to design and implement state-of-the-Art electro-optical imaging systems to provide advanced surveillance capabilities, there is a need to integrate several technologies (e.g. telescope, precise optics, cameras, image/compute vision algorithms, which can be geographically distributed or sharing distributed resources) into a programmable system and DSP systems. Additionally, pattern recognition techniques and fast information retrieval, are often important components of intelligent systems. The aim of this work is using embedded FPGA as a fast, configurable and synthesizable search engine in fast image pattern recognition/retrieval in a distributed hardware/software co-design environment. In particular, we propose and show a low cost Content Addressable Memory (CAM)-based distributed embedded FPGA hardware architecture solution with real time recognition capabilities and computing for pattern look-up, pattern recognition, and image retrieval. We show how the distributed CAM-based architecture offers a performance advantage of an order-of-magnitude over RAM-based architecture (Random Access Memory) search for implementing high speed pattern recognition for image retrieval. The methods of designing, implementing, and analyzing the proposed CAM based embedded architecture are described here. Other SOPC solutions/design issues are covered. Finally, experimental results, hardware verification, and performance evaluations using both the Xilinx Virtex-II and the Altera Apex20k are provided to show the potential and power of the proposed method for low cost reconfigurable fast image pattern recognition/retrieval at the hardware/software co-design level.

  1. Interactive display/graphics systems for remote sensor data analysis.

    NASA Technical Reports Server (NTRS)

    Eppler, W. G.; Loe, D. L.; Wilson, E. L.; Whitley, S. L.; Sachen, R. J.

    1971-01-01

    Using a color-television display system and interactive graphics equipment on-line to an IBM 360/44 computer, investigators at the Manned Spacecraft Center have developed a variety of interactive displays which aid in analyzing remote sensor data. This paper describes how such interactive displays are used to: (1) analyze data from a multispectral scanner, (2) develop automatic pattern recognition systems based on multispectral scanner measurements, and (3) analyze data from nonimaging sensors such as the infrared radiometer and microwave scatterometer.

  2. An adaptive Hidden Markov Model for activity recognition based on a wearable multi-sensor device

    USDA-ARS?s Scientific Manuscript database

    Human activity recognition is important in the study of personal health, wellness and lifestyle. In order to acquire human activity information from the personal space, many wearable multi-sensor devices have been developed. In this paper, a novel technique for automatic activity recognition based o...

  3. Distributed pheromone-based swarming control of unmanned air and ground vehicles for RSTA

    NASA Astrophysics Data System (ADS)

    Sauter, John A.; Mathews, Robert S.; Yinger, Andrew; Robinson, Joshua S.; Moody, John; Riddle, Stephanie

    2008-04-01

    The use of unmanned vehicles in Reconnaissance, Surveillance, and Target Acquisition (RSTA) applications has received considerable attention recently. Cooperating land and air vehicles can support multiple sensor modalities providing pervasive and ubiquitous broad area sensor coverage. However coordination of multiple air and land vehicles serving different mission objectives in a dynamic and complex environment is a challenging problem. Swarm intelligence algorithms, inspired by the mechanisms used in natural systems to coordinate the activities of many entities provide a promising alternative to traditional command and control approaches. This paper describes recent advances in a fully distributed digital pheromone algorithm that has demonstrated its effectiveness in managing the complexity of swarming unmanned systems. The results of a recent demonstration at NASA's Wallops Island of multiple Aerosonde Unmanned Air Vehicles (UAVs) and Pioneer Unmanned Ground Vehicles (UGVs) cooperating in a coordinated RSTA application are discussed. The vehicles were autonomously controlled by the onboard digital pheromone responding to the needs of the automatic target recognition algorithms. UAVs and UGVs controlled by the same pheromone algorithm self-organized to perform total area surveillance, automatic target detection, sensor cueing, and automatic target recognition with no central processing or control and minimal operator input. Complete autonomy adds several safety and fault tolerance requirements which were integrated into the basic pheromone framework. The adaptive algorithms demonstrated the ability to handle some unplanned hardware failures during the demonstration without any human intervention. The paper describes lessons learned and the next steps for this promising technology.

  4. Fully automatized renal parenchyma volumetry using a support vector machine based recognition system for subject-specific probability map generation in native MR volume data

    NASA Astrophysics Data System (ADS)

    Gloger, Oliver; Tönnies, Klaus; Mensel, Birger; Völzke, Henry

    2015-11-01

    In epidemiological studies as well as in clinical practice the amount of produced medical image data strongly increased in the last decade. In this context organ segmentation in MR volume data gained increasing attention for medical applications. Especially in large-scale population-based studies organ volumetry is highly relevant requiring exact organ segmentation. Since manual segmentation is time-consuming and prone to reader variability, large-scale studies need automatized methods to perform organ segmentation. Fully automatic organ segmentation in native MR image data has proven to be a very challenging task. Imaging artifacts as well as inter- and intrasubject MR-intensity differences complicate the application of supervised learning strategies. Thus, we propose a modularized framework of a two-stepped probabilistic approach that generates subject-specific probability maps for renal parenchyma tissue, which are refined subsequently by using several, extended segmentation strategies. We present a three class-based support vector machine recognition system that incorporates Fourier descriptors as shape features to recognize and segment characteristic parenchyma parts. Probabilistic methods use the segmented characteristic parenchyma parts to generate high quality subject-specific parenchyma probability maps. Several refinement strategies including a final shape-based 3D level set segmentation technique are used in subsequent processing modules to segment renal parenchyma. Furthermore, our framework recognizes and excludes renal cysts from parenchymal volume, which is important to analyze renal functions. Volume errors and Dice coefficients show that our presented framework outperforms existing approaches.

  5. Fully automatized renal parenchyma volumetry using a support vector machine based recognition system for subject-specific probability map generation in native MR volume data.

    PubMed

    Gloger, Oliver; Tönnies, Klaus; Mensel, Birger; Völzke, Henry

    2015-11-21

    In epidemiological studies as well as in clinical practice the amount of produced medical image data strongly increased in the last decade. In this context organ segmentation in MR volume data gained increasing attention for medical applications. Especially in large-scale population-based studies organ volumetry is highly relevant requiring exact organ segmentation. Since manual segmentation is time-consuming and prone to reader variability, large-scale studies need automatized methods to perform organ segmentation. Fully automatic organ segmentation in native MR image data has proven to be a very challenging task. Imaging artifacts as well as inter- and intrasubject MR-intensity differences complicate the application of supervised learning strategies. Thus, we propose a modularized framework of a two-stepped probabilistic approach that generates subject-specific probability maps for renal parenchyma tissue, which are refined subsequently by using several, extended segmentation strategies. We present a three class-based support vector machine recognition system that incorporates Fourier descriptors as shape features to recognize and segment characteristic parenchyma parts. Probabilistic methods use the segmented characteristic parenchyma parts to generate high quality subject-specific parenchyma probability maps. Several refinement strategies including a final shape-based 3D level set segmentation technique are used in subsequent processing modules to segment renal parenchyma. Furthermore, our framework recognizes and excludes renal cysts from parenchymal volume, which is important to analyze renal functions. Volume errors and Dice coefficients show that our presented framework outperforms existing approaches.

  6. Automatic measurement and representation of prosodic features

    NASA Astrophysics Data System (ADS)

    Ying, Goangshiuan Shawn

    Effective measurement and representation of prosodic features of the acoustic signal for use in automatic speech recognition and understanding systems is the goal of this work. Prosodic features-stress, duration, and intonation-are variations of the acoustic signal whose domains are beyond the boundaries of each individual phonetic segment. Listeners perceive prosodic features through a complex combination of acoustic correlates such as intensity, duration, and fundamental frequency (F0). We have developed new tools to measure F0 and intensity features. We apply a probabilistic global error correction routine to an Average Magnitude Difference Function (AMDF) pitch detector. A new short-term frequency-domain Teager energy algorithm is used to measure the energy of a speech signal. We have conducted a series of experiments performing lexical stress detection on words in continuous English speech from two speech corpora. We have experimented with two different approaches, a segment-based approach and a rhythm unit-based approach, in lexical stress detection. The first approach uses pattern recognition with energy- and duration-based measurements as features to build Bayesian classifiers to detect the stress level of a vowel segment. In the second approach we define rhythm unit and use only the F0-based measurement and a scoring system to determine the stressed segment in the rhythm unit. A duration-based segmentation routine was developed to break polysyllabic words into rhythm units. The long-term goal of this work is to develop a system that can effectively detect the stress pattern for each word in continuous speech utterances. Stress information will be integrated as a constraint for pruning the word hypotheses in a word recognition system based on hidden Markov models.

  7. Speech Acquisition and Automatic Speech Recognition for Integrated Spacesuit Audio Systems

    NASA Technical Reports Server (NTRS)

    Huang, Yiteng; Chen, Jingdong; Chen, Shaoyan

    2010-01-01

    A voice-command human-machine interface system has been developed for spacesuit extravehicular activity (EVA) missions. A multichannel acoustic signal processing method has been created for distant speech acquisition in noisy and reverberant environments. This technology reduces noise by exploiting differences in the statistical nature of signal (i.e., speech) and noise that exists in the spatial and temporal domains. As a result, the automatic speech recognition (ASR) accuracy can be improved to the level at which crewmembers would find the speech interface useful. The developed speech human/machine interface will enable both crewmember usability and operational efficiency. It can enjoy a fast rate of data/text entry, small overall size, and can be lightweight. In addition, this design will free the hands and eyes of a suited crewmember. The system components and steps include beam forming/multi-channel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, model adaption, ASR HMM (Hidden Markov Model) training, and ASR decoding. A state-of-the-art phoneme recognizer can obtain an accuracy rate of 65 percent when the training and testing data are free of noise. When it is used in spacesuits, the rate drops to about 33 percent. With the developed microphone array speech-processing technologies, the performance is improved and the phoneme recognition accuracy rate rises to 44 percent. The recognizer can be further improved by combining the microphone array and HMM model adaptation techniques and using speech samples collected from inside spacesuits. In addition, arithmetic complexity models for the major HMMbased ASR components were developed. They can help real-time ASR system designers select proper tasks when in the face of constraints in computational resources.

  8. Novel Approaches to Improve Iris Recognition System Performance Based on Local Quality Evaluation and Feature Fusion

    PubMed Central

    2014-01-01

    For building a new iris template, this paper proposes a strategy to fuse different portions of iris based on machine learning method to evaluate local quality of iris. There are three novelties compared to previous work. Firstly, the normalized segmented iris is divided into multitracks and then each track is estimated individually to analyze the recognition accuracy rate (RAR). Secondly, six local quality evaluation parameters are adopted to analyze texture information of each track. Besides, particle swarm optimization (PSO) is employed to get the weights of these evaluation parameters and corresponding weighted coefficients of different tracks. Finally, all tracks' information is fused according to the weights of different tracks. The experimental results based on subsets of three public and one private iris image databases demonstrate three contributions of this paper. (1) Our experimental results prove that partial iris image cannot completely replace the entire iris image for iris recognition system in several ways. (2) The proposed quality evaluation algorithm is a self-adaptive algorithm, and it can automatically optimize the parameters according to iris image samples' own characteristics. (3) Our feature information fusion strategy can effectively improve the performance of iris recognition system. PMID:24693243

  9. Novel approaches to improve iris recognition system performance based on local quality evaluation and feature fusion.

    PubMed

    Chen, Ying; Liu, Yuanning; Zhu, Xiaodong; Chen, Huiling; He, Fei; Pang, Yutong

    2014-01-01

    For building a new iris template, this paper proposes a strategy to fuse different portions of iris based on machine learning method to evaluate local quality of iris. There are three novelties compared to previous work. Firstly, the normalized segmented iris is divided into multitracks and then each track is estimated individually to analyze the recognition accuracy rate (RAR). Secondly, six local quality evaluation parameters are adopted to analyze texture information of each track. Besides, particle swarm optimization (PSO) is employed to get the weights of these evaluation parameters and corresponding weighted coefficients of different tracks. Finally, all tracks' information is fused according to the weights of different tracks. The experimental results based on subsets of three public and one private iris image databases demonstrate three contributions of this paper. (1) Our experimental results prove that partial iris image cannot completely replace the entire iris image for iris recognition system in several ways. (2) The proposed quality evaluation algorithm is a self-adaptive algorithm, and it can automatically optimize the parameters according to iris image samples' own characteristics. (3) Our feature information fusion strategy can effectively improve the performance of iris recognition system.

  10. Prosody's Contribution to Fluency: An Examination of the Theory of Automatic Information Processing

    ERIC Educational Resources Information Center

    Schrauben, Julie E.

    2010-01-01

    LaBerge and Samuels' (1974) theory of automatic information processing in reading offers a model that explains how and where the processing of information occurs and the degree to which processing of information occurs. These processes are dependent upon two criteria: accurate word decoding and automatic word recognition. However, LaBerge and…

  11. Towards the Development of a Comprehensive Pedagogical Framework for Pronunciation Training Based on Adapted Automatic Speech Recognition Systems

    ERIC Educational Resources Information Center

    Ali, Saandia

    2016-01-01

    This paper reports on the early stages of a locally funded research and development project taking place at Rennes 2 university. It aims at developing a comprehensive pedagogical framework for pronunciation training for adult learners of English. This framework will combine a direct approach to pronunciation training (face-to-face teaching) with…

  12. SigVox - A 3D feature matching algorithm for automatic street object recognition in mobile laser scanning point clouds

    NASA Astrophysics Data System (ADS)

    Wang, Jinhu; Lindenbergh, Roderik; Menenti, Massimo

    2017-06-01

    Urban road environments contain a variety of objects including different types of lamp poles and traffic signs. Its monitoring is traditionally conducted by visual inspection, which is time consuming and expensive. Mobile laser scanning (MLS) systems sample the road environment efficiently by acquiring large and accurate point clouds. This work proposes a methodology for urban road object recognition from MLS point clouds. The proposed method uses, for the first time, shape descriptors of complete objects to match repetitive objects in large point clouds. To do so, a novel 3D multi-scale shape descriptor is introduced, that is embedded in a workflow that efficiently and automatically identifies different types of lamp poles and traffic signs. The workflow starts by tiling the raw point clouds along the scanning trajectory and by identifying non-ground points. After voxelization of the non-ground points, connected voxels are clustered to form candidate objects. For automatic recognition of lamp poles and street signs, a 3D significant eigenvector based shape descriptor using voxels (SigVox) is introduced. The 3D SigVox descriptor is constructed by first subdividing the points with an octree into several levels. Next, significant eigenvectors of the points in each voxel are determined by principal component analysis (PCA) and mapped onto the appropriate triangle of a sphere approximating icosahedron. This step is repeated for different scales. By determining the similarity of 3D SigVox descriptors between candidate point clusters and training objects, street furniture is automatically identified. The feasibility and quality of the proposed method is verified on two point clouds obtained in opposite direction of a stretch of road of 4 km. 6 types of lamp pole and 4 types of road sign were selected as objects of interest. Ground truth validation showed that the overall accuracy of the ∼170 automatically recognized objects is approximately 95%. The results demonstrate that the proposed method is able to recognize street furniture in a practical scenario. Remaining difficult cases are touching objects, like a lamp pole close to a tree.

  13. Acoustic landmarks contain more information about the phone string than other frames for automatic speech recognition with deep neural network acoustic model

    NASA Astrophysics Data System (ADS)

    He, Di; Lim, Boon Pang; Yang, Xuesong; Hasegawa-Johnson, Mark; Chen, Deming

    2018-06-01

    Most mainstream Automatic Speech Recognition (ASR) systems consider all feature frames equally important. However, acoustic landmark theory is based on a contradictory idea, that some frames are more important than others. Acoustic landmark theory exploits quantal non-linearities in the articulatory-acoustic and acoustic-perceptual relations to define landmark times at which the speech spectrum abruptly changes or reaches an extremum; frames overlapping landmarks have been demonstrated to be sufficient for speech perception. In this work, we conduct experiments on the TIMIT corpus, with both GMM and DNN based ASR systems and find that frames containing landmarks are more informative for ASR than others. We find that altering the level of emphasis on landmarks by re-weighting acoustic likelihood tends to reduce the phone error rate (PER). Furthermore, by leveraging the landmark as a heuristic, one of our hybrid DNN frame dropping strategies maintained a PER within 0.44% of optimal when scoring less than half (45.8% to be precise) of the frames. This hybrid strategy out-performs other non-heuristic-based methods and demonstrate the potential of landmarks for reducing computation.

  14. Automatic speech recognition in air-ground data link

    NASA Technical Reports Server (NTRS)

    Armstrong, Herbert B.

    1989-01-01

    In the present air traffic system, information presented to the transport aircraft cockpit crew may originate from a variety of sources and may be presented to the crew in visual or aural form, either through cockpit instrument displays or, most often, through voice communication. Voice radio communications are the most error prone method for air-ground data link. Voice messages can be misstated or misunderstood and radio frequency congestion can delay or obscure important messages. To prevent proliferation, a multiplexed data link display can be designed to present information from multiple data link sources on a shared cockpit display unit (CDU) or multi-function display (MFD) or some future combination of flight management and data link information. An aural data link which incorporates an automatic speech recognition (ASR) system for crew response offers several advantages over visual displays. The possibility of applying ASR to the air-ground data link was investigated. The first step was to review current efforts in ASR applications in the cockpit and in air traffic control and evaluated their possible data line application. Next, a series of preliminary research questions is to be developed for possible future collaboration.

  15. Clustering and classification of infrasonic events at Mount Etna using pattern recognition techniques

    NASA Astrophysics Data System (ADS)

    Cannata, A.; Montalto, P.; Aliotta, M.; Cassisi, C.; Pulvirenti, A.; Privitera, E.; Patanè, D.

    2011-04-01

    Active volcanoes generate sonic and infrasonic signals, whose investigation provides useful information for both monitoring purposes and the study of the dynamics of explosive phenomena. At Mt. Etna volcano (Italy), a pattern recognition system based on infrasonic waveform features has been developed. First, by a parametric power spectrum method, the features describing and characterizing the infrasound events were extracted: peak frequency and quality factor. Then, together with the peak-to-peak amplitude, these features constituted a 3-D ‘feature space’; by Density-Based Spatial Clustering of Applications with Noise algorithm (DBSCAN) three clusters were recognized inside it. After the clustering process, by using a common location method (semblance method) and additional volcanological information concerning the intensity of the explosive activity, we were able to associate each cluster to a particular source vent and/or a kind of volcanic activity. Finally, for automatic event location, clusters were used to train a model based on Support Vector Machine, calculating optimal hyperplanes able to maximize the margins of separation among the clusters. After the training phase this system automatically allows recognizing the active vent with no location algorithm and by using only a single station.

  16. Fast title extraction method for business documents

    NASA Astrophysics Data System (ADS)

    Katsuyama, Yutaka; Naoi, Satoshi

    1997-04-01

    Conventional electronic document filing systems are inconvenient because the user must specify the keywords in each document for later searches. To solve this problem, automatic keyword extraction methods using natural language processing and character recognition have been developed. However, these methods are slow, especially for japanese documents. To develop a practical electronic document filing system, we focused on the extraction of keyword areas from a document by image processing. Our fast title extraction method can automatically extract titles as keywords from business documents. All character strings are evaluated for similarity by rating points associated with title similarity. We classified these points as four items: character sitting size, position of character strings, relative position among character strings, and string attribution. Finally, the character string that has the highest rating is selected as the title area. The character recognition process is carried out on the selected area. It is fast because this process must recognize a small number of patterns in the restricted area only, and not throughout the entire document. The mean performance of this method is an accuracy of about 91 percent and a 1.8 sec. processing time for an examination of 100 Japanese business documents.

  17. Recognizing Whispered Speech Produced by an Individual with Surgically Reconstructed Larynx Using Articulatory Movement Data

    PubMed Central

    Cao, Beiming; Kim, Myungjong; Mau, Ted; Wang, Jun

    2017-01-01

    Individuals with larynx (vocal folds) impaired have problems in controlling their glottal vibration, producing whispered speech with extreme hoarseness. Standard automatic speech recognition using only acoustic cues is typically ineffective for whispered speech because the corresponding spectral characteristics are distorted. Articulatory cues such as the tongue and lip motion may help in recognizing whispered speech since articulatory motion patterns are generally not affected. In this paper, we investigated whispered speech recognition for patients with reconstructed larynx using articulatory movement data. A data set with both acoustic and articulatory motion data was collected from a patient with surgically reconstructed larynx using an electromagnetic articulograph. Two speech recognition systems, Gaussian mixture model-hidden Markov model (GMM-HMM) and deep neural network-HMM (DNN-HMM), were used in the experiments. Experimental results showed adding either tongue or lip motion data to acoustic features such as mel-frequency cepstral coefficient (MFCC) significantly reduced the phone error rates on both speech recognition systems. Adding both tongue and lip data achieved the best performance. PMID:29423453

  18. Research on application of LADAR in ground vehicle recognition

    NASA Astrophysics Data System (ADS)

    Lan, Jinhui; Shen, Zhuoxun

    2009-11-01

    For the requirement of many practical applications in the field of military, the research of 3D target recognition is active. The representation that captures the salient attributes of a 3D target independent of the viewing angle will be especially useful to the automatic 3D target recognition system. This paper presents a new approach of image generation based on Laser Detection and Ranging (LADAR) data. Range image of target is obtained by transformation of point cloud. In order to extract features of different ground vehicle targets and to recognize targets, zernike moment properties of typical ground vehicle targets are researched in this paper. A technique of support vector machine is applied to the classification and recognition of target. The new method of image generation and feature representation has been applied to the outdoor experiments. Through outdoor experiments, it can be proven that the method of image generation is stability, the moments are effective to be used as features for recognition, and the LADAR can be applied to the field of 3D target recognition.

  19. Speech Clarity Index (Ψ): A Distance-Based Speech Quality Indicator and Recognition Rate Prediction for Dysarthric Speakers with Cerebral Palsy

    NASA Astrophysics Data System (ADS)

    Kayasith, Prakasith; Theeramunkong, Thanaruk

    It is a tedious and subjective task to measure severity of a dysarthria by manually evaluating his/her speech using available standard assessment methods based on human perception. This paper presents an automated approach to assess speech quality of a dysarthric speaker with cerebral palsy. With the consideration of two complementary factors, speech consistency and speech distinction, a speech quality indicator called speech clarity index (Ψ) is proposed as a measure of the speaker's ability to produce consistent speech signal for a certain word and distinguished speech signal for different words. As an application, it can be used to assess speech quality and forecast speech recognition rate of speech made by an individual dysarthric speaker before actual exhaustive implementation of an automatic speech recognition system for the speaker. The effectiveness of Ψ as a speech recognition rate predictor is evaluated by rank-order inconsistency, correlation coefficient, and root-mean-square of difference. The evaluations had been done by comparing its predicted recognition rates with ones predicted by the standard methods called the articulatory and intelligibility tests based on the two recognition systems (HMM and ANN). The results show that Ψ is a promising indicator for predicting recognition rate of dysarthric speech. All experiments had been done on speech corpus composed of speech data from eight normal speakers and eight dysarthric speakers.

  20. Detecting buried explosive hazards with handheld GPR and deep learning

    NASA Astrophysics Data System (ADS)

    Besaw, Lance E.

    2016-05-01

    Buried explosive hazards (BEHs), including traditional landmines and homemade improvised explosives, have proven difficult to detect and defeat during and after conflicts around the world. Despite their various sizes, shapes and construction material, ground penetrating radar (GPR) is an excellent phenomenology for detecting BEHs due to its ability to sense localized differences in electromagnetic properties. Handheld GPR detectors are common equipment for detecting BEHs because of their flexibility (in part due to the human operator) and effectiveness in cluttered environments. With modern digital electronics and positioning systems, handheld GPR sensors can sense and map variation in electromagnetic properties while searching for BEHs. Additionally, large-scale computers have demonstrated an insatiable appetite for ingesting massive datasets and extracting meaningful relationships. This is no more evident than the maturation of deep learning artificial neural networks (ANNs) for image and speech recognition now commonplace in industry and academia. This confluence of sensing, computing and pattern recognition technologies offers great potential to develop automatic target recognition techniques to assist GPR operators searching for BEHs. In this work deep learning ANNs are used to detect BEHs and discriminate them from harmless clutter. We apply these techniques to a multi-antennae, handheld GPR with centimeter-accurate positioning system that was used to collect data over prepared lanes containing a wide range of BEHs. This work demonstrates that deep learning ANNs can automatically extract meaningful information from complex GPR signatures, complementing existing GPR anomaly detection and classification techniques.

  1. Photonic correlator pattern recognition: Application to autonomous docking

    NASA Technical Reports Server (NTRS)

    Sjolander, Gary W.

    1991-01-01

    Optical correlators for real-time automatic pattern recognition applications have recently become feasible due to advances in high speed devices and filter formulation concepts. The devices are discussed in the context of their use in autonomous docking.

  2. Review of Medical Image Classification using the Adaptive Neuro-Fuzzy Inference System

    PubMed Central

    Hosseini, Monireh Sheikh; Zekri, Maryam

    2012-01-01

    Image classification is an issue that utilizes image processing, pattern recognition and classification methods. Automatic medical image classification is a progressive area in image classification, and it is expected to be more developed in the future. Because of this fact, automatic diagnosis can assist pathologists by providing second opinions and reducing their workload. This paper reviews the application of the adaptive neuro-fuzzy inference system (ANFIS) as a classifier in medical image classification during the past 16 years. ANFIS is a fuzzy inference system (FIS) implemented in the framework of an adaptive fuzzy neural network. It combines the explicit knowledge representation of an FIS with the learning power of artificial neural networks. The objective of ANFIS is to integrate the best features of fuzzy systems and neural networks. A brief comparison with other classifiers, main advantages and drawbacks of this classifier are investigated. PMID:23493054

  3. Social cognition in schizophrenia and healthy aging: differences and similarities.

    PubMed

    Silver, Henry; Bilker, Warren B

    2014-12-01

    Social cognition is impaired in schizophrenia but it is not clear whether this is specific for the illness and whether emotion perception is selectively affected. To study this we examined the perception of emotional and non-emotional clues in facial expressions, a key social cognitive skill, in schizophrenia patients and old healthy individuals using young healthy individuals as reference. Tests of object recognition, visual orientation, psychomotor speed, and working memory were included to allow multivariate analysis taking into account other cognitive functions Schizophrenia patients showed impairments in recognition of identity and emotional facial clues compared to young and old healthy groups. Severity was similar to that for object recognition and visuospatial processing. Older and younger healthy groups did not differ from each other on these tests. Schizophrenia patients and old healthy individuals were similarly impaired in the ability to automatically learn new faces during the testing procedure (measured by the CSTFAC index) compared to young healthy individuals. Social cognition is distinctly impaired in schizophrenia compared to healthy aging. Further study is needed to identify the mechanisms of automatic social cognitive learning impairment in schizophrenia patients and healthy aging individuals and determine whether similar neural systems are affected. Copyright © 2014 Elsevier B.V. All rights reserved.

  4. Separable spectro-temporal Gabor filter bank features: Reducing the complexity of robust features for automatic speech recognition.

    PubMed

    Schädler, Marc René; Kollmeier, Birger

    2015-04-01

    To test if simultaneous spectral and temporal processing is required to extract robust features for automatic speech recognition (ASR), the robust spectro-temporal two-dimensional-Gabor filter bank (GBFB) front-end from Schädler, Meyer, and Kollmeier [J. Acoust. Soc. Am. 131, 4134-4151 (2012)] was de-composed into a spectral one-dimensional-Gabor filter bank and a temporal one-dimensional-Gabor filter bank. A feature set that is extracted with these separate spectral and temporal modulation filter banks was introduced, the separate Gabor filter bank (SGBFB) features, and evaluated on the CHiME (Computational Hearing in Multisource Environments) keywords-in-noise recognition task. From the perspective of robust ASR, the results showed that spectral and temporal processing can be performed independently and are not required to interact with each other. Using SGBFB features permitted the signal-to-noise ratio (SNR) to be lowered by 1.2 dB while still performing as well as the GBFB-based reference system, which corresponds to a relative improvement of the word error rate by 12.8%. Additionally, the real time factor of the spectro-temporal processing could be reduced by more than an order of magnitude. Compared to human listeners, the SNR needed to be 13 dB higher when using Mel-frequency cepstral coefficient features, 11 dB higher when using GBFB features, and 9 dB higher when using SGBFB features to achieve the same recognition performance.

  5. Automatic cloud tracking applied to GOES and Meteosat observations

    NASA Technical Reports Server (NTRS)

    Endlich, R. M.; Wolf, D. E.

    1981-01-01

    An improved automatic processing method for the tracking of cloud motions as revealed by satellite imagery is presented and applications of the method to GOES observations of Hurricane Eloise and Meteosat water vapor and infrared data are presented. The method is shown to involve steps of picture smoothing, target selection and the calculation of cloud motion vectors by the matching of a group at a given time with its best likeness at a later time, or by a cross-correlation computation. Cloud motion computations can be made in as many as four separate layers simultaneously. For data of 4 and 8 km resolution in the eye of Hurricane Eloise, the automatic system is found to provide results comparable in accuracy and coverage to those obtained by NASA analysts using the Atmospheric and Oceanographic Information Processing System, with results obtained by the pattern recognition and cross correlation computations differing by only fractions of a pixel. For Meteosat water vapor data from the tropics and midlatitudes, the automatic motion computations are found to be reliable only in areas where the water vapor fields contained small-scale structure, although excellent results are obtained using Meteosat IR data in the same regions. The automatic method thus appears to be competitive in accuracy and coverage with motion determination by human analysts.

  6. Real-time color/shape-based traffic signs acquisition and recognition system

    NASA Astrophysics Data System (ADS)

    Saponara, Sergio

    2013-02-01

    A real-time system is proposed to acquire from an automotive fish-eye CMOS camera the traffic signs, and provide their automatic recognition on the vehicle network. Differently from the state-of-the-art, in this work color-detection is addressed exploiting the HSI color space which is robust to lighting changes. Hence the first stage of the processing system implements fish-eye correction and RGB to HSI transformation. After color-based detection a noise deletion step is implemented and then, for the classification, a template-based correlation method is adopted to identify potential traffic signs, of different shapes, from acquired images. Starting from a segmented-image a matching with templates of the searched signs is carried out using a distance transform. These templates are organized hierarchically to reduce the number of operations and hence easing real-time processing for several types of traffic signs. Finally, for the recognition of the specific traffic sign, a technique based on extraction of signs characteristics and thresholding is adopted. Implemented on DSP platform the system recognizes traffic signs in less than 150 ms at a distance of about 15 meters from 640x480-pixel acquired images. Tests carried out with hundreds of images show a detection and recognition rate of about 93%.

  7. Automatic speech recognition using a predictive echo state network classifier.

    PubMed

    Skowronski, Mark D; Harris, John G

    2007-04-01

    We have combined an echo state network (ESN) with a competitive state machine framework to create a classification engine called the predictive ESN classifier. We derive the expressions for training the predictive ESN classifier and show that the model was significantly more noise robust compared to a hidden Markov model in noisy speech classification experiments by 8+/-1 dB signal-to-noise ratio. The simple training algorithm and noise robustness of the predictive ESN classifier make it an attractive classification engine for automatic speech recognition.

  8. TeraSCREEN: multi-frequency multi-mode Terahertz screening for border checks

    NASA Astrophysics Data System (ADS)

    Alexander, Naomi E.; Alderman, Byron; Allona, Fernando; Frijlink, Peter; Gonzalo, Ramón; Hägelen, Manfred; Ibáñez, Asier; Krozer, Viktor; Langford, Marian L.; Limiti, Ernesto; Platt, Duncan; Schikora, Marek; Wang, Hui; Weber, Marc Andree

    2014-06-01

    The challenge for any security screening system is to identify potentially harmful objects such as weapons and explosives concealed under clothing. Classical border and security checkpoints are no longer capable of fulfilling the demands of today's ever growing security requirements, especially with respect to the high throughput generally required which entails a high detection rate of threat material and a low false alarm rate. TeraSCREEN proposes to develop an innovative concept of multi-frequency multi-mode Terahertz and millimeter-wave detection with new automatic detection and classification functionalities. The system developed will demonstrate, at a live control point, the safe automatic detection and classification of objects concealed under clothing, whilst respecting privacy and increasing current throughput rates. This innovative screening system will combine multi-frequency, multi-mode images taken by passive and active subsystems which will scan the subjects and obtain complementary spatial and spectral information, thus allowing for automatic threat recognition. The TeraSCREEN project, which will run from 2013 to 2016, has received funding from the European Union's Seventh Framework Programme under the Security Call. This paper will describe the project objectives and approach.

  9. Data handling and analysis for the 1971 corn blight watch experiment.

    NASA Technical Reports Server (NTRS)

    Anuta, P. E.; Phillips, T. L.; Landgrebe, D. A.

    1972-01-01

    Review of the data handling and analysis methods used in the near-operational test of remote sensing systems provided by the 1971 corn blight watch experiment. The general data analysis techniques and, particularly, the statistical multispectral pattern recognition methods for automatic computer analysis of aircraft scanner data are described. Some of the results obtained are examined, and the implications of the experiment for future data communication requirements of earth resource survey systems are discussed.

  10. Digital-Electronic/Optical Apparatus Would Recognize Targets

    NASA Technical Reports Server (NTRS)

    Scholl, Marija S.

    1994-01-01

    Proposed automatic target-recognition apparatus consists mostly of digital-electronic/optical cross-correlator that processes infrared images of targets. Infrared images of unknown targets correlated quickly with images of known targets. Apparatus incorporates some features of correlator described in "Prototype Optical Correlator for Robotic Vision System" (NPO-18451), and some of correlator described in "Compact Optical Correlator" (NPO-18473). Useful in robotic system; to recognize and track infrared-emitting, moving objects as variously shaped hot workpieces on conveyor belt.

  11. Robust parameterization of time-frequency characteristics for recognition of musical genres of Mexican culture

    NASA Astrophysics Data System (ADS)

    Pérez Rosas, Osvaldo G.; Rivera Martínez, José L.; Maldonado Cano, Luis A.; López Rodríguez, Mario; Amaya Reyes, Laura M.; Cano Martínez, Elizabeth; García Vázquez, Mireya S.; Ramírez Acosta, Alejandro A.

    2017-09-01

    The automatic identification and classification of musical genres based on the sound similarities to form musical textures, it is a very active investigation area. In this context it has been created recognition systems of musical genres, formed by time-frequency characteristics extraction methods and by classification methods. The selection of this methods are important for a good development in the recognition systems. In this article they are proposed the Mel-Frequency Cepstral Coefficients (MFCC) methods as a characteristic extractor and Support Vector Machines (SVM) as a classifier for our system. The stablished parameters of the MFCC method in the system by our time-frequency analysis, represents the gamma of Mexican culture musical genres in this article. For the precision of a classification system of musical genres it is necessary that the descriptors represent the correct spectrum of each gender; to achieve this we must realize a correct parametrization of the MFCC like the one we present in this article. With the system developed we get satisfactory detection results, where the least identification percentage of musical genres was 66.67% and the one with the most precision was 100%.

  12. Optical implementation of a feature-based neural network with application to automatic target recognition

    NASA Technical Reports Server (NTRS)

    Chao, Tien-Hsin; Stoner, William W.

    1993-01-01

    An optical neural network based on the neocognitron paradigm is introduced. A novel aspect of the architecture design is shift-invariant multichannel Fourier optical correlation within each processing layer. Multilayer processing is achieved by feeding back the ouput of the feature correlator interatively to the input spatial light modulator and by updating the Fourier filters. By training the neural net with characteristic features extracted from the target images, successful pattern recognition with intraclass fault tolerance and interclass discrimination is achieved. A detailed system description is provided. Experimental demonstrations of a two-layer neural network for space-object discrimination is also presented.

  13. Automatic target recognition using a feature-based optical neural network

    NASA Technical Reports Server (NTRS)

    Chao, Tien-Hsin

    1992-01-01

    An optical neural network based upon the Neocognitron paradigm (K. Fukushima et al. 1983) is introduced. A novel aspect of the architectural design is shift-invariant multichannel Fourier optical correlation within each processing layer. Multilayer processing is achieved by iteratively feeding back the output of the feature correlator to the input spatial light modulator and updating the Fourier filters. By training the neural net with characteristic features extracted from the target images, successful pattern recognition with intra-class fault tolerance and inter-class discrimination is achieved. A detailed system description is provided. Experimental demonstration of a two-layer neural network for space objects discrimination is also presented.

  14. Testing Saliency Parameters for Automatic Target Recognition

    NASA Technical Reports Server (NTRS)

    Pandya, Sagar

    2012-01-01

    A bottom-up visual attention model (the saliency model) is tested to enhance the performance of Automated Target Recognition (ATR). JPL has developed an ATR system that identifies regions of interest (ROI) using a trained OT-MACH filter, and then classifies potential targets as true- or false-positives using machine-learning techniques. In this project, saliency is used as a pre-processing step to reduce the space for performing OT-MACH filtering. Saliency parameters, such as output level and orientation weight, are tuned to detect known target features. Preliminary results are promising and future work entails a rigrous and parameter-based search to gain maximum insight about this method.

  15. Speaker emotion recognition: from classical classifiers to deep neural networks

    NASA Astrophysics Data System (ADS)

    Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri

    2018-04-01

    Speaker emotion recognition is considered among the most challenging tasks in recent years. In fact, automatic systems for security, medicine or education can be improved when considering the speech affective state. In this paper, a twofold approach for speech emotion classification is proposed. At the first side, a relevant set of features is adopted, and then at the second one, numerous supervised training techniques, involving classic methods as well as deep learning, are experimented. Experimental results indicate that deep architecture can improve classification performance on two affective databases, the Berlin Dataset of Emotional Speech and the SAVEE Dataset Surrey Audio-Visual Expressed Emotion.

  16. System for analysis of LANDSAT agricultural data: Automatic computer-assisted proportion estimation of local areas

    NASA Technical Reports Server (NTRS)

    Nalepka, R. F. (Principal Investigator); Kauth, R. J.; Thomas, G. S.

    1976-01-01

    The author has identified the following significant results. A conceptual man machine system framework was created for a large scale agricultural remote sensing system. The system is based on and can grow out of the local recognition mode of LACIE, through a gradual transition wherein computer support functions supplement and replace AI functions. Local proportion estimation functions are broken into two broad classes: (1) organization of the data within the sample segment; and (2) identification of the fields or groups of fields in the sample segment.

  17. Thai Automatic Speech Recognition

    DTIC Science & Technology

    2005-01-01

    used in an external DARPA evaluation involving medical scenarios between an American Doctor and a naïve monolingual Thai patient. 2. Thai Language... dictionary generation more challenging, and (3) the lack of word segmentation, which calls for automatic segmentation approaches to make n-gram language...requires a dictionary and provides various segmentation algorithms to automatically select suitable segmentations. Here we used a maximal matching

  18. An algorithm for automatic target recognition using passive radar and an EKF for estimating aircraft orientation

    NASA Astrophysics Data System (ADS)

    Ehrman, Lisa M.

    2005-07-01

    Rather than emitting pulses, passive radar systems rely on "illuminators of opportunity," such as TV and FM radio, to illuminate potential targets. These systems are attractive since they allow receivers to operate without emitting energy, rendering them covert. Until recently, most of the research regarding passive radar has focused on detecting and tracking targets. This dissertation focuses on extending the capabilities of passive radar systems to include automatic target recognition. The target recognition algorithm described in this dissertation uses the radar cross section (RCS) of potential targets, collected over a short period of time, as the key information for target recognition. To make the simulated RCS as accurate as possible, the received signal model accounts for aircraft position and orientation, propagation losses, and antenna gain patterns. An extended Kalman filter (EKF) estimates the target's orientation (and uncertainty in the estimate) from velocity measurements obtained from the passive radar tracker. Coupling the aircraft orientation and state with the known antenna locations permits computation of the incident and observed azimuth and elevation angles. The Fast Illinois Solver Code (FISC) simulates the RCS of potential target classes as a function of these angles. Thus, the approximated incident and observed angles allow the appropriate RCS to be extracted from a database of FISC results. Using this process, the RCS of each aircraft in the target class is simulated as though each is executing the same maneuver as the target detected by the system. Two additional scaling processes are required to transform the RCS into a power profile (magnitude only) simulating the signal in the receiver. First, the RCS is scaled by the Advanced Refractive Effects Prediction System (AREPS) code to account for propagation losses that occur as functions of altitude and range. Then, the Numerical Electromagnetic Code (NEC2) computes the antenna gain pattern, further scaling the RCS. A Rician likelihood model compares the scaled RCS of the illuminated aircraft with those of the potential targets. To improve the robustness of the result, the algorithm jointly optimizes over feasible orientation profiles and target types via dynamic programming.

  19. Automatic updating and 3D modeling of airport information from high resolution images using GIS and LIDAR data

    NASA Astrophysics Data System (ADS)

    Lv, Zheng; Sui, Haigang; Zhang, Xilin; Huang, Xianfeng

    2007-11-01

    As one of the most important geo-spatial objects and military establishment, airport is always a key target in fields of transportation and military affairs. Therefore, automatic recognition and extraction of airport from remote sensing images is very important and urgent for updating of civil aviation and military application. In this paper, a new multi-source data fusion approach on automatic airport information extraction, updating and 3D modeling is addressed. Corresponding key technologies including feature extraction of airport information based on a modified Ostu algorithm, automatic change detection based on new parallel lines-based buffer detection algorithm, 3D modeling based on gradual elimination of non-building points algorithm, 3D change detecting between old airport model and LIDAR data, typical CAD models imported and so on are discussed in detail. At last, based on these technologies, we develop a prototype system and the results show our method can achieve good effects.

  20. [Research on automatic external defibrillator based on DSP].

    PubMed

    Jing, Jun; Ding, Jingyan; Zhang, Wei; Hong, Wenxue

    2012-10-01

    Electrical defibrillation is the most effective way to treat the ventricular tachycardia (VT) and ventricular fibrillation (VF). An automatic external defibrillator based on DSP is introduced in this paper. The whole design consists of the signal collection module, the microprocessor controlingl module, the display module, the defibrillation module and the automatic recognition algorithm for VF and non VF, etc. This automatic external defibrillator has achieved goals such as ECG signal real-time acquisition, ECG wave synchronous display, data delivering to U disk and automatic defibrillate when shockable rhythm appears, etc.

  1. Popular song and lyrics synchronization and its application to music information retrieval

    NASA Astrophysics Data System (ADS)

    Chen, Kai; Gao, Sheng; Zhu, Yongwei; Sun, Qibin

    2006-01-01

    An automatic synchronization system of the popular song and its lyrics is presented in the paper. The system includes two main components: a) automatically detecting vocal/non-vocal in the audio signal and b) automatically aligning the acoustic signal of the song with its lyric using speech recognition techniques and positioning the boundaries of the lyrics in its acoustic realization at the multiple levels simultaneously (e.g. the word / syllable level and phrase level). The GMM models and a set of HMM-based acoustic model units are carefully designed and trained for the detection and alignment. To eliminate the severe mismatch due to the diversity of musical signal and sparse training data available, the unsupervised adaptation technique such as maximum likelihood linear regression (MLLR) is exploited for tailoring the models to the real environment, which improves robustness of the synchronization system. To further reduce the effect of the missed non-vocal music on alignment, a novel grammar net is build to direct the alignment. As we know, this is the first automatic synchronization system only based on the low-level acoustic feature such as MFCC. We evaluate the system on a Chinese song dataset collecting from 3 popular singers. We obtain 76.1% for the boundary accuracy at the syllable level (BAS) and 81.5% for the boundary accuracy at the phrase level (BAP) using fully automatic vocal/non-vocal detection and alignment. The synchronization system has many applications such as multi-modality (audio and textual) content-based popular song browsing and retrieval. Through the study, we would like to open up the discussion of some challenging problems when developing a robust synchronization system for largescale database.

  2. Street curb recognition in 3d point cloud data using morphological operations

    NASA Astrophysics Data System (ADS)

    Rodríguez-Cuenca, Borja; Concepción Alonso-Rodríguez, María; García-Cortés, Silverio; Ordóñez, Celestino

    2015-04-01

    Accurate and automatic detection of cartographic-entities saves a great deal of time and money when creating and updating cartographic databases. The current trend in remote sensing feature extraction is to develop methods that are as automatic as possible. The aim is to develop algorithms that can obtain accurate results with the least possible human intervention in the process. Non-manual curb detection is an important issue in road maintenance, 3D urban modeling, and autonomous navigation fields. This paper is focused on the semi-automatic recognition of curbs and street boundaries using a 3D point cloud registered by a mobile laser scanner (MLS) system. This work is divided into four steps. First, a coordinate system transformation is carried out, moving from a global coordinate system to a local one. After that and in order to simplify the calculations involved in the procedure, a rasterization based on the projection of the measured point cloud on the XY plane was carried out, passing from the 3D original data to a 2D image. To determine the location of curbs in the image, different image processing techniques such as thresholding and morphological operations were applied. Finally, the upper and lower edges of curbs are detected by an unsupervised classification algorithm on the curvature and roughness of the points that represent curbs. The proposed method is valid in both straight and curved road sections and applicable both to laser scanner and stereo vision 3D data due to the independence of its scanning geometry. This method has been successfully tested with two datasets measured by different sensors. The first dataset corresponds to a point cloud measured by a TOPCON sensor in the Spanish town of Cudillero. That point cloud comprises more than 6,000,000 points and covers a 400-meter street. The second dataset corresponds to a point cloud measured by a RIEGL sensor in the Austrian town of Horn. That point cloud comprises 8,000,000 points and represents a 160-meter street. The proposed method provides success rates in curb recognition of over 85% in both datasets.

  3. 3D automatic anatomy segmentation based on iterative graph-cut-ASM

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Xinjian; Bagci, Ulas; Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, Building 10 Room 1C515, Bethesda, Maryland 20892-1182

    2011-08-15

    Purpose: This paper studies the feasibility of developing an automatic anatomy segmentation (AAS) system in clinical radiology and demonstrates its operation on clinical 3D images. Methods: The AAS system, the authors are developing consists of two main parts: object recognition and object delineation. As for recognition, a hierarchical 3D scale-based multiobject method is used for the multiobject recognition task, which incorporates intensity weighted ball-scale (b-scale) information into the active shape model (ASM). For object delineation, an iterative graph-cut-ASM (IGCASM) algorithm is proposed, which effectively combines the rich statistical shape information embodied in ASM with the globally optimal delineation capability ofmore » the GC method. The presented IGCASM algorithm is a 3D generalization of the 2D GC-ASM method that they proposed previously in Chen et al.[Proc. SPIE, 7259, 72590C1-72590C-8 (2009)]. The proposed methods are tested on two datasets comprised of images obtained from 20 patients (10 male and 10 female) of clinical abdominal CT scans, and 11 foot magnetic resonance imaging (MRI) scans. The test is for four organs (liver, left and right kidneys, and spleen) segmentation, five foot bones (calcaneus, tibia, cuboid, talus, and navicular). The recognition and delineation accuracies were evaluated separately. The recognition accuracy was evaluated in terms of translation, rotation, and scale (size) error. The delineation accuracy was evaluated in terms of true and false positive volume fractions (TPVF, FPVF). The efficiency of the delineation method was also evaluated on an Intel Pentium IV PC with a 3.4 GHZ CPU machine. Results: The recognition accuracies in terms of translation, rotation, and scale error over all organs are about 8 mm, 10 deg. and 0.03, and over all foot bones are about 3.5709 mm, 0.35 deg. and 0.025, respectively. The accuracy of delineation over all organs for all subjects as expressed in TPVF and FPVF is 93.01% and 0.22%, and all foot bones for all subjects are 93.75% and 0.28%, respectively. While the delineations for the four organs can be accomplished quite rapidly with average of 78 s, the delineations for the five foot bones can be accomplished with average of 70 s. Conclusions: The experimental results showed the feasibility and efficacy of the proposed automatic anatomy segmentation system: (a) the incorporation of shape priors into the GC framework is feasible in 3D as demonstrated previously for 2D images; (b) our results in 3D confirm the accuracy behavior observed in 2D. The hybrid strategy IGCASM seems to be more robust and accurate than ASM and GC individually; and (c) delineations within body regions and foot bones of clinical importance can be accomplished quite rapidly within 1.5 min.« less

  4. 3D automatic anatomy segmentation based on iterative graph-cut-ASM

    PubMed Central

    Chen, Xinjian; Bagci, Ulas

    2011-01-01

    Purpose: This paper studies the feasibility of developing an automatic anatomy segmentation (AAS) system in clinical radiology and demonstrates its operation on clinical 3D images.Methods: The AAS system, the authors are developing consists of two main parts: object recognition and object delineation. As for recognition, a hierarchical 3D scale-based multiobject method is used for the multiobject recognition task, which incorporates intensity weighted ball-scale (b-scale) information into the active shape model (ASM). For object delineation, an iterative graph-cut-ASM (IGCASM) algorithm is proposed, which effectively combines the rich statistical shape information embodied in ASM with the globally optimal delineation capability of the GC method. The presented IGCASM algorithm is a 3D generalization of the 2D GC-ASM method that they proposed previously in Chen et al. [Proc. SPIE, 7259, 72590C1–72590C-8 (2009)]. The proposed methods are tested on two datasets comprised of images obtained from 20 patients (10 male and 10 female) of clinical abdominal CT scans, and 11 foot magnetic resonance imaging (MRI) scans. The test is for four organs (liver, left and right kidneys, and spleen) segmentation, five foot bones (calcaneus, tibia, cuboid, talus, and navicular). The recognition and delineation accuracies were evaluated separately. The recognition accuracy was evaluated in terms of translation, rotation, and scale (size) error. The delineation accuracy was evaluated in terms of true and false positive volume fractions (TPVF, FPVF). The efficiency of the delineation method was also evaluated on an Intel Pentium IV PC with a 3.4 GHZ CPU machine.Results: The recognition accuracies in terms of translation, rotation, and scale error over all organs are about 8 mm, 10° and 0.03, and over all foot bones are about 3.5709 mm, 0.35° and 0.025, respectively. The accuracy of delineation over all organs for all subjects as expressed in TPVF and FPVF is 93.01% and 0.22%, and all foot bones for all subjects are 93.75% and 0.28%, respectively. While the delineations for the four organs can be accomplished quite rapidly with average of 78 s, the delineations for the five foot bones can be accomplished with average of 70 s.Conclusions: The experimental results showed the feasibility and efficacy of the proposed automatic anatomy segmentation system: (a) the incorporation of shape priors into the GC framework is feasible in 3D as demonstrated previously for 2D images; (b) our results in 3D confirm the accuracy behavior observed in 2D. The hybrid strategy IGCASM seems to be more robust and accurate than ASM and GC individually; and (c) delineations within body regions and foot bones of clinical importance can be accomplished quite rapidly within 1.5 min. PMID:21928634

  5. Fully automatic segmentation of the femur from 3D-CT images using primitive shape recognition and statistical shape models.

    PubMed

    Ben Younes, Lassad; Nakajima, Yoshikazu; Saito, Toki

    2014-03-01

    Femur segmentation is well established and widely used in computer-assisted orthopedic surgery. However, most of the robust segmentation methods such as statistical shape models (SSM) require human intervention to provide an initial position for the SSM. In this paper, we propose to overcome this problem and provide a fully automatic femur segmentation method for CT images based on primitive shape recognition and SSM. Femur segmentation in CT scans was performed using primitive shape recognition based on a robust algorithm such as the Hough transform and RANdom SAmple Consensus. The proposed method is divided into 3 steps: (1) detection of the femoral head as sphere and the femoral shaft as cylinder in the SSM and the CT images, (2) rigid registration between primitives of SSM and CT image to initialize the SSM into the CT image, and (3) fitting of the SSM to the CT image edge using an affine transformation followed by a nonlinear fitting. The automated method provided good results even with a high number of outliers. The difference of segmentation error between the proposed automatic initialization method and a manual initialization method is less than 1 mm. The proposed method detects primitive shape position to initialize the SSM into the target image. Based on primitive shapes, this method overcomes the problem of inter-patient variability. Moreover, the results demonstrate that our method of primitive shape recognition can be used for 3D SSM initialization to achieve fully automatic segmentation of the femur.

  6. Autonomous target recognition using remotely sensed surface vibration measurements

    NASA Astrophysics Data System (ADS)

    Geurts, James; Ruck, Dennis W.; Rogers, Steven K.; Oxley, Mark E.; Barr, Dallas N.

    1993-09-01

    The remotely measured surface vibration signatures of tactical military ground vehicles are investigated for use in target classification and identification friend or foe (IFF) systems. The use of remote surface vibration sensing by a laser radar reduces the effects of partial occlusion, concealment, and camouflage experienced by automatic target recognition systems using traditional imagery in a tactical battlefield environment. Linear Predictive Coding (LPC) efficiently represents the vibration signatures and nearest neighbor classifiers exploit the LPC feature set using a variety of distortion metrics. Nearest neighbor classifiers achieve an 88 percent classification rate in an eight class problem, representing a classification performance increase of thirty percent from previous efforts. A novel confidence figure of merit is implemented to attain a 100 percent classification rate with less than 60 percent rejection. The high classification rates are achieved on a target set which would pose significant problems to traditional image-based recognition systems. The targets are presented to the sensor in a variety of aspects and engine speeds at a range of 1 kilometer. The classification rates achieved demonstrate the benefits of using remote vibration measurement in a ground IFF system. The signature modeling and classification system can also be used to identify rotary and fixed-wing targets.

  7. Suspicious activity recognition in infrared imagery using Hidden Conditional Random Fields for outdoor perimeter surveillance

    NASA Astrophysics Data System (ADS)

    Rogotis, Savvas; Ioannidis, Dimosthenis; Tzovaras, Dimitrios; Likothanassis, Spiros

    2015-04-01

    The aim of this work is to present a novel approach for automatic recognition of suspicious activities in outdoor perimeter surveillance systems based on infrared video processing. Through the combination of size, speed and appearance based features, like the Center-Symmetric Local Binary Patterns, short-term actions are identified and serve as input, along with user location, for modeling target activities using the theory of Hidden Conditional Random Fields. HCRFs are used to directly link a set of observations to the most appropriate activity label and as such to discriminate high risk activities (e.g. trespassing) from zero risk activities (e.g loitering outside the perimeter). Experimental results demonstrate the effectiveness of our approach in identifying suspicious activities for video surveillance systems.

  8. Identifying images of handwritten digits using deep learning in H2O

    NASA Astrophysics Data System (ADS)

    Sadhasivam, Jayakumar; Charanya, R.; Kumar, S. Harish; Srinivasan, A.

    2017-11-01

    Automatic digit recognition is of popular interest today. Deep learning techniques make it possible for object recognition in image data. Perceiving the digit has turned into a fundamental part as far as certifiable applications. Since, digits are composed in various styles in this way to distinguish the digit it is important to perceive and arrange it with the assistance of machine learning methods. This exploration depends on supervised learning vector quantization neural system arranged under counterfeit artificial neural network. The pictures of digits are perceived, prepared and tried. After the system is made digits are prepared utilizing preparing dataset vectors and testing is connected to the pictures of digits which are separated to each other by fragmenting the picture and resizing the digit picture as needs be for better precision.

  9. The influence of age, hearing, and working memory on the speech comprehension benefit derived from an automatic speech recognition system.

    PubMed

    Zekveld, Adriana A; Kramer, Sophia E; Kessens, Judith M; Vlaming, Marcel S M G; Houtgast, Tammo

    2009-04-01

    The aim of the current study was to examine whether partly incorrect subtitles that are automatically generated by an Automatic Speech Recognition (ASR) system, improve speech comprehension by listeners with hearing impairment. In an earlier study (Zekveld et al. 2008), we showed that speech comprehension in noise by young listeners with normal hearing improves when presenting partly incorrect, automatically generated subtitles. The current study focused on the effects of age, hearing loss, visual working memory capacity, and linguistic skills on the benefit obtained from automatically generated subtitles during listening to speech in noise. In order to investigate the effects of age and hearing loss, three groups of participants were included: 22 young persons with normal hearing (YNH, mean age = 21 years), 22 middle-aged adults with normal hearing (MA-NH, mean age = 55 years) and 30 middle-aged adults with hearing impairment (MA-HI, mean age = 57 years). The benefit from automatic subtitling was measured by Speech Reception Threshold (SRT) tests (Plomp & Mimpen, 1979). Both unimodal auditory and bimodal audiovisual SRT tests were performed. In the audiovisual tests, the subtitles were presented simultaneously with the speech, whereas in the auditory test, only speech was presented. The difference between the auditory and audiovisual SRT was defined as the audiovisual benefit. Participants additionally rated the listening effort. We examined the influences of ASR accuracy level and text delay on the audiovisual benefit and the listening effort using a repeated measures General Linear Model analysis. In a correlation analysis, we evaluated the relationships between age, auditory SRT, visual working memory capacity and the audiovisual benefit and listening effort. The automatically generated subtitles improved speech comprehension in noise for all ASR accuracies and delays covered by the current study. Higher ASR accuracy levels resulted in more benefit obtained from the subtitles. Speech comprehension improved even for relatively low ASR accuracy levels; for example, participants obtained about 2 dB SNR audiovisual benefit for ASR accuracies around 74%. Delaying the presentation of the text reduced the benefit and increased the listening effort. Participants with relatively low unimodal speech comprehension obtained greater benefit from the subtitles than participants with better unimodal speech comprehension. We observed an age-related decline in the working-memory capacity of the listeners with normal hearing. A higher age and a lower working memory capacity were associated with increased effort required to use the subtitles to improve speech comprehension. Participants were able to use partly incorrect and delayed subtitles to increase their comprehension of speech in noise, regardless of age and hearing loss. This supports the further development and evaluation of an assistive listening system that displays automatically recognized speech to aid speech comprehension by listeners with hearing impairment.

  10. Three-dimensional model-based object recognition and segmentation in cluttered scenes.

    PubMed

    Mian, Ajmal S; Bennamoun, Mohammed; Owens, Robyn

    2006-10-01

    Viewpoint independent recognition of free-form objects and their segmentation in the presence of clutter and occlusions is a challenging task. We present a novel 3D model-based algorithm which performs this task automatically and efficiently. A 3D model of an object is automatically constructed offline from its multiple unordered range images (views). These views are converted into multidimensional table representations (which we refer to as tensors). Correspondences are automatically established between these views by simultaneously matching the tensors of a view with those of the remaining views using a hash table-based voting scheme. This results in a graph of relative transformations used to register the views before they are integrated into a seamless 3D model. These models and their tensor representations constitute the model library. During online recognition, a tensor from the scene is simultaneously matched with those in the library by casting votes. Similarity measures are calculated for the model tensors which receive the most votes. The model with the highest similarity is transformed to the scene and, if it aligns accurately with an object in the scene, that object is declared as recognized and is segmented. This process is repeated until the scene is completely segmented. Experiments were performed on real and synthetic data comprised of 55 models and 610 scenes and an overall recognition rate of 95 percent was achieved. Comparison with the spin images revealed that our algorithm is superior in terms of recognition rate and efficiency.

  11. Complex Event Recognition Architecture

    NASA Technical Reports Server (NTRS)

    Fitzgerald, William A.; Firby, R. James

    2009-01-01

    Complex Event Recognition Architecture (CERA) is the name of a computational architecture, and software that implements the architecture, for recognizing complex event patterns that may be spread across multiple streams of input data. One of the main components of CERA is an intuitive event pattern language that simplifies what would otherwise be the complex, difficult tasks of creating logical descriptions of combinations of temporal events and defining rules for combining information from different sources over time. In this language, recognition patterns are defined in simple, declarative statements that combine point events from given input streams with those from other streams, using conjunction, disjunction, and negation. Patterns can be built on one another recursively to describe very rich, temporally extended combinations of events. Thereafter, a run-time matching algorithm in CERA efficiently matches these patterns against input data and signals when patterns are recognized. CERA can be used to monitor complex systems and to signal operators or initiate corrective actions when anomalous conditions are recognized. CERA can be run as a stand-alone monitoring system, or it can be integrated into a larger system to automatically trigger responses to changing environments or problematic situations.

  12. Low-cost and high-speed optical mark reader based on an intelligent line camera

    NASA Astrophysics Data System (ADS)

    Hussmann, Stephan; Chan, Leona; Fung, Celine; Albrecht, Martin

    2003-08-01

    Optical Mark Recognition (OMR) is thoroughly reliable and highly efficient provided that high standards are maintained at both the planning and implementation stages. It is necessary to ensure that OMR forms are designed with due attention to data integrity checks, the best use is made of features built into the OMR, used data integrity is checked before the data is processed and data is validated before it is processed. This paper describes the design and implementation of an OMR prototype system for marking multiple-choice tests automatically. Parameter testing is carried out before the platform and the multiple-choice answer sheet has been designed. Position recognition and position verification methods have been developed and implemented in an intelligent line scan camera. The position recognition process is implemented into a Field Programmable Gate Array (FPGA), whereas the verification process is implemented into a micro-controller. The verified results are then sent to the Graphical User Interface (GUI) for answers checking and statistical analysis. At the end of the paper the proposed OMR system will be compared with commercially available system on the market.

  13. Semi-automated contour recognition using DICOMautomaton

    NASA Astrophysics Data System (ADS)

    Clark, H.; Wu, J.; Moiseenko, V.; Lee, R.; Gill, B.; Duzenli, C.; Thomas, S.

    2014-03-01

    Purpose: A system has been developed which recognizes and classifies Digital Imaging and Communication in Medicine contour data with minimal human intervention. It allows researchers to overcome obstacles which tax analysis and mining systems, including inconsistent naming conventions and differences in data age or resolution. Methods: Lexicographic and geometric analysis is used for recognition. Well-known lexicographic methods implemented include Levenshtein-Damerau, bag-of-characters, Double Metaphone, Soundex, and (word and character)-N-grams. Geometrical implementations include 3D Fourier Descriptors, probability spheres, boolean overlap, simple feature comparison (e.g. eccentricity, volume) and rule-based techniques. Both analyses implement custom, domain-specific modules (e.g. emphasis differentiating left/right organ variants). Contour labels from 60 head and neck patients are used for cross-validation. Results: Mixed-lexicographical methods show an effective improvement in more than 10% of recognition attempts compared with a pure Levenshtein-Damerau approach when withholding 70% of the lexicon. Domain-specific and geometrical techniques further boost performance. Conclusions: DICOMautomaton allows users to recognize contours semi-automatically. As usage increases and the lexicon is filled with additional structures, performance improves, increasing the overall utility of the system.

  14. Constant-Time Pattern Matching For Real-Time Production Systems

    NASA Astrophysics Data System (ADS)

    Parson, Dale E.; Blank, Glenn D.

    1989-03-01

    Many intelligent systems must respond to sensory data or critical environmental conditions in fixed, predictable time. Rule-based systems, including those based on the efficient Rete matching algorithm, cannot guarantee this result. Improvement in execution-time efficiency is not all that is needed here; it is important to ensure constant, 0(1) time limits for portions of the matching process. Our approach is inspired by two observations about human performance. First, cognitive psychologists distinguish between automatic and controlled processing. Analogously, we partition the matching process across two networks. The first is the automatic partition; it is characterized by predictable 0(1) time and space complexity, lack of persistent memory, and is reactive in nature. The second is the controlled partition; it includes the search-based goal-driven and data-driven processing typical of most production system programming. The former is responsible for recognition and response to critical environmental conditions. The latter is responsible for the more flexible problem-solving behaviors consistent with the notion of intelligence. Support for learning and refining the automatic partition can be placed in the controlled partition. Our second observation is that people are able to attend to more critical stimuli or requirements selectively. Our match algorithm uses priorities to focus matching. It compares priority of information during matching, rather than deferring this comparison until conflict resolution. Messages from the automatic partition are able to interrupt the controlled partition, enhancing system responsiveness. Our algorithm has numerous applications for systems that must exhibit time-constrained behavior.

  15. Tumor recognition in wireless capsule endoscopy images using textural features and SVM-based feature selection.

    PubMed

    Li, Baopu; Meng, Max Q-H

    2012-05-01

    Tumor in digestive tract is a common disease and wireless capsule endoscopy (WCE) is a relatively new technology to examine diseases for digestive tract especially for small intestine. This paper addresses the problem of automatic recognition of tumor for WCE images. Candidate color texture feature that integrates uniform local binary pattern and wavelet is proposed to characterize WCE images. The proposed features are invariant to illumination change and describe multiresolution characteristics of WCE images. Two feature selection approaches based on support vector machine, sequential forward floating selection and recursive feature elimination, are further employed to refine the proposed features for improving the detection accuracy. Extensive experiments validate that the proposed computer-aided diagnosis system achieves a promising tumor recognition accuracy of 92.4% in WCE images on our collected data.

  16. The CHEMDNER corpus of chemicals and drugs and its annotation principles.

    PubMed

    Krallinger, Martin; Rabal, Obdulia; Leitner, Florian; Vazquez, Miguel; Salgado, David; Lu, Zhiyong; Leaman, Robert; Lu, Yanan; Ji, Donghong; Lowe, Daniel M; Sayle, Roger A; Batista-Navarro, Riza Theresa; Rak, Rafal; Huber, Torsten; Rocktäschel, Tim; Matos, Sérgio; Campos, David; Tang, Buzhou; Xu, Hua; Munkhdalai, Tsendsuren; Ryu, Keun Ho; Ramanan, S V; Nathan, Senthil; Žitnik, Slavko; Bajec, Marko; Weber, Lutz; Irmer, Matthias; Akhondi, Saber A; Kors, Jan A; Xu, Shuo; An, Xin; Sikdar, Utpal Kumar; Ekbal, Asif; Yoshioka, Masaharu; Dieb, Thaer M; Choi, Miji; Verspoor, Karin; Khabsa, Madian; Giles, C Lee; Liu, Hongfang; Ravikumar, Komandur Elayavilli; Lamurias, Andre; Couto, Francisco M; Dai, Hong-Jie; Tsai, Richard Tzong-Han; Ata, Caglar; Can, Tolga; Usié, Anabel; Alves, Rui; Segura-Bedmar, Isabel; Martínez, Paloma; Oyarzabal, Julen; Valencia, Alfonso

    2015-01-01

    The automatic extraction of chemical information from text requires the recognition of chemical entity mentions as one of its key steps. When developing supervised named entity recognition (NER) systems, the availability of a large, manually annotated text corpus is desirable. Furthermore, large corpora permit the robust evaluation and comparison of different approaches that detect chemicals in documents. We present the CHEMDNER corpus, a collection of 10,000 PubMed abstracts that contain a total of 84,355 chemical entity mentions labeled manually by expert chemistry literature curators, following annotation guidelines specifically defined for this task. The abstracts of the CHEMDNER corpus were selected to be representative for all major chemical disciplines. Each of the chemical entity mentions was manually labeled according to its structure-associated chemical entity mention (SACEM) class: abbreviation, family, formula, identifier, multiple, systematic and trivial. The difficulty and consistency of tagging chemicals in text was measured using an agreement study between annotators, obtaining a percentage agreement of 91. For a subset of the CHEMDNER corpus (the test set of 3,000 abstracts) we provide not only the Gold Standard manual annotations, but also mentions automatically detected by the 26 teams that participated in the BioCreative IV CHEMDNER chemical mention recognition task. In addition, we release the CHEMDNER silver standard corpus of automatically extracted mentions from 17,000 randomly selected PubMed abstracts. A version of the CHEMDNER corpus in the BioC format has been generated as well. We propose a standard for required minimum information about entity annotations for the construction of domain specific corpora on chemical and drug entities. The CHEMDNER corpus and annotation guidelines are available at: http://www.biocreative.org/resources/biocreative-iv/chemdner-corpus/.

  17. The CHEMDNER corpus of chemicals and drugs and its annotation principles

    PubMed Central

    2015-01-01

    The automatic extraction of chemical information from text requires the recognition of chemical entity mentions as one of its key steps. When developing supervised named entity recognition (NER) systems, the availability of a large, manually annotated text corpus is desirable. Furthermore, large corpora permit the robust evaluation and comparison of different approaches that detect chemicals in documents. We present the CHEMDNER corpus, a collection of 10,000 PubMed abstracts that contain a total of 84,355 chemical entity mentions labeled manually by expert chemistry literature curators, following annotation guidelines specifically defined for this task. The abstracts of the CHEMDNER corpus were selected to be representative for all major chemical disciplines. Each of the chemical entity mentions was manually labeled according to its structure-associated chemical entity mention (SACEM) class: abbreviation, family, formula, identifier, multiple, systematic and trivial. The difficulty and consistency of tagging chemicals in text was measured using an agreement study between annotators, obtaining a percentage agreement of 91. For a subset of the CHEMDNER corpus (the test set of 3,000 abstracts) we provide not only the Gold Standard manual annotations, but also mentions automatically detected by the 26 teams that participated in the BioCreative IV CHEMDNER chemical mention recognition task. In addition, we release the CHEMDNER silver standard corpus of automatically extracted mentions from 17,000 randomly selected PubMed abstracts. A version of the CHEMDNER corpus in the BioC format has been generated as well. We propose a standard for required minimum information about entity annotations for the construction of domain specific corpora on chemical and drug entities. The CHEMDNER corpus and annotation guidelines are available at: http://www.biocreative.org/resources/biocreative-iv/chemdner-corpus/ PMID:25810773

  18. The Automatic Recognition of the Abnormal Sky-subtraction Spectra Based on Hadoop

    NASA Astrophysics Data System (ADS)

    An, An; Pan, Jingchang

    2017-10-01

    The skylines, superimposing on the target spectrum as a main noise, If the spectrum still contains a large number of high strength skylight residuals after sky-subtraction processing, it will not be conducive to the follow-up analysis of the target spectrum. At the same time, the LAMOST can observe a quantity of spectroscopic data in every night. We need an efficient platform to proceed the recognition of the larger numbers of abnormal sky-subtraction spectra quickly. Hadoop, as a distributed parallel data computing platform, can deal with large amounts of data effectively. In this paper, we conduct the continuum normalization firstly and then a simple and effective method will be presented to automatic recognize the abnormal sky-subtraction spectra based on Hadoop platform. Obtain through the experiment, the Hadoop platform can implement the recognition with more speed and efficiency, and the simple method can recognize the abnormal sky-subtraction spectra and find the abnormal skyline positions of different residual strength effectively, can be applied to the automatic detection of abnormal sky-subtraction of large number of spectra.

  19. Neural-network classifiers for automatic real-world aerial image recognition

    NASA Astrophysics Data System (ADS)

    Greenberg, Shlomo; Guterman, Hugo

    1996-08-01

    We describe the application of the multilayer perceptron (MLP) network and a version of the adaptive resonance theory version 2-A (ART 2-A) network to the problem of automatic aerial image recognition (AAIR). The classification of aerial images, independent of their positions and orientations, is required for automatic tracking and target recognition. Invariance is achieved by the use of different invariant feature spaces in combination with supervised and unsupervised neural networks. The performance of neural-network-based classifiers in conjunction with several types of invariant AAIR global features, such as the Fourier-transform space, Zernike moments, central moments, and polar transforms, are examined. The advantages of this approach are discussed. The performance of the MLP network is compared with that of a classical correlator. The MLP neural-network correlator outperformed the binary phase-only filter (BPOF) correlator. It was found that the ART 2-A distinguished itself with its speed and its low number of required training vectors. However, only the MLP classifier was able to deal with a combination of shift and rotation geometric distortions.

  20. Neural-network classifiers for automatic real-world aerial image recognition.

    PubMed

    Greenberg, S; Guterman, H

    1996-08-10

    We describe the application of the multilayer perceptron (MLP) network and a version of the adaptive resonance theory version 2-A (ART 2-A) network to the problem of automatic aerial image recognition (AAIR). The classification of aerial images, independent of their positions and orientations, is required for automatic tracking and target recognition. Invariance is achieved by the use of different invariant feature spaces in combination with supervised and unsupervised neural networks. The performance of neural-network-based classifiers in conjunction with several types of invariant AAIR global features, such as the Fourier-transform space, Zernike moments, central moments, and polar transforms, are examined. The advantages of this approach are discussed. The performance of the MLP network is compared with that of a classical correlator. The MLP neural-network correlator outperformed the binary phase-only filter (BPOF) correlator. It was found that the ART 2-A distinguished itself with its speed and its low number of required training vectors. However, only the MLP classifier was able to deal with a combination of shift and rotation geometric distortions.

  1. Automatic recognition of light source from color negative films using sorting classification techniques

    NASA Astrophysics Data System (ADS)

    Sanger, Demas S.; Haneishi, Hideaki; Miyake, Yoichi

    1995-08-01

    This paper proposed a simple and automatic method for recognizing the light sources from various color negative film brands by means of digital image processing. First, we stretched the image obtained from a negative based on the standardized scaling factors, then extracted the dominant color component among red, green, and blue components of the stretched image. The dominant color component became the discriminator for the recognition. The experimental results verified that any one of the three techniques could recognize the light source from negatives of any film brands and all brands greater than 93.2 and 96.6% correct recognitions, respectively. This method is significant for the automation of color quality control in color reproduction from color negative film in mass processing and printing machine.

  2. Fashioning the Face: Sensorimotor Simulation Contributes to Facial Expression Recognition.

    PubMed

    Wood, Adrienne; Rychlowska, Magdalena; Korb, Sebastian; Niedenthal, Paula

    2016-03-01

    When we observe a facial expression of emotion, we often mimic it. This automatic mimicry reflects underlying sensorimotor simulation that supports accurate emotion recognition. Why this is so is becoming more obvious: emotions are patterns of expressive, behavioral, physiological, and subjective feeling responses. Activation of one component can therefore automatically activate other components. When people simulate a perceived facial expression, they partially activate the corresponding emotional state in themselves, which provides a basis for inferring the underlying emotion of the expresser. We integrate recent evidence in favor of a role for sensorimotor simulation in emotion recognition. We then connect this account to a domain-general understanding of how sensory information from multiple modalities is integrated to generate perceptual predictions in the brain. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. Relating dynamic brain states to dynamic machine states: Human and machine solutions to the speech recognition problem

    PubMed Central

    Liu, Xunying; Zhang, Chao; Woodland, Phil; Fonteneau, Elisabeth

    2017-01-01

    There is widespread interest in the relationship between the neurobiological systems supporting human cognition and emerging computational systems capable of emulating these capacities. Human speech comprehension, poorly understood as a neurobiological process, is an important case in point. Automatic Speech Recognition (ASR) systems with near-human levels of performance are now available, which provide a computationally explicit solution for the recognition of words in continuous speech. This research aims to bridge the gap between speech recognition processes in humans and machines, using novel multivariate techniques to compare incremental ‘machine states’, generated as the ASR analysis progresses over time, to the incremental ‘brain states’, measured using combined electro- and magneto-encephalography (EMEG), generated as the same inputs are heard by human listeners. This direct comparison of dynamic human and machine internal states, as they respond to the same incrementally delivered sensory input, revealed a significant correspondence between neural response patterns in human superior temporal cortex and the structural properties of ASR-derived phonetic models. Spatially coherent patches in human temporal cortex responded selectively to individual phonetic features defined on the basis of machine-extracted regularities in the speech to lexicon mapping process. These results demonstrate the feasibility of relating human and ASR solutions to the problem of speech recognition, and suggest the potential for further studies relating complex neural computations in human speech comprehension to the rapidly evolving ASR systems that address the same problem domain. PMID:28945744

  4. Remote Safety Monitoring for Elderly Persons Based on Omni-Vision Analysis

    PubMed Central

    Xiang, Yun; Tang, Yi-ping; Ma, Bao-qing; Yan, Hang-chen; Jiang, Jun; Tian, Xu-yuan

    2015-01-01

    Remote monitoring service for elderly persons is important as the aged populations in most developed countries continue growing. To monitor the safety and health of the elderly population, we propose a novel omni-directional vision sensor based system, which can detect and track object motion, recognize human posture, and analyze human behavior automatically. In this work, we have made the following contributions: (1) we develop a remote safety monitoring system which can provide real-time and automatic health care for the elderly persons and (2) we design a novel motion history or energy images based algorithm for motion object tracking. Our system can accurately and efficiently collect, analyze, and transfer elderly activity information and provide health care in real-time. Experimental results show that our technique can improve the data analysis efficiency by 58.5% for object tracking. Moreover, for the human posture recognition application, the success rate can reach 98.6% on average. PMID:25978761

  5. Combining Facial Recognition, Automatic License Plate Readers and Closed Circuit Television to Create an Interstate Identification System for Wanted Subjects

    DTIC Science & Technology

    2015-12-01

    IPSFRP search request. The candidate list will contain the agency’s requested number (minimum of2) of candidates or a default number of 20 candidates if...INTERSTATE IDENTIFICATION SYSTEM FOR WANTED SUBJECTS 5. FUNDING NUMBERS 6. AUTHOR(S) Michael J. Thomas 7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES...Naval Postgraduate School Monterey, CA 93943-5000 8. PERFORMING ORGANIZATION REPORT NUMBER 9. SPONSORING /MONITORING AGENCY NAME(S) AND

  6. AAAIC '88 - Aerospace Applications of Artificial Intelligence; Proceedings of the Fourth Annual Conference, Dayton, OH, Oct. 25-27, 1988. Volumes 1 2

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Johnson, J.R.; Netrologic, Inc., San Diego, CA)

    1988-01-01

    Topics presented include integrating neural networks and expert systems, neural networks and signal processing, machine learning, cognition and avionics applications, artificial intelligence and man-machine interface issues, real time expert systems, artificial intelligence, and engineering applications. Also considered are advanced problem solving techniques, combinational optimization for scheduling and resource control, data fusion/sensor fusion, back propagation with momentum, shared weights and recurrency, automatic target recognition, cybernetics, optical neural networks.

  7. Reliability of automatic biometric iris recognition after phacoemulsification or drug-induced pupil dilation.

    PubMed

    Seyeddain, Orang; Kraker, Hannes; Redlberger, Andreas; Dexl, Alois K; Grabner, Günther; Emesz, Martin

    2014-01-01

    To investigate the reliability of a biometric iris recognition system for personal authentication after cataract surgery or iatrogenic pupil dilation. This was a prospective, nonrandomized, single-center, cohort study for evaluating the performance of an iris recognition system 2-24 hours after phacoemulsification and intraocular lens implantation (group 1) and before and after iatrogenic pupil dilation (group 2). Of the 173 eyes that could be enrolled before cataract surgery, 164 (94.8%) were easily recognized postoperatively, whereas in 9 (5.2%) this was not possible. However, these 9 eyes could be reenrolled and afterwards recognized successfully. In group 2, of a total of 184 eyes that were enrolled in miosis, a total of 22 (11.9%) could not be recognized in mydriasis and therefore needed reenrollment. No single case of false-positive acceptance occurred in either group. The results of this trial indicate that standard cataract surgery seems not to be a limiting factor for iris recognition in the large majority of cases. Some patients (5.2% in this study) might need "reenrollment" after cataract surgery. Iris recognition was primarily successful in eyes with medically dilated pupils in nearly 9 out of 10 eyes. No single case of false-positive acceptance occurred in either group in this trial. It seems therefore that iris recognition is a valid biometric method in the majority of cases after cataract surgery or after pupil dilation.

  8. Automated Meteor Detection by All-Sky Digital Camera Systems

    NASA Astrophysics Data System (ADS)

    Suk, Tomáš; Šimberová, Stanislava

    2017-12-01

    We have developed a set of methods to detect meteor light traces captured by all-sky CCD cameras. Operating at small automatic observatories (stations), these cameras create a network spread over a large territory. Image data coming from these stations are merged in one central node. Since a vast amount of data is collected by the stations in a single night, robotic storage and analysis are essential to processing. The proposed methodology is adapted to data from a network of automatic stations equipped with digital fish-eye cameras and includes data capturing, preparation, pre-processing, analysis, and finally recognition of objects in time sequences. In our experiments we utilized real observed data from two stations.

  9. NutriNet: A Deep Learning Food and Drink Image Recognition System for Dietary Assessment.

    PubMed

    Mezgec, Simon; Koroušić Seljak, Barbara

    2017-06-27

    Automatic food image recognition systems are alleviating the process of food-intake estimation and dietary assessment. However, due to the nature of food images, their recognition is a particularly challenging task, which is why traditional approaches in the field have achieved a low classification accuracy. Deep neural networks have outperformed such solutions, and we present a novel approach to the problem of food and drink image detection and recognition that uses a newly-defined deep convolutional neural network architecture, called NutriNet. This architecture was tuned on a recognition dataset containing 225,953 512 × 512 pixel images of 520 different food and drink items from a broad spectrum of food groups, on which we achieved a classification accuracy of 86 . 72 % , along with an accuracy of 94 . 47 % on a detection dataset containing 130 , 517 images. We also performed a real-world test on a dataset of self-acquired images, combined with images from Parkinson's disease patients, all taken using a smartphone camera, achieving a top-five accuracy of 55 % , which is an encouraging result for real-world images. Additionally, we tested NutriNet on the University of Milano-Bicocca 2016 (UNIMIB2016) food image dataset, on which we improved upon the provided baseline recognition result. An online training component was implemented to continually fine-tune the food and drink recognition model on new images. The model is being used in practice as part of a mobile app for the dietary assessment of Parkinson's disease patients.

  10. Automatic polymerase chain reaction product detection system for food safety monitoring using zinc finger protein fused to luciferase.

    PubMed

    Yoshida, Wataru; Kezuka, Aki; Murakami, Yoshiyuki; Lee, Jinhee; Abe, Koichi; Motoki, Hiroaki; Matsuo, Takafumi; Shimura, Nobuaki; Noda, Mamoru; Igimi, Shizunobu; Ikebukuro, Kazunori

    2013-11-01

    An automatic polymerase chain reaction (PCR) product detection system for food safety monitoring using zinc finger (ZF) protein fused to luciferase was developed. ZF protein fused to luciferase specifically binds to target double stranded DNA sequence and has luciferase enzymatic activity. Therefore, PCR products that comprise ZF protein recognition sequence can be detected by measuring the luciferase activity of the fusion protein. We previously reported that PCR products from Legionella pneumophila and Escherichia coli (E. coli) O157 genomic DNA were detected by Zif268, a natural ZF protein, fused to luciferase. In this study, Zif268-luciferase was applied to detect the presence of Salmonella and coliforms. Moreover, an artificial zinc finger protein (B2) fused to luciferase was constructed for a Norovirus detection system. In the luciferase activity detection assay, several bound/free separation process is required. Therefore, an analyzer that automatically performed the bound/free separation process was developed to detect PCR products using the ZF-luciferase fusion protein. By means of the automatic analyzer with ZF-luciferase fusion protein, target pathogenic genomes were specifically detected in the presence of other pathogenic genomes. Moreover, we succeeded in the detection of 10 copies of E. coli BL21 without extraction of genomic DNA by the automatic analyzer and E. coli was detected with a logarithmic dependency in the range of 1.0×10 to 1.0×10(6) copies. Copyright © 2013 Elsevier B.V. All rights reserved.

  11. Melanoma recognition framework based on expert definition of ABCD for dermoscopic images.

    PubMed

    Abbas, Qaisar; Emre Celebi, M; Garcia, Irene Fondón; Ahmad, Waqar

    2013-02-01

    Melanoma Recognition based on clinical ABCD rule is widely used for clinical diagnosis of pigmented skin lesions in dermoscopy images. However, the current computer-aided diagnostic (CAD) systems for classification between malignant and nevus lesions using the ABCD criteria are imperfect due to use of ineffective computerized techniques. In this study, a novel melanoma recognition system (MRS) is presented by focusing more on extracting features from the lesions using ABCD criteria. The complete MRS system consists of the following six major steps: transformation to the CIEL*a*b* color space, preprocessing to enhance the tumor region, black-frame and hair artifacts removal, tumor-area segmentation, quantification of feature using ABCD criteria and normalization, and finally feature selection and classification. The MRS system for melanoma-nevus lesions is tested on a total of 120 dermoscopic images. To test the performance of the MRS diagnostic classifier, the area under the receiver operating characteristics curve (AUC) is utilized. The proposed classifier achieved a sensitivity of 88.2%, specificity of 91.3%, and AUC of 0.880. The experimental results show that the proposed MRS system can accurately distinguish between malignant and benign lesions. The MRS technique is fully automatic and can easily integrate to an existing CAD system. To increase the classification accuracy of MRS, the CASH pattern recognition technique, visual inspection of dermatologist, contextual information from the patients, and the histopathological tests can be included to investigate the impact with this system. © 2012 John Wiley & Sons A/S.

  12. Parametric Representation of the Speaker's Lips for Multimodal Sign Language and Speech Recognition

    NASA Astrophysics Data System (ADS)

    Ryumin, D.; Karpov, A. A.

    2017-05-01

    In this article, we propose a new method for parametric representation of human's lips region. The functional diagram of the method is described and implementation details with the explanation of its key stages and features are given. The results of automatic detection of the regions of interest are illustrated. A speed of the method work using several computers with different performances is reported. This universal method allows applying parametrical representation of the speaker's lipsfor the tasks of biometrics, computer vision, machine learning, and automatic recognition of face, elements of sign languages, and audio-visual speech, including lip-reading.

  13. Semi-automatic recognition of marine debris on beaches

    PubMed Central

    Ge, Zhenpeng; Shi, Huahong; Mei, Xuefei; Dai, Zhijun; Li, Daoji

    2016-01-01

    An increasing amount of anthropogenic marine debris is pervading the earth’s environmental systems, resulting in an enormous threat to living organisms. Additionally, the large amount of marine debris around the world has been investigated mostly through tedious manual methods. Therefore, we propose the use of a new technique, light detection and ranging (LIDAR), for the semi-automatic recognition of marine debris on a beach because of its substantially more efficient role in comparison with other more laborious methods. Our results revealed that LIDAR should be used for the classification of marine debris into plastic, paper, cloth and metal. Additionally, we reconstructed a 3-dimensional model of different types of debris on a beach with a high validity of debris revivification using LIDAR-based individual separation. These findings demonstrate that the availability of this new technique enables detailed observations to be made of debris on a large beach that was previously not possible. It is strongly suggested that LIDAR could be implemented as an appropriate monitoring tool for marine debris by global researchers and governments. PMID:27156433

  14. A comparison of 1D and 2D LSTM architectures for the recognition of handwritten Arabic

    NASA Astrophysics Data System (ADS)

    Yousefi, Mohammad Reza; Soheili, Mohammad Reza; Breuel, Thomas M.; Stricker, Didier

    2015-01-01

    In this paper, we present an Arabic handwriting recognition method based on recurrent neural network. We use the Long Short Term Memory (LSTM) architecture, that have proven successful in different printed and handwritten OCR tasks. Applications of LSTM for handwriting recognition employ the two-dimensional architecture to deal with the variations in both vertical and horizontal axis. However, we show that using a simple pre-processing step that normalizes the position and baseline of letters, we can make use of 1D LSTM, which is faster in learning and convergence, and yet achieve superior performance. In a series of experiments on IFN/ENIT database for Arabic handwriting recognition, we demonstrate that our proposed pipeline can outperform 2D LSTM networks. Furthermore, we provide comparisons with 1D LSTM networks trained with manually crafted features to show that the automatically learned features in a globally trained 1D LSTM network with our normalization step can even outperform such systems.

  15. Image Classification Using Biomimetic Pattern Recognition with Convolutional Neural Networks Features

    PubMed Central

    Huo, Guanying

    2017-01-01

    As a typical deep-learning model, Convolutional Neural Networks (CNNs) can be exploited to automatically extract features from images using the hierarchical structure inspired by mammalian visual system. For image classification tasks, traditional CNN models employ the softmax function for classification. However, owing to the limited capacity of the softmax function, there are some shortcomings of traditional CNN models in image classification. To deal with this problem, a new method combining Biomimetic Pattern Recognition (BPR) with CNNs is proposed for image classification. BPR performs class recognition by a union of geometrical cover sets in a high-dimensional feature space and therefore can overcome some disadvantages of traditional pattern recognition. The proposed method is evaluated on three famous image classification benchmarks, that is, MNIST, AR, and CIFAR-10. The classification accuracies of the proposed method for the three datasets are 99.01%, 98.40%, and 87.11%, respectively, which are much higher in comparison with the other four methods in most cases. PMID:28316614

  16. Terrain type recognition using ERTS-1 MSS images

    NASA Technical Reports Server (NTRS)

    Gramenopoulos, N.

    1973-01-01

    For the automatic recognition of earth resources from ERTS-1 digital tapes, both multispectral and spatial pattern recognition techniques are important. Recognition of terrain types is based on spatial signatures that become evident by processing small portions of an image through selected algorithms. An investigation of spatial signatures that are applicable to ERTS-1 MSS images is described. Artifacts in the spatial signatures seem to be related to the multispectral scanner. A method for suppressing such artifacts is presented. Finally, results of terrain type recognition for one ERTS-1 image are presented.

  17. Development of Collaborative Research Initiatives to Advance the Aerospace Sciences-via the Communications, Electronics, Information Systems Focus Group

    NASA Technical Reports Server (NTRS)

    Knasel, T. Michael

    1996-01-01

    The primary goal of the Adaptive Vision Laboratory Research project was to develop advanced computer vision systems for automatic target recognition. The approach used in this effort combined several machine learning paradigms including evolutionary learning algorithms, neural networks, and adaptive clustering techniques to develop the E-MOR.PH system. This system is capable of generating pattern recognition systems to solve a wide variety of complex recognition tasks. A series of simulation experiments were conducted using E-MORPH to solve problems in OCR, military target recognition, industrial inspection, and medical image analysis. The bulk of the funds provided through this grant were used to purchase computer hardware and software to support these computationally intensive simulations. The payoff from this effort is the reduced need for human involvement in the design and implementation of recognition systems. We have shown that the techniques used in E-MORPH are generic and readily transition to other problem domains. Specifically, E-MORPH is multi-phase evolutionary leaming system that evolves cooperative sets of features detectors and combines their response using an adaptive classifier to form a complete pattern recognition system. The system can operate on binary or grayscale images. In our most recent experiments, we used multi-resolution images that are formed by applying a Gabor wavelet transform to a set of grayscale input images. To begin the leaming process, candidate chips are extracted from the multi-resolution images to form a training set and a test set. A population of detector sets is randomly initialized to start the evolutionary process. Using a combination of evolutionary programming and genetic algorithms, the feature detectors are enhanced to solve a recognition problem. The design of E-MORPH and recognition results for a complex problem in medical image analysis are described at the end of this report. The specific task involves the identification of vertebrae in x-ray images of human spinal columns. This problem is extremely challenging because the individual vertebra exhibit variation in shape, scale, orientation, and contrast. E-MORPH generated several accurate recognition systems to solve this task. This dual use of this ATR technology clearly demonstrates the flexibility and power of our approach.

  18. Automated feature detection and identification in digital point-ordered signals

    DOEpatents

    Oppenlander, Jane E.; Loomis, Kent C.; Brudnoy, David M.; Levy, Arthur J.

    1998-01-01

    A computer-based automated method to detect and identify features in digital point-ordered signals. The method is used for processing of non-destructive test signals, such as eddy current signals obtained from calibration standards. The signals are first automatically processed to remove noise and to determine a baseline. Next, features are detected in the signals using mathematical morphology filters. Finally, verification of the features is made using an expert system of pattern recognition methods and geometric criteria. The method has the advantage that standard features can be, located without prior knowledge of the number or sequence of the features. Further advantages are that standard features can be differentiated from irrelevant signal features such as noise, and detected features are automatically verified by parameters extracted from the signals. The method proceeds fully automatically without initial operator set-up and without subjective operator feature judgement.

  19. Development of on line automatic separation device for apple and sleeve

    NASA Astrophysics Data System (ADS)

    Xin, Dengke; Ning, Duo; Wang, Kangle; Han, Yuhang

    2018-04-01

    Based on STM32F407 single chip microcomputer as control core, automatic separation device of fruit sleeve is designed. This design consists of hardware and software. In hardware, it includes mechanical tooth separator and three degree of freedom manipulator, as well as industrial control computer, image data acquisition card, end effector and other structures. The software system is based on Visual C++ development environment, to achieve localization and recognition of fruit sleeve with the technology of image processing and machine vision, drive manipulator of foam net sets of capture, transfer, the designated position task. Test shows: The automatic separation device of the fruit sleeve has the advantages of quick response speed and high separation success rate, and can realize separation of the apple and plastic foam sleeve, and lays the foundation for further studying and realizing the application of the enterprise production line.

  20. Sound Classification in Hearing Aids Inspired by Auditory Scene Analysis

    NASA Astrophysics Data System (ADS)

    Büchler, Michael; Allegro, Silvia; Launer, Stefan; Dillier, Norbert

    2005-12-01

    A sound classification system for the automatic recognition of the acoustic environment in a hearing aid is discussed. The system distinguishes the four sound classes "clean speech," "speech in noise," "noise," and "music." A number of features that are inspired by auditory scene analysis are extracted from the sound signal. These features describe amplitude modulations, spectral profile, harmonicity, amplitude onsets, and rhythm. They are evaluated together with different pattern classifiers. Simple classifiers, such as rule-based and minimum-distance classifiers, are compared with more complex approaches, such as Bayes classifier, neural network, and hidden Markov model. Sounds from a large database are employed for both training and testing of the system. The achieved recognition rates are very high except for the class "speech in noise." Problems arise in the classification of compressed pop music, strongly reverberated speech, and tonal or fluctuating noises.

  1. Toward End-to-End Face Recognition Through Alignment Learning

    NASA Astrophysics Data System (ADS)

    Zhong, Yuanyi; Chen, Jiansheng; Huang, Bo

    2017-08-01

    Plenty of effective methods have been proposed for face recognition during the past decade. Although these methods differ essentially in many aspects, a common practice of them is to specifically align the facial area based on the prior knowledge of human face structure before feature extraction. In most systems, the face alignment module is implemented independently. This has actually caused difficulties in the designing and training of end-to-end face recognition models. In this paper we study the possibility of alignment learning in end-to-end face recognition, in which neither prior knowledge on facial landmarks nor artificially defined geometric transformations are required. Specifically, spatial transformer layers are inserted in front of the feature extraction layers in a Convolutional Neural Network (CNN) for face recognition. Only human identity clues are used for driving the neural network to automatically learn the most suitable geometric transformation and the most appropriate facial area for the recognition task. To ensure reproducibility, our model is trained purely on the publicly available CASIA-WebFace dataset, and is tested on the Labeled Face in the Wild (LFW) dataset. We have achieved a verification accuracy of 99.08\\% which is comparable to state-of-the-art single model based methods.

  2. Tone classification of syllable-segmented Thai speech based on multilayer perception

    NASA Astrophysics Data System (ADS)

    Satravaha, Nuttavudh; Klinkhachorn, Powsiri; Lass, Norman

    2002-05-01

    Thai is a monosyllabic tonal language that uses tone to convey lexical information about the meaning of a syllable. Thus to completely recognize a spoken Thai syllable, a speech recognition system not only has to recognize a base syllable but also must correctly identify a tone. Hence, tone classification of Thai speech is an essential part of a Thai speech recognition system. Thai has five distinctive tones (``mid,'' ``low,'' ``falling,'' ``high,'' and ``rising'') and each tone is represented by a single fundamental frequency (F0) pattern. However, several factors, including tonal coarticulation, stress, intonation, and speaker variability, affect the F0 pattern of a syllable in continuous Thai speech. In this study, an efficient method for tone classification of syllable-segmented Thai speech, which incorporates the effects of tonal coarticulation, stress, and intonation, as well as a method to perform automatic syllable segmentation, were developed. Acoustic parameters were used as the main discriminating parameters. The F0 contour of a segmented syllable was normalized by using a z-score transformation before being presented to a tone classifier. The proposed system was evaluated on 920 test utterances spoken by 8 speakers. A recognition rate of 91.36% was achieved by the proposed system.

  3. An FPGA-Based WASN for Remote Real-Time Monitoring of Endangered Species: A Case Study on the Birdsong Recognition of Botaurus stellaris

    PubMed Central

    Hervás, Marcos; Alsina-Pagès, Rosa Ma; Alías, Francesc; Salvador, Martí

    2017-01-01

    Fast environmental variations due to climate change can cause mass decline or even extinctions of species, having a dramatic impact on the future of biodiversity. During the last decade, different approaches have been proposed to track and monitor endangered species, generally based on costly semi-automatic systems that require human supervision adding limitations in coverage and time. However, the recent emergence of Wireless Acoustic Sensor Networks (WASN) has allowed non-intrusive remote monitoring of endangered species in real time through the automatic identification of the sound they emit. In this work, an FPGA-based WASN centralized architecture is proposed and validated on a simulated operation environment. The feasibility of the architecture is evaluated in a case study designed to detect the threatened Botaurus stellaris among other 19 cohabiting birds species in The Parc Natural dels Aiguamolls de l’Empordà, showing an averaged recognition accuracy of 91% over 2h 55’ of representative data. The FPGA-based feature extraction implementation allows the system to process data from 30 acoustic sensors in real time with an affordable cost. Finally, several open questions derived from this research are discussed to be considered for future works. PMID:28594373

  4. Current Technologies and its Trends of Machine Vision in the Field of Security and Disaster Prevention

    NASA Astrophysics Data System (ADS)

    Hashimoto, Manabu; Fujino, Yozo

    Image sensing technologies are expected as useful and effective way to suppress damages by criminals and disasters in highly safe and relieved society. In this paper, we describe current important subjects, required functions, technical trends, and a couple of real examples of developed system. As for the video surveillance, recognition of human trajectory and human behavior using image processing techniques are introduced with real examples about the violence detection for elevators. In the field of facility monitoring technologies as civil engineering, useful machine vision applications such as automatic detection of concrete cracks on walls of a building or recognition of crowded people on bridge for effective guidance in emergency are shown.

  5. Puzzle test: A tool for non-analytical clinical reasoning assessment.

    PubMed

    Monajemi, Alireza; Yaghmaei, Minoo

    2016-01-01

    Most contemporary clinical reasoning tests typically assess non-automatic thinking. Therefore, a test is needed to measure automatic reasoning or pattern recognition, which has been largely neglected in clinical reasoning tests. The Puzzle Test (PT) is dedicated to assess automatic clinical reasoning in routine situations. This test has been introduced first in 2009 by Monajemi et al in the Olympiad for Medical Sciences Students.PT is an item format that has gained acceptance in medical education, but no detailed guidelines exist for this test's format, construction and scoring. In this article, a format is described and the steps to prepare and administer valid and reliable PTs are presented. PT examines a specific clinical reasoning task: Pattern recognition. PT does not replace other clinical reasoning assessment tools. However, it complements them in strategies for assessing comprehensive clinical reasoning.

  6. Multi-stream face recognition on dedicated mobile devices for crime-fighting

    NASA Astrophysics Data System (ADS)

    Jassim, Sabah A.; Sellahewa, Harin

    2006-09-01

    Automatic face recognition is a useful tool in the fight against crime and terrorism. Technological advance in mobile communication systems and multi-application mobile devices enable the creation of hybrid platforms for active and passive surveillance. A dedicated mobile device that incorporates audio-visual sensors would not only complement existing networks of fixed surveillance devices (e.g. CCTV) but could also provide wide geographical coverage in almost any situation and anywhere. Such a device can hold a small portion of a law-enforcing agency biometric database that consist of audio and/or visual data of a number of suspects/wanted or missing persons who are expected to be in a local geographical area. This will assist law-enforcing officers on the ground in identifying persons whose biometric templates are downloaded onto their devices. Biometric data on the device can be regularly updated which will reduce the number of faces an officer has to remember. Such a dedicated device would act as an active/passive mobile surveillance unit that incorporate automatic identification. This paper is concerned with the feasibility of using wavelet-based face recognition schemes on such devices. The proposed schemes extend our recently developed face verification scheme for implementation on a currently available PDA. In particular we will investigate the use of a combination of wavelet frequency channels for multi-stream face recognition. We shall present experimental results on the performance of our proposed schemes for a number of publicly available face databases including a new AV database of videos recorded on a PDA.

  7. Recognition of chemical entities: combining dictionary-based and grammar-based approaches.

    PubMed

    Akhondi, Saber A; Hettne, Kristina M; van der Horst, Eelke; van Mulligen, Erik M; Kors, Jan A

    2015-01-01

    The past decade has seen an upsurge in the number of publications in chemistry. The ever-swelling volume of available documents makes it increasingly hard to extract relevant new information from such unstructured texts. The BioCreative CHEMDNER challenge invites the development of systems for the automatic recognition of chemicals in text (CEM task) and for ranking the recognized compounds at the document level (CDI task). We investigated an ensemble approach where dictionary-based named entity recognition is used along with grammar-based recognizers to extract compounds from text. We assessed the performance of ten different commercial and publicly available lexical resources using an open source indexing system (Peregrine), in combination with three different chemical compound recognizers and a set of regular expressions to recognize chemical database identifiers. The effect of different stop-word lists, case-sensitivity matching, and use of chunking information was also investigated. We focused on lexical resources that provide chemical structure information. To rank the different compounds found in a text, we used a term confidence score based on the normalized ratio of the term frequencies in chemical and non-chemical journals. The use of stop-word lists greatly improved the performance of the dictionary-based recognition, but there was no additional benefit from using chunking information. A combination of ChEBI and HMDB as lexical resources, the LeadMine tool for grammar-based recognition, and the regular expressions, outperformed any of the individual systems. On the test set, the F-scores were 77.8% (recall 71.2%, precision 85.8%) for the CEM task and 77.6% (recall 71.7%, precision 84.6%) for the CDI task. Missed terms were mainly due to tokenization issues, poor recognition of formulas, and term conjunctions. We developed an ensemble system that combines dictionary-based and grammar-based approaches for chemical named entity recognition, outperforming any of the individual systems that we considered. The system is able to provide structure information for most of the compounds that are found. Improved tokenization and better recognition of specific entity types is likely to further improve system performance.

  8. Recognition of chemical entities: combining dictionary-based and grammar-based approaches

    PubMed Central

    2015-01-01

    Background The past decade has seen an upsurge in the number of publications in chemistry. The ever-swelling volume of available documents makes it increasingly hard to extract relevant new information from such unstructured texts. The BioCreative CHEMDNER challenge invites the development of systems for the automatic recognition of chemicals in text (CEM task) and for ranking the recognized compounds at the document level (CDI task). We investigated an ensemble approach where dictionary-based named entity recognition is used along with grammar-based recognizers to extract compounds from text. We assessed the performance of ten different commercial and publicly available lexical resources using an open source indexing system (Peregrine), in combination with three different chemical compound recognizers and a set of regular expressions to recognize chemical database identifiers. The effect of different stop-word lists, case-sensitivity matching, and use of chunking information was also investigated. We focused on lexical resources that provide chemical structure information. To rank the different compounds found in a text, we used a term confidence score based on the normalized ratio of the term frequencies in chemical and non-chemical journals. Results The use of stop-word lists greatly improved the performance of the dictionary-based recognition, but there was no additional benefit from using chunking information. A combination of ChEBI and HMDB as lexical resources, the LeadMine tool for grammar-based recognition, and the regular expressions, outperformed any of the individual systems. On the test set, the F-scores were 77.8% (recall 71.2%, precision 85.8%) for the CEM task and 77.6% (recall 71.7%, precision 84.6%) for the CDI task. Missed terms were mainly due to tokenization issues, poor recognition of formulas, and term conjunctions. Conclusions We developed an ensemble system that combines dictionary-based and grammar-based approaches for chemical named entity recognition, outperforming any of the individual systems that we considered. The system is able to provide structure information for most of the compounds that are found. Improved tokenization and better recognition of specific entity types is likely to further improve system performance. PMID:25810767

  9. Salient Feature Identification and Analysis using Kernel-Based Classification Techniques for Synthetic Aperture Radar Automatic Target Recognition

    DTIC Science & Technology

    2014-03-27

    and machine learning for a range of research including such topics as medical imaging [10] and handwriting recognition [11]. The type of feature...1989. [11] C. Bahlmann, B. Haasdonk, and H. Burkhardt, “Online handwriting recognition with support vector machines-a kernel approach,” in Eighth...International Workshop on Frontiers in Handwriting Recognition, pp. 49–54, IEEE, 2002. [12] C. Cortes and V. Vapnik, “Support-vector networks,” Machine

  10. A beat-to-beat calculator for the diastolic pressure time index and the tension time index.

    PubMed

    Nose, Y; Tajimi, T; Watanabe, Y; Yokota, M; Akazawa, K; Nakamura, M

    1987-01-01

    We have developed a beat-to-beat calculator which can calculate in real-time the ratio of the diastolic pressure time index (DPTI), and the tension time index (TTI) as an index of the myocardial oxygen supply/demand balance. Physicians set up presumed value for the left ventricular endodiastolic pressure, a search area for the dicrotic notch, a threshold for the onset of the up-slope and the corresponding value of the calibration signal on the digital switches of the calculator. Next, the arterial pressure analog signal is input into the calculator. The calculator searches automatically for both the onset of the up-slope and the dicrotic notch. The arterial pressure curve is displayed beat-to-beat with the recognized onset and the dicrotic notch on the CRT to be confirmed by physicians. When physicians do not agree with the automatic recognition they can fit the automatic recognition to the observation. If the recognition of the onset is inadequate, the threshold can be re-adjusted to trigger the onset. If recognition of the dicrotic notch is inadequate, the physician can adjust the search-area. Therefore, physicians who operate the calculator can rely on the calculated DPTI/TTI. This calculator can continuously monitor the myocardial oxygen supply/demand balance in patients with acute myocardial infarction or just after open-heart surgery.

  11. Electrophysiological Evidence of Automatic Early Semantic Processing

    ERIC Educational Resources Information Center

    Hinojosa, Jose A.; Martin-Loeches, Manuel; Munoz, Francisco; Casado, Pilar; Pozo, Miguel A.

    2004-01-01

    This study investigates the automatic-controlled nature of early semantic processing by means of the Recognition Potential (RP), an event-related potential response that reflects lexical selection processes. For this purpose tasks differing in their processing requirements were used. Half of the participants performed a physical task involving a…

  12. Automatic neutron dosimetry system based on fluorescent nuclear track detector technology.

    PubMed

    Akselrod, M S; Fomenko, V V; Bartz, J A; Haslett, T L

    2014-10-01

    For the first time, the authors are describing an automatic fluorescent nuclear track detector (FNTD) reader for neutron dosimetry. FNTD is a luminescent integrating type of detector made of aluminium oxide crystals that does not require electronics or batteries during irradiation. Non-destructive optical readout of the detector is performed using a confocal laser scanning fluorescence imaging with near-diffraction limited resolution. The fully automatic table-top reader allows one to load up to 216 detectors on a tray, read their engraved IDs using a CCD camera and optical character recognition, scan and process simultaneously two types of images in fluorescent and reflected laser light contrast to eliminate false-positive tracks related to surface and volume crystal imperfections. The FNTD dosimetry system allows one to measure neutron doses from 0.1 mSv to 20 Sv and covers neutron energies from thermal to 20 MeV. The reader is characterised by a robust, compact optical design, fast data processing electronics and user-friendly software. © The Author 2013. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  13. Photonics: From target recognition to lesion detection

    NASA Technical Reports Server (NTRS)

    Henry, E. Michael

    1994-01-01

    Since 1989, Martin Marietta has invested in the development of an innovative concept for robust real-time pattern recognition for any two-dimensioanal sensor. This concept has been tested in simulation, and in laboratory and field hardware, for a number of DOD and commercial uses from automatic target recognition to manufacturing inspection. We have now joined Rose Health Care Systems in developing its use for medical diagnostics. The concept is based on determining regions of interest by using optical Fourier bandpassing as a scene segmentation technique, enhancing those regions using wavelet filters, passing the enhanced regions to a neural network for analysis and initial pattern identification, and following this initial identification with confirmation by optical correlation. The optical scene segmentation and pattern confirmation are performed by the same optical module. The neural network is a recursive error minimization network with a small number of connections and nodes that rapidly converges to a global minimum.

  14. Material recognition based on thermal cues: Mechanisms and applications.

    PubMed

    Ho, Hsin-Ni

    2018-01-01

    Some materials feel colder to the touch than others, and we can use this difference in perceived coldness for material recognition. This review focuses on the mechanisms underlying material recognition based on thermal cues. It provides an overview of the physical, perceptual, and cognitive processes involved in material recognition. It also describes engineering domains in which material recognition based on thermal cues have been applied. This includes haptic interfaces that seek to reproduce the sensations associated with contact in virtual environments and tactile sensors aim for automatic material recognition. The review concludes by considering the contributions of this line of research in both science and engineering.

  15. Material recognition based on thermal cues: Mechanisms and applications

    PubMed Central

    Ho, Hsin-Ni

    2018-01-01

    ABSTRACT Some materials feel colder to the touch than others, and we can use this difference in perceived coldness for material recognition. This review focuses on the mechanisms underlying material recognition based on thermal cues. It provides an overview of the physical, perceptual, and cognitive processes involved in material recognition. It also describes engineering domains in which material recognition based on thermal cues have been applied. This includes haptic interfaces that seek to reproduce the sensations associated with contact in virtual environments and tactile sensors aim for automatic material recognition. The review concludes by considering the contributions of this line of research in both science and engineering. PMID:29687043

  16. A model of traffic signs recognition with convolutional neural network

    NASA Astrophysics Data System (ADS)

    Hu, Haihe; Li, Yujian; Zhang, Ting; Huo, Yi; Kuang, Wenqing

    2016-10-01

    In real traffic scenes, the quality of captured images are generally low due to some factors such as lighting conditions, and occlusion on. All of these factors are challengeable for automated recognition algorithms of traffic signs. Deep learning has provided a new way to solve this kind of problems recently. The deep network can automatically learn features from a large number of data samples and obtain an excellent recognition performance. We therefore approach this task of recognition of traffic signs as a general vision problem, with few assumptions related to road signs. We propose a model of Convolutional Neural Network (CNN) and apply the model to the task of traffic signs recognition. The proposed model adopts deep CNN as the supervised learning model, directly takes the collected traffic signs image as the input, alternates the convolutional layer and subsampling layer, and automatically extracts the features for the recognition of the traffic signs images. The proposed model includes an input layer, three convolutional layers, three subsampling layers, a fully-connected layer, and an output layer. To validate the proposed model, the experiments are implemented using the public dataset of China competition of fuzzy image processing. Experimental results show that the proposed model produces a recognition accuracy of 99.01 % on the training dataset, and yield a record of 92% on the preliminary contest within the fourth best.

  17. Automaticity of Basic-Level Categorization Accounts for Labeling Effects in Visual Recognition Memory

    ERIC Educational Resources Information Center

    Richler, Jennifer J.; Gauthier, Isabel; Palmeri, Thomas J.

    2011-01-01

    Are there consequences of calling objects by their names? Lupyan (2008) suggested that overtly labeling objects impairs subsequent recognition memory because labeling shifts stored memory representations of objects toward the category prototype (representational shift hypothesis). In Experiment 1, we show that processing objects at the basic…

  18. Variogram-based feature extraction for neural network recognition of logos

    NASA Astrophysics Data System (ADS)

    Pham, Tuan D.

    2003-03-01

    This paper presents a new approach for extracting spatial features of images based on the theory of regionalized variables. These features can be effectively used for automatic recognition of logo images using neural networks. Experimental results on a public-domain logo database show the effectiveness of the proposed approach.

  19. Separating Speed from Accuracy in Beginning Reading Development

    ERIC Educational Resources Information Center

    Juul, Holger; Poulsen, Mads; Elbro, Carsten

    2014-01-01

    Phoneme awareness, letter knowledge, and rapid automatized naming (RAN) are well-known kindergarten predictors of later word recognition skills, but it is not clear whether they predict developments in accuracy or speed, or both. The present longitudinal study of 172 Danish beginning readers found that speed of word recognition mainly developed…

  20. TU-FG-209-12: Treatment Site and View Recognition in X-Ray Images with Hierarchical Multiclass Recognition Models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chang, X; Mazur, T; Yang, D

    Purpose: To investigate an approach of automatically recognizing anatomical sites and imaging views (the orientation of the image acquisition) in 2D X-ray images. Methods: A hierarchical (binary tree) multiclass recognition model was developed to recognize the treatment sites and views in x-ray images. From top to bottom of the tree, the treatment sites are grouped hierarchically from more general to more specific. Each node in the hierarchical model was designed to assign images to one of two categories of anatomical sites. The binary image classification function of each node in the hierarchical model is implemented by using a PCA transformationmore » and a support vector machine (SVM) model. The optimal PCA transformation matrices and SVM models are obtained by learning from a set of sample images. Alternatives of the hierarchical model were developed to support three scenarios of site recognition that may happen in radiotherapy clinics, including two or one X-ray images with or without view information. The performance of the approach was tested with images of 120 patients from six treatment sites – brain, head-neck, breast, lung, abdomen and pelvis – with 20 patients per site and two views (AP and RT) per patient. Results: Given two images in known orthogonal views (AP and RT), the hierarchical model achieved a 99% average F1 score to recognize the six sites. Site specific view recognition models have 100 percent accuracy. The computation time to process a new patient case (preprocessing, site and view recognition) is 0.02 seconds. Conclusion: The proposed hierarchical model of site and view recognition is effective and computationally efficient. It could be useful to automatically and independently confirm the treatment sites and views in daily setup x-ray 2D images. It could also be applied to guide subsequent image processing tasks, e.g. site and view dependent contrast enhancement and image registration. The senior author received research grants from ViewRay Inc. and Varian Medical System.« less

  1. Model-based vision using geometric hashing

    NASA Astrophysics Data System (ADS)

    Akerman, Alexander, III; Patton, Ronald

    1991-04-01

    The Geometric Hashing technique developed by the NYU Courant Institute has been applied to various automatic target recognition applications. In particular, I-MATH has extended the hashing algorithm to perform automatic target recognition ofsynthetic aperture radar (SAR) imagery. For this application, the hashing is performed upon the geometric locations of dominant scatterers. In addition to being a robust model-based matching algorithm -- invariant under translation, scale, and 3D rotations of the target -- hashing is of particular utility because it can still perform effective matching when the target is partially obscured. Moreover, hashing is very amenable to a SIMD parallel processing architecture, and thus potentially realtime implementable.

  2. User Evaluation of a Communication System That Automatically Generates Captions to Improve Telephone Communication

    PubMed Central

    Zekveld, Adriana A.; Kramer, Sophia E.; Kessens, Judith M.; Vlaming, Marcel S. M. G.; Houtgast, Tammo

    2009-01-01

    This study examined the subjective benefit obtained from automatically generated captions during telephone-speech comprehension in the presence of babble noise. Short stories were presented by telephone either with or without captions that were generated offline by an automatic speech recognition (ASR) system. To simulate online ASR, the word accuracy (WA) level of the captions was 60% or 70% and the text was presented delayed to the speech. After each test, the hearing impaired participants (n = 20) completed the NASA-Task Load Index and several rating scales evaluating the support from the captions. Participants indicated that using the erroneous text in speech comprehension was difficult and the reported task load did not differ between the audio + text and audio-only conditions. In a follow-up experiment (n = 10), the perceived benefit of presenting captions increased with an increase of WA levels to 80% and 90%, and elimination of the text delay. However, in general, the task load did not decrease when captions were presented. These results suggest that the extra effort required to process the text could have been compensated for by less effort required to comprehend the speech. Future research should aim at reducing the complexity of the task to increase the willingness of hearing impaired persons to use an assistive communication system automatically providing captions. The current results underline the need for obtaining both objective and subjective measures of benefit when evaluating assistive communication systems. PMID:19126551

  3. Data handling and analysis for the 1971 corn blight watch experiment

    NASA Technical Reports Server (NTRS)

    Anuta, P. E.; Phillips, T. L.

    1973-01-01

    The overall corn blight watch experiment data flow is described and the organization of the LARS/Purdue data center is discussed. Data analysis techniques are discussed in general and the use of statistical multispectral pattern recognition methods for automatic computer analysis of aircraft scanner data is described. Some of the results obtained are discussed and the implications of the experiment on future data communication requirements for earth resource survey systems is discussed.

  4. DCL System Using Deep Learning Approaches for Land-Based or Ship-Based Real-Time Recognition and Localization of Marine Mammals

    DTIC Science & Technology

    2013-09-30

    method has been successfully implemented to automatically detect and recognize pulse trains from minke whales ( songs ) and sperm whales (Physeter...workshops, conferences and data challenges 2. Enhancements of the ASR algorithm for frequency-modulated sounds: Right Whale Study 3...Enhancements of the ASR algorithm for pulse trains: Minke Whale Study 4. Mining Big Data Sound Archives using High Performance Computing software and hardware

  5. Review of Current Aided/Automatic Target Acquisition Technology for Military Target Acquisition Tasks

    DTIC Science & Technology

    2011-07-01

    radar [e.g., synthetic aperture radar (SAR)]. EO/IR includes multi- and hyperspectral imaging. Signal processing of data from nonimaging sensors, such...enhanced recognition ability. Other nonimage -based techniques, such as category theory,45 hierarchical systems,46 and gradient index flow,47 are possible...the battle- field. There is a plethora of imaging and nonimaging sensors on the battlefield that are being networked together for trans- mission of

  6. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bolme, David S; Tokola, Ryan A; Boehnen, Chris Bensing

    Automatic recognition systems are a valuable tool for identifying unknown deceased individuals. Immediately af- ter death fingerprint and face biometric samples are easy to collect using standard sensors and cameras and can be easily matched to anti-mortem biometric samples. Even though post-mortem fingerprints and faces have been used for decades, there are no studies that track these biomet- rics through the later stages of decomposition to determine the length of time the biometrics remain viable. This paper discusses a multimodal dataset of fingerprints, faces, and irises from 14 human cadavers that decomposed outdoors under natural conditions. Results include predictive modelsmore » relating time and temperature, measured as Accumulated Degree Days (ADD), and season (winter, spring, summer) to the predicted probably of automatic verification using a commercial algorithm.« less

  7. Development of A Two-Stage Procedure for the Automatic Recognition of Dysfluencies in the Speech of Children Who Stutter: I. Psychometric Procedures Appropriate for Selection of Training Material for Lexical Dysfluency Classifiers

    PubMed Central

    Howell, Peter; Sackin, Stevie; Glenn, Kazan

    2007-01-01

    This program of work is intended to develop automatic recognition procedures to locate and assess stuttered dysfluencies. This and the following article together, develop and test recognizers for repetitions and prolongations. The automatic recognizers classify the speech in two stages: In the first, the speech is segmented and in the second the segments are categorized. The units that are segmented are words. Here assessments by human judges on the speech of 12 children who stutter are described using a corresponding procedure. The accuracy of word boundary placement across judges, categorization of the words as fluent, repetition or prolongation, and duration of the different fluency categories are reported. These measures allow reliable instances of repetitions and prolongations to be selected for training and assessing the recognizers in the subsequent paper. PMID:9328878

  8. Fine grained recognition of masonry walls for built heritage assessment

    NASA Astrophysics Data System (ADS)

    Oses, N.; Dornaika, F.; Moujahid, A.

    2015-01-01

    This paper presents the ground work carried out to achieve automatic fine grained recognition of stone masonry. This is a necessary first step in the development of the analysis tool. The built heritage that will be assessed consists of stone masonry constructions and many of the features analysed can be characterized according to the geometry and arrangement of the stones. Much of the assessment is carried out through visual inspection. Thus, we apply image processing on digital images of the elements under inspection. The main contribution of the paper is the performance evaluation of the automatic categorization of masonry walls from a set of extracted straight line segments. The element chosen to perform this evaluation is the stone arrangement of masonry walls. The validity of the proposed framework is assessed on real images of masonry walls using machine learning paradigms. These include classifiers as well as automatic feature selection.

  9. VASIR: An Open-Source Research Platform for Advanced Iris Recognition Technologies.

    PubMed

    Lee, Yooyoung; Micheals, Ross J; Filliben, James J; Phillips, P Jonathon

    2013-01-01

    The performance of iris recognition systems is frequently affected by input image quality, which in turn is vulnerable to less-than-optimal conditions due to illuminations, environments, and subject characteristics (e.g., distance, movement, face/body visibility, blinking, etc.). VASIR (Video-based Automatic System for Iris Recognition) is a state-of-the-art NIST-developed iris recognition software platform designed to systematically address these vulnerabilities. We developed VASIR as a research tool that will not only provide a reference (to assess the relative performance of alternative algorithms) for the biometrics community, but will also advance (via this new emerging iris recognition paradigm) NIST's measurement mission. VASIR is designed to accommodate both ideal (e.g., classical still images) and less-than-ideal images (e.g., face-visible videos). VASIR has three primary modules: 1) Image Acquisition 2) Video Processing, and 3) Iris Recognition. Each module consists of several sub-components that have been optimized by use of rigorous orthogonal experiment design and analysis techniques. We evaluated VASIR performance using the MBGC (Multiple Biometric Grand Challenge) NIR (Near-Infrared) face-visible video dataset and the ICE (Iris Challenge Evaluation) 2005 still-based dataset. The results showed that even though VASIR was primarily developed and optimized for the less-constrained video case, it still achieved high verification rates for the traditional still-image case. For this reason, VASIR may be used as an effective baseline for the biometrics community to evaluate their algorithm performance, and thus serves as a valuable research platform.

  10. VASIR: An Open-Source Research Platform for Advanced Iris Recognition Technologies

    PubMed Central

    Lee, Yooyoung; Micheals, Ross J; Filliben, James J; Phillips, P Jonathon

    2013-01-01

    The performance of iris recognition systems is frequently affected by input image quality, which in turn is vulnerable to less-than-optimal conditions due to illuminations, environments, and subject characteristics (e.g., distance, movement, face/body visibility, blinking, etc.). VASIR (Video-based Automatic System for Iris Recognition) is a state-of-the-art NIST-developed iris recognition software platform designed to systematically address these vulnerabilities. We developed VASIR as a research tool that will not only provide a reference (to assess the relative performance of alternative algorithms) for the biometrics community, but will also advance (via this new emerging iris recognition paradigm) NIST’s measurement mission. VASIR is designed to accommodate both ideal (e.g., classical still images) and less-than-ideal images (e.g., face-visible videos). VASIR has three primary modules: 1) Image Acquisition 2) Video Processing, and 3) Iris Recognition. Each module consists of several sub-components that have been optimized by use of rigorous orthogonal experiment design and analysis techniques. We evaluated VASIR performance using the MBGC (Multiple Biometric Grand Challenge) NIR (Near-Infrared) face-visible video dataset and the ICE (Iris Challenge Evaluation) 2005 still-based dataset. The results showed that even though VASIR was primarily developed and optimized for the less-constrained video case, it still achieved high verification rates for the traditional still-image case. For this reason, VASIR may be used as an effective baseline for the biometrics community to evaluate their algorithm performance, and thus serves as a valuable research platform. PMID:26401431

  11. Character-level neural network for biomedical named entity recognition.

    PubMed

    Gridach, Mourad

    2017-06-01

    Biomedical named entity recognition (BNER), which extracts important named entities such as genes and proteins, is a challenging task in automated systems that mine knowledge in biomedical texts. The previous state-of-the-art systems required large amounts of task-specific knowledge in the form of feature engineering, lexicons and data pre-processing to achieve high performance. In this paper, we introduce a novel neural network architecture that benefits from both word- and character-level representations automatically, by using a combination of bidirectional long short-term memory (LSTM) and conditional random field (CRF) eliminating the need for most feature engineering tasks. We evaluate our system on two datasets: JNLPBA corpus and the BioCreAtIvE II Gene Mention (GM) corpus. We obtained state-of-the-art performance by outperforming the previous systems. To the best of our knowledge, we are the first to investigate the combination of deep neural networks, CRF, word embeddings and character-level representation in recognizing biomedical named entities. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. NutriNet: A Deep Learning Food and Drink Image Recognition System for Dietary Assessment

    PubMed Central

    Koroušić Seljak, Barbara

    2017-01-01

    Automatic food image recognition systems are alleviating the process of food-intake estimation and dietary assessment. However, due to the nature of food images, their recognition is a particularly challenging task, which is why traditional approaches in the field have achieved a low classification accuracy. Deep neural networks have outperformed such solutions, and we present a novel approach to the problem of food and drink image detection and recognition that uses a newly-defined deep convolutional neural network architecture, called NutriNet. This architecture was tuned on a recognition dataset containing 225,953 512 × 512 pixel images of 520 different food and drink items from a broad spectrum of food groups, on which we achieved a classification accuracy of 86.72%, along with an accuracy of 94.47% on a detection dataset containing 130,517 images. We also performed a real-world test on a dataset of self-acquired images, combined with images from Parkinson’s disease patients, all taken using a smartphone camera, achieving a top-five accuracy of 55%, which is an encouraging result for real-world images. Additionally, we tested NutriNet on the University of Milano-Bicocca 2016 (UNIMIB2016) food image dataset, on which we improved upon the provided baseline recognition result. An online training component was implemented to continually fine-tune the food and drink recognition model on new images. The model is being used in practice as part of a mobile app for the dietary assessment of Parkinson’s disease patients. PMID:28653995

  13. Evaluation of the automatic optical authentication technologies for control systems of objects

    NASA Astrophysics Data System (ADS)

    Averkin, Vladimir V.; Volegov, Peter L.; Podgornov, Vladimir A.

    2000-03-01

    The report considers the evaluation of the automatic optical authentication technologies for the automated integrated system of physical protection, control and accounting of nuclear materials at RFNC-VNIITF, and for providing of the nuclear materials nonproliferation regime. The report presents the nuclear object authentication objectives and strategies, the methodology of the automatic optical authentication and results of the development of pattern recognition techniques carried out under the ISTC project #772 with the purpose of identification of unique features of surface structure of a controlled object and effects of its random treatment. The current decision of following functional control tasks is described in the report: confirmation of the item authenticity (proof of the absence of its substitution by an item of similar shape), control over unforeseen change of item state, control over unauthorized access to the item. The most important distinctive feature of all techniques is not comprehensive description of some properties of controlled item, but unique identification of item using minimum necessary set of parameters, properly comprising identification attribute of the item. The main emphasis in the technical approach is made on the development of rather simple technological methods for the first time intended for use in the systems of physical protection, control and accounting of nuclear materials. The developed authentication devices and system are described.

  14. Improved Techniques for Automatic Chord Recognition from Music Audio Signals

    ERIC Educational Resources Information Center

    Cho, Taemin

    2014-01-01

    This thesis is concerned with the development of techniques that facilitate the effective implementation of capable automatic chord transcription from music audio signals. Since chord transcriptions can capture many important aspects of music, they are useful for a wide variety of music applications and also useful for people who learn and perform…

  15. Automatic Cataloguing and Searching for Retrospective Data by Use of OCR Text.

    ERIC Educational Resources Information Center

    Tseng, Yuen-Hsien

    2001-01-01

    Describes efforts in supporting information retrieval from OCR (optical character recognition) degraded text. Reports on approaches used in an automatic cataloging and searching contest for books in multiple languages, including a vector space retrieval model, an n-gram indexing method, and a weighting scheme; and discusses problems of Asian…

  16. RFID: A Revolution in Automatic Data Recognition

    ERIC Educational Resources Information Center

    Deal, Walter F., III

    2004-01-01

    Radio frequency identification, or RFID, is a generic term for technologies that use radio waves to automatically identify people or objects. There are several methods of identification, but the most common is to store a serial number that identifies a person or object, and perhaps other information, on a microchip that is attached to an antenna…

  17. 38 CFR 51.31 - Automatic recognition.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ...) PER DIEM FOR NURSING HOME CARE OF VETERANS IN STATE HOMES Obtaining Per Diem for Nursing Home Care in... that already is recognized by VA as a State home for nursing home care at the time this part becomes effective, automatically will continue to be recognized as a State home for nursing home care but will be...

  18. 38 CFR 51.31 - Automatic recognition.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ...) PER DIEM FOR NURSING HOME CARE OF VETERANS IN STATE HOMES Obtaining Per Diem for Nursing Home Care in... that already is recognized by VA as a State home for nursing home care at the time this part becomes effective, automatically will continue to be recognized as a State home for nursing home care but will be...

  19. 38 CFR 51.31 - Automatic recognition.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ...) PER DIEM FOR NURSING HOME CARE OF VETERANS IN STATE HOMES Obtaining Per Diem for Nursing Home Care in... that already is recognized by VA as a State home for nursing home care at the time this part becomes effective, automatically will continue to be recognized as a State home for nursing home care but will be...

  20. 38 CFR 51.31 - Automatic recognition.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ...) PER DIEM FOR NURSING HOME CARE OF VETERANS IN STATE HOMES Obtaining Per Diem for Nursing Home Care in... that already is recognized by VA as a State home for nursing home care at the time this part becomes effective, automatically will continue to be recognized as a State home for nursing home care but will be...

  1. 38 CFR 51.31 - Automatic recognition.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ...) PER DIEM FOR NURSING HOME CARE OF VETERANS IN STATE HOMES Obtaining Per Diem for Nursing Home Care in... that already is recognized by VA as a State home for nursing home care at the time this part becomes effective, automatically will continue to be recognized as a State home for nursing home care but will be...

  2. Investigating Prompt Difficulty in an Automatically Scored Speaking Performance Assessment

    ERIC Educational Resources Information Center

    Cox, Troy L.

    2013-01-01

    Speaking assessments for second language learners have traditionally been expensive to administer because of the cost of rating the speech samples. To reduce the cost, many researchers are investigating the potential of using automatic speech recognition (ASR) as a means to score examinee responses to open-ended prompts. This study examined the…

  3. Psychopaths lack the automatic avoidance of social threat: relation to instrumental aggression.

    PubMed

    Louise von Borries, Anna Katinka; Volman, Inge; de Bruijn, Ellen Rosalia Aloïs; Bulten, Berend Hendrik; Verkes, Robbert Jan; Roelofs, Karin

    2012-12-30

    Psychopathy (PP) is associated with marked abnormalities in social emotional behaviour, such as high instrumental aggression (IA). A crucial but largely ignored question is whether automatic social approach-avoidance tendencies may underlie this condition. We tested whether offenders with PP show lack of automatic avoidance tendencies, usually activated when (healthy) individuals are confronted with social threat stimuli (angry faces). We applied a computerized approach-avoidance task (AAT), where participants pushed or pulled pictures of emotional faces using a joystick, upon which the faces decreased or increased in size, respectively. Furthermore, participants completed an emotion recognition task which was used to control for differences in recognition of facial emotions. In contrast to healthy controls (HC), PP patients showed total absence of avoidance tendencies towards angry faces. Interestingly, those responses were related to levels of instrumental aggression and the (in)ability to experience personal distress (PD). These findings suggest that social performance in psychopaths is disturbed on a basic level of automatic action tendencies. The lack of implicit threat avoidance tendencies may underlie their aggressive behaviour. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  4. Investigation of air transportation technology at Princeton University, 1985

    NASA Technical Reports Server (NTRS)

    Stengel, Robert F.

    1987-01-01

    The program proceeded along five avenues during 1985. Guidance and control strategies for penetration of microbursts and wind shear, application of artificial intelligence in flight control and air traffic control systems, the use of voice recognition in the cockpit, the effects of control saturation on closed-loop stability and response of open-loop unstable aircraft, and computer aided control system design are among the topics briefly considered. Areas of investigation relate to guidance and control of commercial transports as well as general aviation aircraft. Interaction between the flight crew and automatic systems is the subject of principal concern.

  5. LANDMARK-BASED SPEECH RECOGNITION: REPORT OF THE 2004 JOHNS HOPKINS SUMMER WORKSHOP.

    PubMed

    Hasegawa-Johnson, Mark; Baker, James; Borys, Sarah; Chen, Ken; Coogan, Emily; Greenberg, Steven; Juneja, Amit; Kirchhoff, Katrin; Livescu, Karen; Mohan, Srividya; Muller, Jennifer; Sonmez, Kemal; Wang, Tianyu

    2005-01-01

    Three research prototype speech recognition systems are described, all of which use recently developed methods from artificial intelligence (specifically support vector machines, dynamic Bayesian networks, and maximum entropy classification) in order to implement, in the form of an automatic speech recognizer, current theories of human speech perception and phonology (specifically landmark-based speech perception, nonlinear phonology, and articulatory phonology). All three systems begin with a high-dimensional multiframe acoustic-to-distinctive feature transformation, implemented using support vector machines trained to detect and classify acoustic phonetic landmarks. Distinctive feature probabilities estimated by the support vector machines are then integrated using one of three pronunciation models: a dynamic programming algorithm that assumes canonical pronunciation of each word, a dynamic Bayesian network implementation of articulatory phonology, or a discriminative pronunciation model trained using the methods of maximum entropy classification. Log probability scores computed by these models are then combined, using log-linear combination, with other word scores available in the lattice output of a first-pass recognizer, and the resulting combination score is used to compute a second-pass speech recognition output.

  6. An investigation into non-invasive physical activity recognition using smartphones.

    PubMed

    Kelly, Daniel; Caulfield, Brian

    2012-01-01

    Technology utilized to automatically monitor Activities of Daily Living (ADL) could be a key component in identifying deviations from normal functional profiles and providing feedback on interventions aimed at improving health. However, if activity recognition systems are to be implemented in real world scenarios such as health and wellness monitoring, the activity sensing modality must unobtrusively fit the human environment rather than forcing humans to adhere to sensor specific conditions. Modern smart phones represent a ubiquitous computing device which has already undergone mainstream adoption. In this paper, we investigate the feasibility of using a modern smartphone, with limited placement constraints, as the sensing modality for an activity recognition system. A dataset of 4 subjects performing 7 activities, using varying sensor placement conditions, is utilized to investigate this. Initial experiments show that a decision tree classifier performs activity classification with precision and recall scores of 0.75 and 0.73 respectively. More importantly, as part of this initial experiment, 3 main problems, and subsequently 3 solutions, relating to unconstrained sensor placement were identified. Using our proposed solutions, classification precision and recall scores were improved by +13% and +14.6% respectively.

  7. Study on road sign recognition in LabVIEW

    NASA Astrophysics Data System (ADS)

    Panoiu, M.; Rat, C. L.; Panoiu, C.

    2016-02-01

    Road and traffic sign identification is a field of study that can be used to aid the development of in-car advisory systems. It uses computer vision and artificial intelligence to extract the road signs from outdoor images acquired by a camera in uncontrolled lighting conditions where they may be occluded by other objects, or may suffer from problems such as color fading, disorientation, variations in shape and size, etc. An automatic means of identifying traffic signs, in these conditions, can make a significant contribution to develop an Intelligent Transport Systems (ITS) that continuously monitors the driver, the vehicle, and the road. Road and traffic signs are characterized by a number of features which make them recognizable from the environment. Road signs are located in standard positions and have standard shapes, standard colors, and known pictograms. These characteristics make them suitable for image identification. Traffic sign identification covers two problems: traffic sign detection and traffic sign recognition. Traffic sign detection is meant for the accurate localization of traffic signs in the image space, while traffic sign recognition handles the labeling of such detections into specific traffic sign types or subcategories [1].

  8. Speech recognition-based and automaticity programs to help students with severe reading and spelling problems.

    PubMed

    Higgins, Eleanor L; Raskind, Marshall H

    2004-12-01

    This study was conducted to assess the effectiveness of two programs developed by the Frostig Center Research Department to improve the reading and spelling of students with learning disabilities (LD): a computer Speech Recognition-based Program (SRBP) and a computer and text-based Automaticity Program (AP). Twenty-eight LD students with reading and spelling difficulties (aged 8 to 18) received each program for 17 weeks and were compared with 16 students in a contrast group who did not receive either program. After adjusting for age and IQ, both the SRBP and AP groups showed significant differences over the contrast group in improving word recognition and reading comprehension. Neither program showed significant differences over contrasts in spelling. The SRBP also improved the performance of the target group when compared with the contrast group on phonological elision and nonword reading efficiency tasks. The AP showed significant differences in all process and reading efficiency measures.

  9. The Automaticity of Emotional Face-Context Integration

    PubMed Central

    Aviezer, Hillel; Dudarev, Veronica; Bentin, Shlomo; Hassin, Ran R.

    2011-01-01

    Recent studies have demonstrated that context can dramatically influence the recognition of basic facial expressions, yet the nature of this phenomenon is largely unknown. In the present paper we begin to characterize the underlying process of face-context integration. Specifically, we examine whether it is a relatively controlled or automatic process. In Experiment 1 participants were motivated and instructed to avoid using the context while categorizing contextualized facial expression, or they were led to believe that the context was irrelevant. Nevertheless, they were unable to disregard the context, which exerted a strong effect on their emotion recognition. In Experiment 2, participants categorized contextualized facial expressions while engaged in a concurrent working memory task. Despite the load, the context exerted a strong influence on their recognition of facial expressions. These results suggest that facial expressions and their body contexts are integrated in an unintentional, uncontrollable, and relatively effortless manner. PMID:21707150

  10. ATR applications of minimax entropy models of texture and shape

    NASA Astrophysics Data System (ADS)

    Zhu, Song-Chun; Yuille, Alan L.; Lanterman, Aaron D.

    2001-10-01

    Concepts from information theory have recently found favor in both the mainstream computer vision community and the military automatic target recognition community. In the computer vision literature, the principles of minimax entropy learning theory have been used to generate rich probabilitistic models of texture and shape. In addition, the method of types and large deviation theory has permitted the difficulty of various texture and shape recognition tasks to be characterized by 'order parameters' that determine how fundamentally vexing a task is, independent of the particular algorithm used. These information-theoretic techniques have been demonstrated using traditional visual imagery in applications such as simulating cheetah skin textures and such as finding roads in aerial imagery. We discuss their application to problems in the specific application domain of automatic target recognition using infrared imagery. We also review recent theoretical and algorithmic developments which permit learning minimax entropy texture models for infrared textures in reasonable timeframes.

  11. A Horizontal Tilt Correction Method for Ship License Numbers Recognition

    NASA Astrophysics Data System (ADS)

    Liu, Baolong; Zhang, Sanyuan; Hong, Zhenjie; Ye, Xiuzi

    2018-02-01

    An automatic ship license numbers (SLNs) recognition system plays a significant role in intelligent waterway transportation systems since it can be used to identify ships by recognizing the characters in SLNs. Tilt occurs frequently in many SLNs because the monitors and the ships usually have great vertical or horizontal angles, which decreases the accuracy and robustness of a SLNs recognition system significantly. In this paper, we present a horizontal tilt correction method for SLNs. For an input tilt SLN image, the proposed method accomplishes the correction task through three main steps. First, a MSER-based characters’ center-points computation algorithm is designed to compute the accurate center-points of the characters contained in the input SLN image. Second, a L 1- L 2 distance-based straight line is fitted to the computed center-points using M-estimator algorithm. The tilt angle is estimated at this stage. Finally, based on the computed tilt angle, an affine transformation rotation is conducted to rotate and to correct the input SLN horizontally. At last, the proposed method is tested on 200 tilt SLN images, the proposed method is proved to be effective with a tilt correction rate of 80.5%.

  12. Application of quantum-behaved particle swarm optimization to motor imagery EEG classification.

    PubMed

    Hsu, Wei-Yen

    2013-12-01

    In this study, we propose a recognition system for single-trial analysis of motor imagery (MI) electroencephalogram (EEG) data. Applying event-related brain potential (ERP) data acquired from the sensorimotor cortices, the system chiefly consists of automatic artifact elimination, feature extraction, feature selection and classification. In addition to the use of independent component analysis, a similarity measure is proposed to further remove the electrooculographic (EOG) artifacts automatically. Several potential features, such as wavelet-fractal features, are then extracted for subsequent classification. Next, quantum-behaved particle swarm optimization (QPSO) is used to select features from the feature combination. Finally, selected sub-features are classified by support vector machine (SVM). Compared with without artifact elimination, feature selection using a genetic algorithm (GA) and feature classification with Fisher's linear discriminant (FLD) on MI data from two data sets for eight subjects, the results indicate that the proposed method is promising in brain-computer interface (BCI) applications.

  13. [Study on the automatic parameters identification of water pipe network model].

    PubMed

    Jia, Hai-Feng; Zhao, Qi-Feng

    2010-01-01

    Based on the problems analysis on development and application of water pipe network model, the model parameters automatic identification is regarded as a kernel bottleneck of model's application in water supply enterprise. The methodology of water pipe network model parameters automatic identification based on GIS and SCADA database is proposed. Then the kernel algorithm of model parameters automatic identification is studied, RSA (Regionalized Sensitivity Analysis) is used for automatic recognition of sensitive parameters, and MCS (Monte-Carlo Sampling) is used for automatic identification of parameters, the detail technical route based on RSA and MCS is presented. The module of water pipe network model parameters automatic identification is developed. At last, selected a typical water pipe network as a case, the case study on water pipe network model parameters automatic identification is conducted and the satisfied results are achieved.

  14. Gender-specific automatic valence recognition of affective olfactory stimulation through the analysis of the electrodermal activity.

    PubMed

    Greco, Alberto; Lanata, Antonio; Valenza, Gaetano; Di Francesco, Fabio; Scilingo, Enzo Pasquale

    2016-08-01

    This study reports on the development of a gender-specific classification system able to discern between two valence levels of smell, through information gathered from electrodermal activity (EDA) dynamics. Specifically, two odorants were administered to 32 healthy volunteers (16 males) while monitoring EDA. CvxEDA model was used to process the EDA signal and extract features from both tonic and phasic components. The feature set was used as input to a K-NN classifier implementing a leave-one-subject-out procedure. Results show strong differences in the accuracy of valence recognition between men (62.5%) and women (78%). We can conclude that affective olfactory stimulation significantly affect EDA dynamics with a highly specific gender dependency.

  15. A distinguishing method of printed and handwritten legal amount on Chinese bank check

    NASA Astrophysics Data System (ADS)

    Zhu, Ningbo; Lou, Zhen; Yang, Jingyu

    2003-09-01

    While carrying out Optical Chinese Character Recognition, distinguishing the font between printed and handwritten characters at the early phase is necessary, because there is so much difference between the methods on recognizing these two types of characters. In this paper, we proposed a good method on how to banish seals and its relative standards that can judge whether they should be banished. Meanwhile, an approach on clearing up scattered noise shivers after image segmentation is presented. Four sets of classifying features that show discrimination between printed and handwritten characters are well adopted. The proposed approach was applied to an automatic check processing system and tested on about 9031 checks. The recognition rate is more than 99.5%.

  16. Detection of artery interfaces: a real-time system and its clinical applications

    NASA Astrophysics Data System (ADS)

    Faita, Francesco; Gemignani, Vincenzo; Bianchini, Elisabetta; Giannarelli, Chiara; Ghiadoni, Lorenzo; Demi, Marcello

    2008-03-01

    Analyzing the artery mechanics is a crucial issue because of its close relationship with several cardiovascular risk factors, such as hypertension and diabetes. Moreover, most of the work can be carried out by analyzing image sequences obtained with ultrasounds, that is with a non-invasive technique which allows a real-time visualization of the observed structures. For this reason, therefore, an accurate temporal localization of the main vessel interfaces becomes a central task for which the manual approach should be avoided since such a method is rather unreliable and time consuming. Real-time automatic systems are advantageously used to automatically locate the arterial interfaces. The automatic measurement reduces the inter/intra-observer variability with respect to the manual measurement which unavoidably depends on the experience of the operator. The real-time visual feedback, moreover, guides physicians when looking for the best position of the ultrasound probe, thus increasing the global robustness of the system. The automatic system which we developed is a stand-alone video processing system which acquires the analog video signal from the ultrasound equipment, performs all the measurements and shows the results in real-time. The localization algorithm of the artery tunics is based on a new mathematical operator (the first order absolute moment) and on a pattern recognition approach. Various clinical applications have been developed on board and validated through a comparison with gold-standard techniques: the assessment of intima-media thickness, the arterial distension, the flow-mediated dilation and the pulse wave velocity. With this paper, the results obtained on clinical trials are presented.

  17. Thalamic volume and related visual recognition are associated with freezing of gait in non-demented patients with Parkinson's disease.

    PubMed

    Sunwoo, Mun Kyung; Cho, Kyoo H; Hong, Jin Yong; Lee, Ji E; Sohn, Young H; Lee, Phil Hyu

    2013-12-01

    The pathophysiology of freezing of gait (FOG) in non-demented Parkinson's disease (PD) patients remains poorly understood. Recent studies have suggested that neurochemical alterations in the cholinergic systems play a role in the development of FOG. Here, we evaluated the association between subcortical cholinergic structures and FOG in patients with non-demented PD. We recruited 46 non-demented patients with PD, categorized into PD with (n = 16) and without FOG (n = 30) groups. We performed neuropsychological test, region-of-interest-based volumetric analysis of the substantia innominata (SI) and automatic analysis of subcortical brain structures using a computerized segmentation procedure. The comprehensive neuropsychological assessment showed that PD patients with FOG had lower cognitive performance in the frontal executive and visual-related functions compared with those without freezing of gait. The normalized SI volume did not differ significantly between the two groups (1.65 ± 0.18 vs. 1.68 ± 0.31). The automatic analysis of subcortical structures revealed that the thalamic volumes were significantly reduced in PD patients with FOG compared with those without FOG after adjusting for age, sex, disease duration, the Unified PD Rating Scale scores and total intracranial volume (left: 6.71 vs. 7.16 cm3, p = 0.029, right: 6.47 vs. 6.91 cm3, p = 0.026). Multiple linear regression analysis revealed that thalamic volume showed significant positive correlations with visual recognition memory (left: β = 0.441, p = 0.037, right: β = 0.498, p = 0.04). These data suggest that thalamic volume and related visual recognition, rather than the cortical cholinergic system arising from the SI, may be a major contributor to the development of freezing of gait in non-demented patients with PD. Copyright © 2013. Published by Elsevier Ltd.

  18. Vision-Based Finger Detection, Tracking, and Event Identification Techniques for Multi-Touch Sensing and Display Systems

    PubMed Central

    Chen, Yen-Lin; Liang, Wen-Yew; Chiang, Chuan-Yen; Hsieh, Tung-Ju; Lee, Da-Cheng; Yuan, Shyan-Ming; Chang, Yang-Lang

    2011-01-01

    This study presents efficient vision-based finger detection, tracking, and event identification techniques and a low-cost hardware framework for multi-touch sensing and display applications. The proposed approach uses a fast bright-blob segmentation process based on automatic multilevel histogram thresholding to extract the pixels of touch blobs obtained from scattered infrared lights captured by a video camera. The advantage of this automatic multilevel thresholding approach is its robustness and adaptability when dealing with various ambient lighting conditions and spurious infrared noises. To extract the connected components of these touch blobs, a connected-component analysis procedure is applied to the bright pixels acquired by the previous stage. After extracting the touch blobs from each of the captured image frames, a blob tracking and event recognition process analyzes the spatial and temporal information of these touch blobs from consecutive frames to determine the possible touch events and actions performed by users. This process also refines the detection results and corrects for errors and occlusions caused by noise and errors during the blob extraction process. The proposed blob tracking and touch event recognition process includes two phases. First, the phase of blob tracking associates the motion correspondence of blobs in succeeding frames by analyzing their spatial and temporal features. The touch event recognition process can identify meaningful touch events based on the motion information of touch blobs, such as finger moving, rotating, pressing, hovering, and clicking actions. Experimental results demonstrate that the proposed vision-based finger detection, tracking, and event identification system is feasible and effective for multi-touch sensing applications in various operational environments and conditions. PMID:22163990

  19. Expert system for automatically correcting OCR output

    NASA Astrophysics Data System (ADS)

    Taghva, Kazem; Borsack, Julie; Condit, Allen

    1994-03-01

    This paper describes a new expert system for automatically correcting errors made by optical character recognition (OCR) devices. The system, which we call the post-processing system, is designed to improve the quality of text produced by an OCR device in preparation for subsequent retrieval from an information system. The system is composed of numerous parts: an information retrieval system, an English dictionary, a domain-specific dictionary, and a collection of algorithms and heuristics designed to correct as many OCR errors as possible. For the remaining errors that cannot be corrected, the system passes them on to a user-level editing program. This post-processing system can be viewed as part of a larger system that would streamline the steps of taking a document from its hard copy form to its usable electronic form, or it can be considered a stand alone system for OCR error correction. An earlier version of this system has been used to process approximately 10,000 pages of OCR generated text. Among the OCR errors discovered by this version, about 87% were corrected. We implement numerous new parts of the system, test this new version, and present the results.

  20. Text recognition and correction for automated data collection by mobile devices

    NASA Astrophysics Data System (ADS)

    Ozarslan, Suleyman; Eren, P. Erhan

    2014-03-01

    Participatory sensing is an approach which allows mobile devices such as mobile phones to be used for data collection, analysis and sharing processes by individuals. Data collection is the first and most important part of a participatory sensing system, but it is time consuming for the participants. In this paper, we discuss automatic data collection approaches for reducing the time required for collection, and increasing the amount of collected data. In this context, we explore automated text recognition on images of store receipts which are captured by mobile phone cameras, and the correction of the recognized text. Accordingly, our first goal is to evaluate the performance of the Optical Character Recognition (OCR) method with respect to data collection from store receipt images. Images captured by mobile phones exhibit some typical problems, and common image processing methods cannot handle some of them. Consequently, the second goal is to address these types of problems through our proposed Knowledge Based Correction (KBC) method used in support of the OCR, and also to evaluate the KBC method with respect to the improvement on the accurate recognition rate. Results of the experiments show that the KBC method improves the accurate data recognition rate noticeably.

  1. Transcribe Your Class: Using Speech Recognition to Improve Access for At-Risk Students

    ERIC Educational Resources Information Center

    Bain, Keith; Lund-Lucas, Eunice; Stevens, Janice

    2012-01-01

    Through a project supported by Canada's Social Development Partnerships Program, a team of leading National Disability Organizations, universities, and industry partners are piloting a prototype Hosted Transcription Service that uses speech recognition to automatically create multimedia transcripts that can be used by students for study purposes.…

  2. Speech Recognition Software for Language Learning: Toward an Evaluation of Validity and Student Perceptions

    ERIC Educational Resources Information Center

    Cordier, Deborah

    2009-01-01

    A renewed focus on foreign language (FL) learning and speech for communication has resulted in computer-assisted language learning (CALL) software developed with Automatic Speech Recognition (ASR). ASR features for FL pronunciation (Lafford, 2004) are functional components of CALL designs used for FL teaching and learning. The ASR features…

  3. Cortical Reorganization in Dyslexic Children after Phonological Training: Evidence from Early Evoked Potentials

    ERIC Educational Resources Information Center

    Spironelli, Chiara; Penolazzi, Barbara; Vio, Claudio; Angrilli, Alessandro

    2010-01-01

    Brain plasticity was investigated in 14 Italian children affected by developmental dyslexia after 6 months of phonological training. The means used to measure language reorganization was the recognition potential, an early wave, also called N150, elicited by automatic word recognition. This component peaks over the left temporo-occipital cortex…

  4. Does the cost function matter in Bayes decision rule?

    PubMed

    Schlü ter, Ralf; Nussbaum-Thom, Markus; Ney, Hermann

    2012-02-01

    In many tasks in pattern recognition, such as automatic speech recognition (ASR), optical character recognition (OCR), part-of-speech (POS) tagging, and other string recognition tasks, we are faced with a well-known inconsistency: The Bayes decision rule is usually used to minimize string (symbol sequence) error, whereas, in practice, we want to minimize symbol (word, character, tag, etc.) error. When comparing different recognition systems, we do indeed use symbol error rate as an evaluation measure. The topic of this work is to analyze the relation between string (i.e., 0-1) and symbol error (i.e., metric, integer valued) cost functions in the Bayes decision rule, for which fundamental analytic results are derived. Simple conditions are derived for which the Bayes decision rule with integer-valued metric cost function and with 0-1 cost gives the same decisions or leads to classes with limited cost. The corresponding conditions can be tested with complexity linear in the number of classes. The results obtained do not make any assumption w.r.t. the structure of the underlying distributions or the classification problem. Nevertheless, the general analytic results are analyzed via simulations of string recognition problems with Levenshtein (edit) distance cost function. The results support earlier findings that considerable improvements are to be expected when initial error rates are high.

  5. Assessing the performance of a covert automatic target recognition algorithm

    NASA Astrophysics Data System (ADS)

    Ehrman, Lisa M.; Lanterman, Aaron D.

    2005-05-01

    Passive radar systems exploit illuminators of opportunity, such as TV and FM radio, to illuminate potential targets. Doing so allows them to operate covertly and inexpensively. Our research seeks to enhance passive radar systems by adding automatic target recognition (ATR) capabilities. In previous papers we proposed conducting ATR by comparing the radar cross section (RCS) of aircraft detected by a passive radar system to the precomputed RCS of aircraft in the target class. To effectively model the low-frequency setting, the comparison is made via a Rician likelihood model. Monte Carlo simulations indicate that the approach is viable. This paper builds on that work by developing a method for quickly assessing the potential performance of the ATR algorithm without using exhaustive Monte Carlo trials. This method exploits the relation between the probability of error in a binary hypothesis test under the Bayesian framework to the Chernoff information. Since the data are well-modeled as Rician, we begin by deriving a closed-form approximation for the Chernoff information between two Rician densities. This leads to an approximation for the probability of error in the classification algorithm that is a function of the number of available measurements. We conclude with an application that would be particularly cumbersome to accomplish via Monte Carlo trials, but that can be quickly addressed using the Chernoff information approach. This application evaluates the length of time that an aircraft must be tracked before the probability of error in the ATR algorithm drops below a desired threshold.

  6. Assessment of voice, speech, and related quality of life in advanced head and neck cancer patients 10-years+ after chemoradiotherapy.

    PubMed

    Kraaijenga, S A C; Oskam, I M; van Son, R J J H; Hamming-Vrieze, O; Hilgers, F J M; van den Brekel, M W M; van der Molen, L

    2016-04-01

    Assessment of long-term objective and subjective voice, speech, articulation, and quality of life in patients with head and neck cancer (HNC) treated with concurrent chemoradiotherapy (CRT) for advanced, stage IV disease. Twenty-two disease-free survivors, treated with cisplatin-based CRT for inoperable HNC (1999-2004), were evaluated at 10-years post-treatment. A standard Dutch text was recorded. Perceptual analysis of voice, speech, and articulation was conducted by two expert listeners (SLPs). Also an experimental expert system based on automatic speech recognition was used. Patients' perception of voice and speech and related quality of life was assessed with the Voice Handicap Index (VHI) and Speech Handicap Index (SHI) questionnaires. At a median follow-up of 11-years, perceptual evaluation showed abnormal scores in up to 64% of cases, depending on the outcome parameter analyzed. Automatic assessment of voice and speech parameters correlated moderate to strong with perceptual outcome scores. Patient-reported problems with voice (VHI>15) and speech (SHI>6) in daily life were present in 68% and 77% of patients, respectively. Patients treated with IMRT showed significantly less impairment compared to those treated with conventional radiotherapy. More than 10-years after organ-preservation treatment, voice and speech problems are common in this patient cohort, as assessed with perceptual evaluation, automatic speech recognition, and with validated structured questionnaires. There were fewer complaints in patients treated with IMRT than with conventional radiotherapy. Copyright © 2016 Elsevier Ltd. All rights reserved.

  7. Basic level scene understanding: categories, attributes and structures

    PubMed Central

    Xiao, Jianxiong; Hays, James; Russell, Bryan C.; Patterson, Genevieve; Ehinger, Krista A.; Torralba, Antonio; Oliva, Aude

    2013-01-01

    A longstanding goal of computer vision is to build a system that can automatically understand a 3D scene from a single image. This requires extracting semantic concepts and 3D information from 2D images which can depict an enormous variety of environments that comprise our visual world. This paper summarizes our recent efforts toward these goals. First, we describe the richly annotated SUN database which is a collection of annotated images spanning 908 different scene categories with object, attribute, and geometric labels for many scenes. This database allows us to systematically study the space of scenes and to establish a benchmark for scene and object recognition. We augment the categorical SUN database with 102 scene attributes for every image and explore attribute recognition. Finally, we present an integrated system to extract the 3D structure of the scene and objects depicted in an image. PMID:24009590

  8. Feature Extraction and Selection Strategies for Automated Target Recognition

    NASA Technical Reports Server (NTRS)

    Greene, W. Nicholas; Zhang, Yuhan; Lu, Thomas T.; Chao, Tien-Hsin

    2010-01-01

    Several feature extraction and selection methods for an existing automatic target recognition (ATR) system using JPLs Grayscale Optical Correlator (GOC) and Optimal Trade-Off Maximum Average Correlation Height (OT-MACH) filter were tested using MATLAB. The ATR system is composed of three stages: a cursory region of-interest (ROI) search using the GOC and OT-MACH filter, a feature extraction and selection stage, and a final classification stage. Feature extraction and selection concerns transforming potential target data into more useful forms as well as selecting important subsets of that data which may aide in detection and classification. The strategies tested were built around two popular extraction methods: Principal Component Analysis (PCA) and Independent Component Analysis (ICA). Performance was measured based on the classification accuracy and free-response receiver operating characteristic (FROC) output of a support vector machine(SVM) and a neural net (NN) classifier.

  9. Feature extraction and selection strategies for automated target recognition

    NASA Astrophysics Data System (ADS)

    Greene, W. Nicholas; Zhang, Yuhan; Lu, Thomas T.; Chao, Tien-Hsin

    2010-04-01

    Several feature extraction and selection methods for an existing automatic target recognition (ATR) system using JPLs Grayscale Optical Correlator (GOC) and Optimal Trade-Off Maximum Average Correlation Height (OT-MACH) filter were tested using MATLAB. The ATR system is composed of three stages: a cursory regionof- interest (ROI) search using the GOC and OT-MACH filter, a feature extraction and selection stage, and a final classification stage. Feature extraction and selection concerns transforming potential target data into more useful forms as well as selecting important subsets of that data which may aide in detection and classification. The strategies tested were built around two popular extraction methods: Principal Component Analysis (PCA) and Independent Component Analysis (ICA). Performance was measured based on the classification accuracy and free-response receiver operating characteristic (FROC) output of a support vector machine(SVM) and a neural net (NN) classifier.

  10. Multiple-scanning-probe tunneling microscope with nanoscale positional recognition function.

    PubMed

    Higuchi, Seiji; Kuramochi, Hiromi; Laurent, Olivier; Komatsubara, Takashi; Machida, Shinichi; Aono, Masakazu; Obori, Kenichi; Nakayama, Tomonobu

    2010-07-01

    Over the past decade, multiple-scanning-probe microscope systems with independently controlled probes have been developed for nanoscale electrical measurements. We developed a quadruple-scanning-probe tunneling microscope (QSPTM) that can determine and control the probe position through scanning-probe imaging. The difficulty of operating multiple probes with submicrometer precision drastically increases with the number of probes. To solve problems such as determining the relative positions of the probes and avoiding of contact between the probes, we adopted sample-scanning methods to obtain four images simultaneously and developed an original control system for QSPTM operation with a function of automatic positional recognition. These improvements make the QSPTM a more practical and useful instrument since four images can now be reliably produced, and consequently the positioning of the four probes becomes easier owing to the reduced chance of accidental contact between the probes.

  11. Potential fault region detection in TFDS images based on convolutional neural network

    NASA Astrophysics Data System (ADS)

    Sun, Junhua; Xiao, Zhongwen

    2016-10-01

    In recent years, more than 300 sets of Trouble of Running Freight Train Detection System (TFDS) have been installed on railway to monitor the safety of running freight trains in China. However, TFDS is simply responsible for capturing, transmitting, and storing images, and fails to recognize faults automatically due to some difficulties such as such as the diversity and complexity of faults and some low quality images. To improve the performance of automatic fault recognition, it is of great importance to locate the potential fault areas. In this paper, we first introduce a convolutional neural network (CNN) model to TFDS and propose a potential fault region detection system (PFRDS) for simultaneously detecting four typical types of potential fault regions (PFRs). The experimental results show that this system has a higher performance of image detection to PFRs in TFDS. An average detection recall of 98.95% and precision of 100% are obtained, demonstrating the high detection ability and robustness against various poor imaging situations.

  12. Multimodal fusion of polynomial classifiers for automatic person recgonition

    NASA Astrophysics Data System (ADS)

    Broun, Charles C.; Zhang, Xiaozheng

    2001-03-01

    With the prevalence of the information age, privacy and personalization are forefront in today's society. As such, biometrics are viewed as essential components of current evolving technological systems. Consumers demand unobtrusive and non-invasive approaches. In our previous work, we have demonstrated a speaker verification system that meets these criteria. However, there are additional constraints for fielded systems. The required recognition transactions are often performed in adverse environments and across diverse populations, necessitating robust solutions. There are two significant problem areas in current generation speaker verification systems. The first is the difficulty in acquiring clean audio signals in all environments without encumbering the user with a head- mounted close-talking microphone. Second, unimodal biometric systems do not work with a significant percentage of the population. To combat these issues, multimodal techniques are being investigated to improve system robustness to environmental conditions, as well as improve overall accuracy across the population. We propose a multi modal approach that builds on our current state-of-the-art speaker verification technology. In order to maintain the transparent nature of the speech interface, we focus on optical sensing technology to provide the additional modality-giving us an audio-visual person recognition system. For the audio domain, we use our existing speaker verification system. For the visual domain, we focus on lip motion. This is chosen, rather than static face or iris recognition, because it provides dynamic information about the individual. In addition, the lip dynamics can aid speech recognition to provide liveness testing. The visual processing method makes use of both color and edge information, combined within Markov random field MRF framework, to localize the lips. Geometric features are extracted and input to a polynomial classifier for the person recognition process. A late integration approach, based on a probabilistic model, is employed to combine the two modalities. The system is tested on the XM2VTS database combined with AWGN in the audio domain over a range of signal-to-noise ratios.

  13. Clustering-Based Ensemble Learning for Activity Recognition in Smart Homes

    PubMed Central

    Jurek, Anna; Nugent, Chris; Bi, Yaxin; Wu, Shengli

    2014-01-01

    Application of sensor-based technology within activity monitoring systems is becoming a popular technique within the smart environment paradigm. Nevertheless, the use of such an approach generates complex constructs of data, which subsequently requires the use of intricate activity recognition techniques to automatically infer the underlying activity. This paper explores a cluster-based ensemble method as a new solution for the purposes of activity recognition within smart environments. With this approach activities are modelled as collections of clusters built on different subsets of features. A classification process is performed by assigning a new instance to its closest cluster from each collection. Two different sensor data representations have been investigated, namely numeric and binary. Following the evaluation of the proposed methodology it has been demonstrated that the cluster-based ensemble method can be successfully applied as a viable option for activity recognition. Results following exposure to data collected from a range of activities indicated that the ensemble method had the ability to perform with accuracies of 94.2% and 97.5% for numeric and binary data, respectively. These results outperformed a range of single classifiers considered as benchmarks. PMID:25014095

  14. Novel dynamic Bayesian networks for facial action element recognition and understanding

    NASA Astrophysics Data System (ADS)

    Zhao, Wei; Park, Jeong-Seon; Choi, Dong-You; Lee, Sang-Woong

    2011-12-01

    In daily life, language is an important tool of communication between people. Besides language, facial action can also provide a great amount of information. Therefore, facial action recognition has become a popular research topic in the field of human-computer interaction (HCI). However, facial action recognition is quite a challenging task due to its complexity. In a literal sense, there are thousands of facial muscular movements, many of which have very subtle differences. Moreover, muscular movements always occur simultaneously when the pose is changed. To address this problem, we first build a fully automatic facial points detection system based on a local Gabor filter bank and principal component analysis. Then, novel dynamic Bayesian networks are proposed to perform facial action recognition using the junction tree algorithm over a limited number of feature points. In order to evaluate the proposed method, we have used the Korean face database for model training. For testing, we used the CUbiC FacePix, facial expressions and emotion database, Japanese female facial expression database, and our own database. Our experimental results clearly demonstrate the feasibility of the proposed approach.

  15. Clustering-based ensemble learning for activity recognition in smart homes.

    PubMed

    Jurek, Anna; Nugent, Chris; Bi, Yaxin; Wu, Shengli

    2014-07-10

    Application of sensor-based technology within activity monitoring systems is becoming a popular technique within the smart environment paradigm. Nevertheless, the use of such an approach generates complex constructs of data, which subsequently requires the use of intricate activity recognition techniques to automatically infer the underlying activity. This paper explores a cluster-based ensemble method as a new solution for the purposes of activity recognition within smart environments. With this approach activities are modelled as collections of clusters built on different subsets of features. A classification process is performed by assigning a new instance to its closest cluster from each collection. Two different sensor data representations have been investigated, namely numeric and binary. Following the evaluation of the proposed methodology it has been demonstrated that the cluster-based ensemble method can be successfully applied as a viable option for activity recognition. Results following exposure to data collected from a range of activities indicated that the ensemble method had the ability to perform with accuracies of 94.2% and 97.5% for numeric and binary data, respectively. These results outperformed a range of single classifiers considered as benchmarks.

  16. Generation, recognition, and consistent fusion of partial boundary representations from range images

    NASA Astrophysics Data System (ADS)

    Kohlhepp, Peter; Hanczak, Andrzej M.; Li, Gang

    1994-10-01

    This paper presents SOMBRERO, a new system for recognizing and locating 3D, rigid, non- moving objects from range data. The objects may be polyhedral or curved, partially occluding, touching or lying flush with each other. For data collection, we employ 2D time- of-flight laser scanners mounted to a moving gantry robot. By combining sensor and robot coordinates, we obtain 3D cartesian coordinates. Boundary representations (Brep's) provide view independent geometry models that are both efficiently recognizable and derivable automatically from sensor data. SOMBRERO's methods for generating, matching and fusing Brep's are highly synergetic. A split-and-merge segmentation algorithm with dynamic triangular builds a partial (21/2D) Brep from scattered data. The recognition module matches this scene description with a model database and outputs recognized objects, their positions and orientations, and possibly surfaces corresponding to unknown objects. We present preliminary results in scene segmentation and recognition. Partial Brep's corresponding to different range sensors or viewpoints can be merged into a consistent, complete and irredundant 3D object or scene model. This fusion algorithm itself uses the recognition and segmentation methods.

  17. Detection, recognition, identification, and tracking of military vehicles using biomimetic intelligence

    NASA Astrophysics Data System (ADS)

    Pace, Paul W.; Sutherland, John

    2001-10-01

    This project is aimed at analyzing EO/IR images to provide automatic target detection/recognition/identification (ATR/D/I) of militarily relevant land targets. An increase in performance was accomplished using a biomimetic intelligence system functioning on low-cost, commercially available processing chips. Biomimetic intelligence has demonstrated advanced capabilities in the areas of hand- printed character recognition, real-time detection/identification of multiple faces in full 3D perspectives in cluttered environments, advanced capabilities in classification of ground-based military vehicles from SAR, and real-time ATR/D/I of ground-based military vehicles from EO/IR/HRR data in cluttered environments. The investigation applied these tools to real data sets and examined the parameters such as the minimum resolution for target recognition, the effect of target size, rotation, line-of-sight changes, contrast, partial obscuring, background clutter etc. The results demonstrated a real-time ATR/D/I capability against a subset of militarily relevant land targets operating in a realistic scenario. Typical results on the initial EO/IR data indicate probabilities of correct classification of resolved targets to be greater than 95 percent.

  18. End-to-End Multimodal Emotion Recognition Using Deep Neural Networks

    NASA Astrophysics Data System (ADS)

    Tzirakis, Panagiotis; Trigeorgis, George; Nicolaou, Mihalis A.; Schuller, Bjorn W.; Zafeiriou, Stefanos

    2017-12-01

    Automatic affect recognition is a challenging task due to the various modalities emotions can be expressed with. Applications can be found in many domains including multimedia retrieval and human computer interaction. In recent years, deep neural networks have been used with great success in determining emotional states. Inspired by this success, we propose an emotion recognition system using auditory and visual modalities. To capture the emotional content for various styles of speaking, robust features need to be extracted. To this purpose, we utilize a Convolutional Neural Network (CNN) to extract features from the speech, while for the visual modality a deep residual network (ResNet) of 50 layers. In addition to the importance of feature extraction, a machine learning algorithm needs also to be insensitive to outliers while being able to model the context. To tackle this problem, Long Short-Term Memory (LSTM) networks are utilized. The system is then trained in an end-to-end fashion where - by also taking advantage of the correlations of the each of the streams - we manage to significantly outperform the traditional approaches based on auditory and visual handcrafted features for the prediction of spontaneous and natural emotions on the RECOLA database of the AVEC 2016 research challenge on emotion recognition.

  19. DBSCAN-based ROI extracted from SAR images and the discrimination of multi-feature ROI

    NASA Astrophysics Data System (ADS)

    He, Xin Yi; Zhao, Bo; Tan, Shu Run; Zhou, Xiao Yang; Jiang, Zhong Jin; Cui, Tie Jun

    2009-10-01

    The purpose of the paper is to extract the region of interest (ROI) from the coarse detected synthetic aperture radar (SAR) images and discriminate if the ROI contains a target or not, so as to eliminate the false alarm, and prepare for the target recognition. The automatic target clustering is one of the most difficult tasks in the SAR-image automatic target recognition system. The density-based spatial clustering of applications with noise (DBSCAN) relies on a density-based notion of clusters which is designed to discover clusters of arbitrary shape. DBSCAN was first used in the SAR image processing, which has many excellent features: only two insensitivity parameters (radius of neighborhood and minimum number of points) are needed; clusters of arbitrary shapes which fit in with the coarse detected SAR images can be discovered; and the calculation time and memory can be reduced. In the multi-feature ROI discrimination scheme, we extract several target features which contain the geometry features such as the area discriminator and Radon-transform based target profile discriminator, the distribution characteristics such as the EFF discriminator, and the EM scattering property such as the PPR discriminator. The synthesized judgment effectively eliminates the false alarms.

  20. Classifying dysmorphic syndromes by using artificial neural network based hierarchical decision tree.

    PubMed

    Özdemir, Merve Erkınay; Telatar, Ziya; Eroğul, Osman; Tunca, Yusuf

    2018-05-01

    Dysmorphic syndromes have different facial malformations. These malformations are significant to an early diagnosis of dysmorphic syndromes and contain distinctive information for face recognition. In this study we define the certain features of each syndrome by considering facial malformations and classify Fragile X, Hurler, Prader Willi, Down, Wolf Hirschhorn syndromes and healthy groups automatically. The reference points are marked on the face images and ratios between the points' distances are taken into consideration as features. We suggest a neural network based hierarchical decision tree structure in order to classify the syndrome types. We also implement k-nearest neighbor (k-NN) and artificial neural network (ANN) classifiers to compare classification accuracy with our hierarchical decision tree. The classification accuracy is 50, 73 and 86.7% with k-NN, ANN and hierarchical decision tree methods, respectively. Then, the same images are shown to a clinical expert who achieve a recognition rate of 46.7%. We develop an efficient system to recognize different syndrome types automatically in a simple, non-invasive imaging data, which is independent from the patient's age, sex and race at high accuracy. The promising results indicate that our method can be used for pre-diagnosis of the dysmorphic syndromes by clinical experts.

Top