The MITLL NIST LRE 2015 Language Recognition System
2016-05-06
The MITLL NIST LRE 2015 Language Recognition System Pedro Torres-Carrasquillo, Najim Dehak*, Elizabeth Godoy, Douglas Reynolds, Fred Richardson...most recent MIT Lincoln Laboratory language recognition system developed for the NIST 2015 Language Recognition Evaluation (LRE). The submission...Task The National Institute of Standards and Technology (NIST) has conducted formal evaluations of language detection algorithms since 1994. In
The MITLL NIST LRE 2015 Language Recognition system
2016-02-05
The MITLL NIST LRE 2015 Language Recognition System Pedro Torres-Carrasquillo, Najim Dehak*, Elizabeth Godoy, Douglas Reynolds, Fred Richardson...recent MIT Lincoln Laboratory language recognition system developed for the NIST 2015 Language Recognition Evaluation (LRE). The submission features a...National Institute of Standards and Technology (NIST) has conducted formal evaluations of language detection algorithms since 1994. In previous
Lozano-Diez, Alicia; Zazo, Ruben; Toledano, Doroteo T; Gonzalez-Rodriguez, Joaquin
2017-01-01
Language recognition systems based on bottleneck features have recently become the state of the art in this research field, showing their success in the last Language Recognition Evaluation (LRE 2015) organized by NIST (U.S. National Institute of Standards and Technology). This type of system is based on a deep neural network (DNN) trained to discriminate between phonetic units, i.e. trained for the task of automatic speech recognition (ASR). This DNN aims to compress information in one of its layers, known as the bottleneck (BN) layer, which is used to obtain a new frame-level representation of the audio signal. This representation has proven useful for the task of language identification (LID). Thus, bottleneck features are used as input to the language recognition system instead of a classical parameterization of the signal based on cepstral feature vectors such as MFCCs (Mel Frequency Cepstral Coefficients). Despite the success of this approach in language recognition, there is a lack of studies that systematically analyze how the topology of the DNN influences the performance of bottleneck feature-based language recognition systems. In this work, we try to fill this gap by analyzing language recognition results for different topologies of the DNN used to extract the bottleneck features, comparing them with one another and against a reference system based on a more classical cepstral representation of the input signal with a total variability model. In this way, we obtain useful knowledge about how the DNN configuration influences the performance of bottleneck feature-based language recognition systems.
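The bottleneck idea described in this abstract can be illustrated with a minimal sketch: a frame is passed through a small feed-forward network and the activations of a narrow layer are kept as the new feature vector. Everything concrete here is an assumption made for illustration (the layer sizes, the tanh activations, the linear bottleneck, and the random weights that stand in for a network trained on phonetic targets); it is not the topology studied by the authors.

```python
import math
import random

def dense(x, weights, biases, activation=None):
    """Apply one fully connected layer to a single feature vector."""
    out = []
    for row, b in zip(weights, biases):
        z = sum(w * v for w, v in zip(row, x)) + b
        out.append(activation(z) if activation else z)
    return out

def random_layer(n_in, n_out, rng):
    """Random weights stand in for a trained ASR network (assumption)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def bottleneck_features(frame, layers, bn_index):
    """Forward the frame and return the activations of the bottleneck layer."""
    x = frame
    for i, (w, b) in enumerate(layers):
        # The bottleneck layer is kept linear here; other layers use tanh.
        x = dense(x, w, b, None if i == bn_index else math.tanh)
        if i == bn_index:
            return x
    raise ValueError("bn_index is outside the network")

rng = random.Random(0)
# Hypothetical topology: 39-dim input, 256 hidden, 40-dim bottleneck, 256 hidden.
sizes = [39, 256, 40, 256]
layers = [random_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
frame = [rng.gauss(0.0, 1.0) for _ in range(39)]
bn = bottleneck_features(frame, layers, bn_index=1)
print(len(bn))
```

In a real system the resulting vectors would replace MFCCs as input to the language recognition back end; changing `sizes` and `bn_index` is the kind of topology variation the abstract refers to.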
ERIC Educational Resources Information Center
Matsumoto, Kazumi
2013-01-01
This study investigated whether learners of Japanese with different first language (L1) writing systems use different recognition strategies and whether second language (L2) exposure affects L2 kanji recognition. The study used a computerized lexical judgment task with 3 types of kanji characters to investigate these questions: (a)…
Niijima, H; Ito, N; Ogino, S; Takatori, T; Iwase, H; Kobayashi, M
2000-11-01
To make speech recognition technology practical for recording forensic autopsies, a language model specialized for forensic autopsy was developed for the speech recording system. The language model was built as a 3-gram model and was used together with a Hidden Markov Model acoustic model for Japanese speech recognition to customize the speech recognition engine for forensic autopsy. A forensic vocabulary of over 10,000 words was compiled and some 300,000 sentence patterns were created to build the forensic language model, which was then mixed with a general language model to attain high accuracy. When tested by dictating autopsy findings, the system achieved a recognition rate of about 95%, which appears to reach practical usability for speech recognition software, although there remains room for improving its hardware and application-layer software.
Experimental study on GMM-based speaker recognition
NASA Astrophysics Data System (ADS)
Ye, Wenxing; Wu, Dapeng; Nucci, Antonio
2010-04-01
Speaker recognition plays a very important role in the field of biometric security. In order to improve recognition performance, many pattern recognition techniques have been explored in the literature. Among these techniques, the Gaussian Mixture Model (GMM) has proved to be an effective statistical model for speaker recognition and is used in most state-of-the-art speaker recognition systems. The GMM is used to represent the 'voice print' of a speaker by modeling the spectral characteristics of the speaker's speech signals. In this paper, we implement a speaker recognition system that consists of preprocessing, feature extraction based on Mel-Frequency Cepstrum Coefficients (MFCCs), and GMM-based classification. We test our system with the TIDIGITS data set (325 speakers) and our own recordings of more than 200 speakers; our system achieves a 100% correct recognition rate. Moreover, we also test our system in the scenario where training samples are from one language but test samples are from a different language; our system also achieves a 100% correct recognition rate, which indicates that our system is language independent.
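The GMM-based classification step can be sketched as follows: each speaker is represented by a mixture of diagonal-covariance Gaussians, and a test utterance is assigned to the speaker whose model gives the highest average frame log-likelihood. This is only an illustration under stated assumptions: the two-dimensional "features" and the mixture parameters are made up, and the models are written by hand rather than trained with EM on MFCCs as a real system would do.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log-likelihood of one frame under a GMM given as (weight, mean, var) triples."""
    logs = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(logs)
    # Log-sum-exp keeps the sum over components numerically stable.
    return top + math.log(sum(math.exp(l - top) for l in logs))

def identify(frames, speaker_models):
    """Return the speaker whose GMM gives the highest average frame log-likelihood."""
    def score(gmm):
        return sum(gmm_log_likelihood(f, gmm) for f in frames) / len(frames)
    return max(speaker_models, key=lambda name: score(speaker_models[name]))

# Hypothetical hand-written speaker models (not trained on real data).
speaker_models = {
    "alice": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [2.0, 2.0], [1.0, 1.0])],
    "bob": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [7.0, 7.0], [1.0, 1.0])],
}
test_frames = [[0.2, -0.1], [1.8, 2.1], [0.5, 0.4]]
print(identify(test_frames, speaker_models))
```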
Construction of language models for an handwritten mail reading system
NASA Astrophysics Data System (ADS)
Morillot, Olivier; Likforman-Sulem, Laurence; Grosicki, Emmanuèle
2012-01-01
This paper presents a system for the recognition of unconstrained handwritten mail. The main part of this system is an HMM recognizer that uses trigraphs to model contextual information. This recognition system does not require any segmentation into words or characters and works directly at line level. To take linguistic information into account and enhance performance, a language model is introduced. This language model is based on bigrams and built from training document transcriptions only. Different experiments with various vocabulary sizes and language models have been conducted. Word Error Rate and Perplexity values are compared to show the value of specific language models suited to the handwritten mail recognition task.
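A bigram language model and its perplexity, the two quantities this abstract relies on, can be sketched in a few lines. The toy corpus and the add-one smoothing are assumptions chosen for illustration; the abstract does not say which smoothing the authors used.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count histories and bigrams over sentences padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        unigrams.update(tokens[:-1])            # counts of each history word
        bigrams.update(zip(tokens, tokens[1:]))
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return unigrams, bigrams, vocab

def bigram_prob(prev, word, model):
    """Add-one smoothed estimate of P(word | prev)."""
    unigrams, bigrams, vocab = model
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))

def perplexity(sentences, model):
    """Perplexity = exp(-average log-probability per predicted token)."""
    log_sum, count = 0.0, 0
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            log_sum += math.log(bigram_prob(prev, word, model))
            count += 1
    return math.exp(-log_sum / count)

# Toy training transcriptions (hypothetical).
train = [["dear", "sir"], ["dear", "madam"], ["thank", "you"]]
model = train_bigram(train)
in_domain = perplexity([["dear", "sir"]], model)
out_of_domain = perplexity([["sir", "dear"]], model)
print(in_domain < out_of_domain)
```

A lower perplexity on held-out text from the target domain is the usual sign that a language model fits that domain, which is the comparison the abstract describes.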
V2S: Voice to Sign Language Translation System for Malaysian Deaf People
NASA Astrophysics Data System (ADS)
Mean Foong, Oi; Low, Tang Jung; La, Wai Wan
The process of learning and understanding sign language may be cumbersome to some; this paper therefore proposes a solution to this problem by providing a voice (English language) to sign language translation system using speech and image processing techniques. Speech processing, which includes speech recognition, is the study of recognizing the words being spoken, regardless of who the speaker is. This project uses template-based recognition as the main approach, in which the V2S system first needs to be trained with speech patterns based on some generic spectral parameter set. These spectral parameter sets are then stored as templates in a database. The system performs recognition by matching the parameter set of the input speech against the stored templates and finally displays the sign language in video format. Empirical results show that the system has an 80.3% recognition rate.
A Kinect based sign language recognition system using spatio-temporal features
NASA Astrophysics Data System (ADS)
Memiş, Abbas; Albayrak, Songül
2013-12-01
This paper presents a sign language recognition system that uses spatio-temporal features on RGB video images and depth maps for dynamic gestures of Turkish Sign Language. The proposed system uses a motion difference and accumulation approach for temporal gesture analysis. The motion accumulation method, which is an effective method for temporal domain analysis of gestures, produces an accumulated motion image by combining differences of successive video frames. Then, the 2D Discrete Cosine Transform (DCT) is applied to the accumulated motion images, and temporal domain features are transformed into the spatial domain. These processes are performed on RGB images and depth maps separately. DCT coefficients that represent sign gestures are selected via zigzag scanning, and feature vectors are generated. To recognize sign gestures, a K-Nearest Neighbor classifier with Manhattan distance is used. The performance of the proposed sign language recognition system is evaluated on a sign database that contains 1002 isolated dynamic signs belonging to 111 words of Turkish Sign Language (TSL) in three different categories. The proposed sign language recognition system has promising success rates.
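Two steps of this pipeline, zigzag selection of coefficients and K-Nearest Neighbor classification with Manhattan distance, can be sketched as below. The sketch assumes the DCT coefficient matrix has already been computed (the motion accumulation and DCT steps are omitted), and the matrix values, labels, and `k` are hypothetical.

```python
from collections import Counter

def zigzag(matrix, n_coeffs):
    """Read a square matrix in zigzag order and keep the first n_coeffs values."""
    n = len(matrix)
    order = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Anti-diagonals in order; odd ones run top to bottom, even ones bottom to top.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )
    return [matrix[r][c] for r, c in order[:n_coeffs]]

def manhattan(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_predict(query, examples, k=3):
    """Majority vote among the k training vectors closest to the query."""
    nearest = sorted(examples, key=lambda ex: manhattan(query, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Stand-in for a matrix of DCT coefficients (hypothetical values).
coeffs = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(zigzag(coeffs, 9))

# Hypothetical feature vectors with sign labels.
examples = [([0, 0], "a"), ([1, 0], "a"), ([0, 1], "a"), ([10, 10], "b"), ([9, 10], "b")]
print(knn_predict([1, 1], examples))
```

Zigzag order visits low-frequency coefficients first, so truncating the scan keeps the coefficients that usually carry most of the image energy.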
Multi-Lingual Deep Neural Networks for Language Recognition
2016-08-08
training configurations for the NIST 2011 and 2015 language recognition evaluations (LRE11 and LRE15). The best performing multi-lingual BN-DNN...very effective approach in the NIST 2015 language recognition evaluation (LRE15) open training condition [4, 5]. In this work we evaluate the impact...language are summarized in Table 2. Two language recognition tasks are used for evaluating the multi-lingual bottleneck systems. The first is the NIST
Voice Recognition Software Accuracy with Second Language Speakers of English.
ERIC Educational Resources Information Center
Coniam, D.
1999-01-01
Explores the potential of the use of voice-recognition technology with second-language speakers of English. Involves the analysis of the output produced by a small group of very competent second-language subjects reading a text into the voice recognition software Dragon Systems "Dragon NaturallySpeaking." (Author/VWL)
Evaluating Automatic Speech Recognition-Based Language Learning Systems: A Case Study
ERIC Educational Resources Information Center
van Doremalen, Joost; Boves, Lou; Colpaert, Jozef; Cucchiarini, Catia; Strik, Helmer
2016-01-01
The purpose of this research was to evaluate a prototype of an automatic speech recognition (ASR)-based language learning system that provides feedback on different aspects of speaking performance (pronunciation, morphology and syntax) to students of Dutch as a second language. We carried out usability reviews, expert reviews and user tests to…
Development of a Mandarin-English Bilingual Speech Recognition System for Real World Music Retrieval
NASA Astrophysics Data System (ADS)
Zhang, Qingqing; Pan, Jielin; Lin, Yang; Shao, Jian; Yan, Yonghong
In recent decades, there has been a great deal of research into the problem of bilingual speech recognition, that is, developing a recognizer that can handle inter- and intra-sentential language switching between two languages. This paper presents our recent work on the development of a grammar-constrained, Mandarin-English bilingual Speech Recognition System (MESRS) for real world music retrieval. Two of the main difficulties in building bilingual speech recognition systems for real world applications are tackled in this paper. One is to balance the performance and the complexity of the bilingual speech recognition system; the other is to deal effectively with matrix language accents in the embedded language. In order to process intra-sentential language switching and reduce the amount of data required to robustly estimate statistical models, a compact single set of bilingual acoustic models derived by phone set merging and clustering is developed instead of using two separate monolingual models, one for each language. In our study, a novel Two-pass phone clustering method based on Confusion Matrix (TCM) is presented and compared with the log-likelihood measure method. Experiments show that TCM achieves better performance. Since the native language of potential system users is Mandarin, which is regarded as the matrix language in our application, their pronunciation of English as the embedded language usually contains Mandarin accents. In order to deal with matrix language accents in the embedded language, different non-native adaptation approaches are investigated. Experiments show that the model retraining method outperforms other common adaptation methods such as Maximum A Posteriori (MAP).
With the effective incorporation of the phone clustering and non-native adaptation approaches, the Phrase Error Rate (PER) of MESRS for English utterances was reduced by a relative 24.47% compared with the baseline monolingual English system, while the PER on Mandarin utterances was comparable to that of the baseline monolingual Mandarin system. For bilingual utterances, the system achieved a 22.37% relative PER reduction.
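A relative error-rate reduction such as the ones quoted here is computed against the baseline value, not as a difference in percentage points. The short sketch below shows the arithmetic; the absolute error rates in it are hypothetical and chosen only so that the result matches the 24.47% figure, since the abstract does not report the underlying PER values.

```python
def relative_reduction(baseline, improved):
    """Relative reduction of an error rate, in percent of the baseline value."""
    return 100.0 * (baseline - improved) / baseline

# Hypothetical error rates chosen only to illustrate the arithmetic.
baseline_per = 20.0
improved_per = 15.106
print(round(relative_reduction(baseline_per, improved_per), 2))
```

With these made-up numbers the absolute improvement is under five percentage points even though the relative reduction is about a quarter, which is why the distinction matters when comparing systems.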
Exploiting Hidden Layer Responses of Deep Neural Networks for Language Recognition
2016-09-08
trained DNNs. We evaluated this approach in NIST 2015 language recognition evaluation. The performances achieved by the proposed approach are very...activations, used in direct DNN-LID. Results from the LID experiments support our hypothesis. The LID experiments are performed on NIST Language Recognition...of-the-art I-vector system [3, 10, 11] in evaluation (eval) set of NIST LRE 2015. Combination of proposed technique and state-of-the-art I-vector
ERIC Educational Resources Information Center
Hsiao, Janet H.; Lam, Sze Man
2013-01-01
Through computational modeling, here we examine whether visual and task characteristics of writing systems alone can account for lateralization differences in visual word recognition between different languages without assuming influence from left hemisphere (LH) lateralized language processes. We apply a hemispheric processing model of face…
Sign Language Recognition System using Neural Network for Digital Hardware Implementation
NASA Astrophysics Data System (ADS)
Vargas, Lorena P.; Barba, Leiner; Torres, C. O.; Mattos, L.
2011-01-01
This work presents an image pattern recognition system that uses a neural network to identify sign language for deaf people. The system stores several images that show the specific symbols of this kind of language, which are used to train a multilayer neural network with the back-propagation algorithm. The images are first processed to adapt them and to improve the discriminating performance of the network; this processing includes filtering, reduction, and noise elimination algorithms as well as edge detection. The system is evaluated using signs that do not include movement in their representation.
Recognition of sign language with an inertial sensor-based data glove.
Kim, Kyung-Won; Lee, Mi-So; Soon, Bo-Ram; Ryu, Mun-Ho; Kim, Je-Nam
2015-01-01
Communication between people with normal hearing and people with hearing impairment is difficult. Recently, a variety of studies on sign language recognition have benefited from the development of information technology. This study presents a sign language recognition system using a data glove composed of 3-axis accelerometers, magnetometers, and gyroscopes. The data obtained by the data glove are transmitted to a host application (implemented as a Windows program on a PC). Next, the data are converted into angle data, and the angle information is displayed on the host application and verified by outputting three-dimensional models to the display. An experiment was performed with five subjects, three females and two males, and a performance set comprising the numbers from one to nine was repeated five times. The system achieves a 99.26% movement detection rate and an approximately 98% recognition rate for each finger's state. The proposed system is expected to become more portable and useful when this algorithm is applied to smartphone applications for use in situations such as emergencies.
NASA Astrophysics Data System (ADS)
Maskeliunas, Rytis; Rudzionis, Vytautas
2011-06-01
In recent years various commercial speech recognizers have become available. These recognizers make it possible to develop applications incorporating various speech recognition techniques easily and quickly. All of these commercial recognizers are typically targeted at widely spoken languages with large market potential; however, it may be possible to adapt available commercial recognizers for use in environments where less widely spoken languages are used. Since most commercial recognition engines are closed systems, the only avenue for adaptation is to devise ways of selecting proper phonetic transcriptions that map between the two languages. This paper deals with methods for finding phonetic transcriptions of Lithuanian voice commands so that they can be recognized using English speech engines. The experimental evaluation showed that it is possible to find phonetic transcriptions that enable the recognition of Lithuanian voice commands with a recognition accuracy of over 90%.
Meinhardt-Injac, Bozana; Daum, Moritz M.; Meinhardt, Günter; Persike, Malte
2018-01-01
According to the two-systems account of theory of mind (ToM), understanding mental states of others involves both fast social-perceptual processes, as well as slower, reflexive cognitive operations (Frith and Frith, 2008; Apperly and Butterfill, 2009). To test the respective roles of specific abilities in either of these processes we administered 15 experimental procedures to a large sample of 343 participants, testing ability in face recognition and holistic perception, language, and reasoning. ToM was measured by a set of tasks requiring ability to track and to infer complex emotional and mental states of others from faces, eyes, spoken language, and prosody. We used structural equation modeling to test the relative strengths of a social-perceptual (face processing related) and reflexive-cognitive (language and reasoning related) path in predicting ToM ability. The two paths accounted for 58% of ToM variance, thus validating a general two-systems framework. Testing specific predictor paths revealed language and face recognition as strong and significant predictors of ToM. For reasoning, there were neither direct nor mediated effects, albeit reasoning was strongly associated with language. Holistic face perception also failed to show a direct link with ToM ability, while there was a mediated effect via face recognition. These results highlight the respective roles of face recognition and language for the social brain, and contribute closer empirical specification of the general two-systems account. PMID:29445336
Improving language models for radiology speech recognition.
Paulett, John M; Langlotz, Curtis P
2009-02-01
Speech recognition systems have become increasingly popular as a means to produce radiology reports, for reasons both of efficiency and of cost. However, the suboptimal recognition accuracy of these systems can affect the productivity of the radiologists creating the text reports. We analyzed a database of over two million de-identified radiology reports to determine the strongest determinants of word frequency. Our results showed that body site and imaging modality had a similar influence on the frequency of words and of three-word phrases as did the identity of the speaker. These findings suggest that the accuracy of speech recognition systems could be significantly enhanced by further tailoring their language models to body site and imaging modality, which are readily available at the time of report creation.
The Legal Recognition of Sign Languages
ERIC Educational Resources Information Center
De Meulder, Maartje
2015-01-01
This article provides an analytical overview of the different types of explicit legal recognition of sign languages. Five categories are distinguished: constitutional recognition, recognition by means of general language legislation, recognition by means of a sign language law or act, recognition by means of a sign language law or act including…
Recognition of Arabic Sign Language Alphabet Using Polynomial Classifiers
NASA Astrophysics Data System (ADS)
Assaleh, Khaled; Al-Rousan, M.
2005-12-01
Building an accurate automatic sign language recognition system is of great importance in facilitating efficient communication with deaf people. In this paper, we propose the use of polynomial classifiers as a classification engine for the recognition of Arabic sign language (ArSL) alphabet. Polynomial classifiers have several advantages over other classifiers in that they do not require iterative training, and that they are highly computationally scalable with the number of classes. Based on polynomial classifiers, we have built an ArSL system and measured its performance using real ArSL data collected from deaf people. We show that the proposed system provides superior recognition results when compared with previously published results using ANFIS-based classification on the same dataset and feature extraction methodology. The comparison is shown in terms of the number of misclassified test patterns. The reduction in the rate of misclassified patterns was very significant. In particular, we have achieved a 36% reduction of misclassifications on the training data and 57% on the test data.
Halim, Zahid; Abbas, Ghulam
2015-01-01
Sign language provides hearing and speech impaired individuals with an interface to communicate with other members of society. Unfortunately, sign language is not understood by most people. A device based on image processing and pattern recognition can therefore provide a vital aid for detecting sign language and translating it into a vocal language. This work presents a system that detects and understands sign language gestures using a custom-built software tool and then translates each gesture into a vocal language. To recognize a particular gesture, the system employs a Dynamic Time Warping (DTW) algorithm, and an off-the-shelf software tool is employed for vocal language generation. Microsoft® Kinect is the primary tool used to capture the video stream of a user. The proposed method is capable of successfully detecting gestures stored in the dictionary with an accuracy of 91%. The proposed system allows custom-made gestures to be defined and added. Based on an experiment in which 10 individuals with impairments used the system to communicate with 5 people with no disability, 87% agreed that the system was useful.
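Dynamic Time Warping compares two sequences that may differ in speed by finding the cheapest monotonic alignment between them. A minimal sketch follows, assuming one-dimensional gesture trajectories and an absolute-difference local cost; the gesture names and values are hypothetical, and a Kinect-based system would instead compare sequences of multi-dimensional joint positions.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two 1-D sequences with |x - y| local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def recognize(query, dictionary):
    """Return the name of the stored gesture with the smallest DTW cost to the query."""
    return min(dictionary, key=lambda name: dtw_distance(query, dictionary[name]))

# Hypothetical gesture dictionary of 1-D trajectories.
dictionary = {"wave": [0, 1, 0, 1, 0], "raise": [0, 1, 2, 3, 4]}
slow_raise = [0, 0, 1, 2, 2, 3, 4]   # same shape as "raise", performed more slowly
print(recognize(slow_raise, dictionary))
```

Adding a custom gesture in this sketch amounts to inserting one more named sequence into `dictionary`, which mirrors the extensibility the abstract mentions.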
A segmentation-free approach to Arabic and Urdu OCR
NASA Astrophysics Data System (ADS)
Sabbour, Nazly; Shafait, Faisal
2013-01-01
In this paper, we present a generic Optical Character Recognition system for Arabic script languages called Nabocr. Nabocr uses OCR approaches specific for Arabic script recognition. Performing recognition on Arabic script text is relatively more difficult than Latin text due to the nature of Arabic script, which is cursive and context sensitive. Moreover, Arabic script has different writing styles that vary in complexity. Nabocr is initially trained to recognize both Urdu Nastaleeq and Arabic Naskh fonts. However, it can be trained by users to be used for other Arabic script languages. We have evaluated our system's performance for both Urdu and Arabic. In order to evaluate Urdu recognition, we have generated a dataset of Urdu text called UPTI (Urdu Printed Text Image Database), which measures different aspects of a recognition system. The performance of our system for Urdu clean text is 91%. For Arabic clean text, the performance is 86%. Moreover, we have compared the performance of our system against Tesseract's newly released Arabic recognition, and the performance of both systems on clean images is almost the same.
Automatic Mexican sign language and digits recognition using normalized central moments
NASA Astrophysics Data System (ADS)
Solís, Francisco; Martínez, David; Espinosa, Oscar; Toxqui, Carina
2016-09-01
This work presents a framework for automatic recognition of Mexican sign language and digits, based on a computer vision system that uses normalized central moments and artificial neural networks. Images are captured with a digital IP camera, four LED reflectors, and a green background in order to reduce computational costs and avoid the use of special gloves. For each frame, 42 normalized central moments are computed and used in a Multi-Layer Perceptron to recognize each database. Four versions of each sign and digit were used in the training phase. Recognition rates of 93% and 95% were achieved for Mexican sign language and digits, respectively.
NASA Astrophysics Data System (ADS)
Moses, David A.; Mesgarani, Nima; Leonard, Matthew K.; Chang, Edward F.
2016-10-01
Objective. The superior temporal gyrus (STG) and neighboring brain regions play a key role in human language processing. Previous studies have attempted to reconstruct speech information from brain activity in the STG, but few of them incorporate the probabilistic framework and engineering methodology used in modern speech recognition systems. In this work, we describe the initial efforts toward the design of a neural speech recognition (NSR) system that performs continuous phoneme recognition on English stimuli with arbitrary vocabulary sizes using the high gamma band power of local field potentials in the STG and neighboring cortical areas obtained via electrocorticography. Approach. The system implements a Viterbi decoder that incorporates phoneme likelihood estimates from a linear discriminant analysis model and transition probabilities from an n-gram phonemic language model. Grid searches were used in an attempt to determine optimal parameterizations of the feature vectors and Viterbi decoder. Main results. The performance of the system was significantly improved by using spatiotemporal representations of the neural activity (as opposed to purely spatial representations) and by including language modeling and Viterbi decoding in the NSR system. Significance. These results emphasize the importance of modeling the temporal dynamics of neural responses when analyzing their variations with respect to varying stimuli and demonstrate that speech recognition techniques can be successfully leveraged when decoding speech from neural signals. Guided by the results detailed in this work, further development of the NSR system could have applications in the fields of automatic speech recognition and neural prosthetics.
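The decoding step described in the Approach can be sketched as a log-domain Viterbi search that combines per-frame phoneme log-likelihoods with transition probabilities from a phoneme language model. The sketch assumes a bigram model (the abstract says only "n-gram"), two made-up phoneme states, and hand-written likelihoods in place of the output of a linear discriminant analysis model.

```python
import math

def viterbi(frame_log_likes, log_trans, log_init):
    """Most likely state sequence.

    frame_log_likes: list of dicts, state -> log-likelihood of that frame
    log_trans: dict, (prev, cur) -> log P(cur | prev)
    log_init: dict, state -> log P(state) for the first frame
    """
    states = list(log_init)
    best = {s: log_init[s] + frame_log_likes[0][s] for s in states}
    back = []
    for obs in frame_log_likes[1:]:
        new_best, pointers = {}, {}
        for cur in states:
            prev = max(states, key=lambda p: best[p] + log_trans[(p, cur)])
            new_best[cur] = best[prev] + log_trans[(prev, cur)] + obs[cur]
            pointers[cur] = prev
        best = new_best
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: best[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

def logs(d):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in d.items()}

# Hypothetical two-phoneme example.
init = logs({"a": 0.5, "b": 0.5})
sticky = logs({("a", "a"): 0.9, ("a", "b"): 0.1, ("b", "a"): 0.1, ("b", "b"): 0.9})
uniform = logs({("a", "a"): 0.5, ("a", "b"): 0.5, ("b", "a"): 0.5, ("b", "b"): 0.5})
frames = [logs({"a": 0.9, "b": 0.1}), logs({"a": 0.4, "b": 0.6}), logs({"a": 0.9, "b": 0.1})]

print(viterbi(frames, sticky, init))
print(viterbi(frames, uniform, init))
```

With uniform transitions the decoder simply follows the best state per frame; with the "sticky" transition model the weakly supported switch in the middle frame is smoothed away, which illustrates how a language model can change the decoded sequence.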
Representations of the language recognition problem for a theorem prover
NASA Technical Reports Server (NTRS)
Minker, J.; Vanderbrug, G. J.
1972-01-01
Two representations of the language recognition problem for a theorem prover in first order logic are presented and contrasted. One of the representations is based on the familiar method of generating sentential forms of the language, and the other is based on the Cocke parsing algorithm. An augmented theorem prover is described which permits recognition of recursive languages. The state-transformation method developed by Cordell Green to construct problem solutions in resolution-based systems can be used to obtain the parse tree. In particular, the end-order traversal of the parse tree is derived in one of the representations. An inference system, termed the cycle inference system, is defined which makes it possible for the theorem prover to model the method on which the representation is based. The general applicability of the cycle inference system to state space problems is discussed. Given an unsatisfiable set S, where each clause has at most one positive literal, it is shown that there exists an input proof. The clauses for the two representations satisfy these conditions, as do many state space problems.
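For readers unfamiliar with the Cocke parsing algorithm mentioned here, the sketch below shows a table-driven recognizer in the Cocke-Younger-Kasami style for a grammar in Chomsky normal form. It is a conventional dynamic-programming formulation written for illustration, not the first-order-logic representation developed in the report, and the example grammar (strings of n a's followed by n b's) is a hypothetical choice.

```python
def cyk_recognize(words, unary_rules, binary_rules, start="S"):
    """Decide whether `words` is derivable from `start`.

    unary_rules: dict, terminal -> set of nonterminals A with A -> terminal
    binary_rules: dict, (B, C) -> set of nonterminals A with A -> B C
    """
    n = len(words)
    if n == 0:
        return False
    # table[i][l] holds the nonterminals that derive words[i:i + l].
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][1] = set(unary_rules.get(w, ()))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                for b in table[i][split]:
                    for c in table[i + split][length - split]:
                        table[i][length] |= binary_rules.get((b, c), set())
    return start in table[0][n]

# Hypothetical grammar for a^n b^n (n >= 1):
#   S -> A B | A T,  T -> S B,  A -> a,  B -> b
unary_rules = {"a": {"A"}, "b": {"B"}}
binary_rules = {("A", "B"): {"S"}, ("A", "T"): {"S"}, ("S", "B"): {"T"}}

print(cyk_recognize(list("aabb"), unary_rules, binary_rules))
print(cyk_recognize(list("aab"), unary_rules, binary_rules))
```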
Static hand gesture recognition from a video
NASA Astrophysics Data System (ADS)
Rokade, Rajeshree S.; Doye, Dharmpal
2011-10-01
A sign language (also signed language) is a language which, instead of acoustically conveyed sound patterns, uses visually transmitted sign patterns to convey meaning, "simultaneously combining hand shapes, orientation and movement of the hands". Sign languages commonly develop in deaf communities, which can include interpreters, friends and families of deaf people as well as people who are deaf or hard of hearing themselves. In this paper, we propose a novel system for recognizing static hand gestures from a video, based on a Kohonen neural network. We propose an algorithm to separate out the key frames, which contain the correct gestures, from a video sequence. We segment hand images from complex and non-uniform backgrounds. Features are extracted by applying the Kohonen network to the key frames, and recognition is then performed.
Intonation and dialog context as constraints for speech recognition.
Taylor, P; King, S; Isard, S; Wright, H
1998-01-01
This paper describes a way of using intonation and dialog context to improve the performance of an automatic speech recognition (ASR) system. Our experiments were run on the DCIEM Maptask corpus, a corpus of spontaneous task-oriented dialog speech. This corpus has been tagged according to a dialog analysis scheme that assigns each utterance to one of 12 "move types," such as "acknowledge," "query-yes/no" or "instruct." Most ASR systems use a bigram language model to constrain the possible sequences of words that might be recognized. Here we use a separate bigram language model for each move type. We show that when the "correct" move-specific language model is used for each utterance in the test set, the word error rate of the recognizer drops. Of course when the recognizer is run on previously unseen data, it cannot know in advance what move type the speaker has just produced. To determine the move type we use an intonation model combined with a dialog model that puts constraints on possible sequences of move types, as well as the speech recognizer likelihoods for the different move-specific models. In the full recognition system, the combination of automatic move type recognition with the move specific language models reduces the overall word error rate by a small but significant amount when compared with a baseline system that does not take intonation or dialog acts into account. Interestingly, the word error improvement is restricted to "initiating" move types, where word recognition is important. In "response" move types, where the important information is conveyed by the move type itself--for example, positive versus negative response--there is no word error improvement, but recognition of the response types themselves is good. The paper discusses the intonation model, the language models, and the dialog model in detail and describes the architecture in which they are combined.
A Human Mirror Neuron System for Language: Perspectives from Signed Languages of the Deaf
ERIC Educational Resources Information Center
Knapp, Heather Patterson; Corina, David P.
2010-01-01
Language is proposed to have developed atop the human analog of the macaque mirror neuron system for action perception and production [Arbib M.A. 2005. From monkey-like action recognition to human language: An evolutionary framework for neurolinguistics (with commentaries and author's response). "Behavioral and Brain Sciences, 28", 105-167; Arbib…
User Experience of a Mobile Speaking Application with Automatic Speech Recognition for EFL Learning
ERIC Educational Resources Information Center
Ahn, Tae youn; Lee, Sangmin-Michelle
2016-01-01
With the spread of mobile devices, mobile phones have enormous potential regarding their pedagogical use in language education. The goal of this study is to analyse user experience of a mobile-based learning system that is enhanced by speech recognition technology for the improvement of EFL (English as a foreign language) learners' speaking…
Detailed Phonetic Labeling of Multi-language Database for Spoken Language Processing Applications
2015-03-01
which contains about 60 interfering speakers as well as background music in a bar. The top panel is again clean training/noisy testing settings, and...recognition system for Mandarin was developed and tested. Character recognition rates as high as 88% were obtained, using an approximately 40 training ...Tool_ComputeFeat.m) ... 50 6.3. Training
ERIC Educational Resources Information Center
Franco, Horacio; Bratt, Harry; Rossier, Romain; Rao Gadde, Venkata; Shriberg, Elizabeth; Abrash, Victor; Precoda, Kristin
2010-01-01
SRI International's EduSpeak® system is a software development toolkit that enables developers of interactive language education software to use state-of-the-art speech recognition and pronunciation scoring technology. Automatic pronunciation scoring allows the computer to provide feedback on the overall quality of pronunciation and to point to…
Bilingual Language Switching: Production vs. Recognition
Mosca, Michela; de Bot, Kees
2017-01-01
This study aims at assessing how bilinguals select words in the appropriate language in production and recognition while minimizing interference from the non-appropriate language. Two prominent models are considered which assume that when one language is in use, the other is suppressed. The Inhibitory Control (IC) model suggests that, in both production and recognition, the amount of inhibition on the non-target language is greater for the stronger compared to the weaker language. In contrast, the Bilingual Interactive Activation (BIA) model proposes that, in language recognition, the amount of inhibition on the weaker language is stronger than otherwise. To investigate whether bilingual language production and recognition can be accounted for by a single model of bilingual processing, we tested a group of native speakers of Dutch (L1) who were advanced speakers of English (L2) in a bilingual recognition and production task. Specifically, language switching costs were measured while participants performed a lexical decision (recognition) and a picture naming (production) task involving language switching. Results suggest that while in language recognition the amount of inhibition applied to the non-appropriate language increases along with its dominance, as predicted by the IC model, in production the amount of inhibition applied to the non-relevant language is not related to language dominance, but rather may be modulated by speakers' unconscious strategies to foster the weaker language. This difference indicates that bilingual language recognition and production might rely on different processing mechanisms and cannot be accounted for within one of the existing models of bilingual language processing. PMID:28638361
ERIC Educational Resources Information Center
Suendermann-Oeft, David; Ramanarayanan, Vikram; Yu, Zhou; Qian, Yao; Evanini, Keelan; Lange, Patrick; Wang, Xinhao; Zechner, Klaus
2017-01-01
We present work in progress on a multimodal dialog system for English language assessment using a modular cloud-based architecture adhering to open industry standards. Among the modules being developed for the system, multiple modules heavily exploit machine learning techniques, including speech recognition, spoken language proficiency rating,…
The A2iA French handwriting recognition system at the Rimes-ICDAR2011 competition
NASA Astrophysics Data System (ADS)
Menasri, Farès; Louradour, Jérôme; Bianne-Bernard, Anne-Laure; Kermorvant, Christopher
2012-01-01
This paper describes the system for the recognition of French handwriting submitted by A2iA to the competition organized at ICDAR2011 using the Rimes database. This system is composed of several recognizers based on three different recognition technologies, combined using a novel combination method. A framework for multi-word recognition based on weighted finite state transducers is presented, using an explicit word segmentation, a combination of isolated word recognizers and a language model. The system was tested both for isolated word recognition and for multi-word line recognition and submitted to the RIMES-ICDAR2011 competition. This system outperformed all previously proposed systems on these tasks.
Continuous Chinese sign language recognition with CNN-LSTM
NASA Astrophysics Data System (ADS)
Yang, Su; Zhu, Qing
2017-07-01
The goal of sign language recognition (SLR) is to translate sign language into text and to provide a convenient tool for communication between deaf and hearing people. In this paper, we formulate a model based on a convolutional neural network (CNN) combined with a Long Short-Term Memory (LSTM) network to accomplish continuous recognition. With the strong representational ability of the CNN, the information in pictures captured from Chinese sign language (CSL) videos can be learned and transformed into a vector. Since a video can be regarded as an ordered sequence of frames, the LSTM model is connected to the fully-connected layer of the CNN. As a recurrent neural network (RNN), it is suitable for sequence learning tasks, with the capability of recognizing patterns defined by temporal distance. Compared with a traditional RNN, the LSTM performs better at storing and accessing information. We evaluate this method on our self-built dataset of 40 daily vocabulary items. The experimental results show that the CNN-LSTM recognition method can achieve a high recognition rate with small training sets, which will meet the needs of a real-time SLR system.
Manasse, N J; Hux, K; Rankin-Erickson, J L
2000-11-01
Impairments in motor functioning, language processing, and cognitive status may impact the written language performance of traumatic brain injury (TBI) survivors. One strategy to minimize the impact of these impairments is to use a speech recognition system. The purpose of this study was to explore the effect of mild dysarthria and mild cognitive-communication deficits secondary to TBI on a 19-year-old survivor's mastery and use of such a system, specifically Dragon NaturallySpeaking. Data included the percentage of the participant's words accurately perceived by the system over time, the participant's accuracy over time in using commands for navigation and error correction, and quantitative and qualitative changes in the participant's written texts generated with and without the use of the speech recognition system. Results showed that Dragon NaturallySpeaking was approximately 80% accurate in perceiving words spoken by the participant, and the participant quickly and easily mastered all navigation and error correction commands presented. Quantitatively, the participant produced a greater amount of text using traditional word processing and a standard keyboard than using the speech recognition system. Minimal qualitative differences appeared between writing samples. Discussion of factors that may have contributed to the obtained results and that may affect the generalization of the findings to other TBI survivors is provided.
R, Elakkiya; K, Selvamani
2017-09-22
Subunit segmentation and modelling in medical sign language is an important topic in linguistically oriented, vision-based Sign Language Recognition (SLR). Many earlier efforts focused on functional subunits from the viewpoint of linguistic syllables, but implementing such syllable-based subunit extraction is not feasible with real-world computer vision techniques. In addition, present recognition systems are designed so that they detect signer-dependent actions only under restricted laboratory conditions. This research paper aims to solve these two issues: (1) subunit extraction and (2) signer-independent action in visual sign language recognition. Subunit extraction involves the sequential and parallel breakdown of sign gestures without any prior knowledge of syllables or the number of subunits. A novel Bayesian Parallel Hidden Markov Model (BPaHMM) is introduced for subunit extraction to combine the features of manual and non-manual parameters to yield better results in classification and recognition of signs. Signer-independent action aims at using a single web camera for different signer behaviour patterns and for cross-signer validation. Experimental results show that the proposed signer-independent subunit-level modelling for sign language classification and recognition shows improvement and variations when compared with other existing works.
Building intelligent communication systems for handicapped aphasiacs.
Fu, Yu-Fen; Ho, Cheng-Seen
2010-01-01
This paper presents an intelligent system allowing handicapped aphasiacs to perform basic communication tasks. It has the following three key features: (1) A 6-sensor data glove measures the finger gestures of a patient in terms of the bending degrees of his fingers. (2) A finger language recognition subsystem recognizes language components from the finger gestures. It employs multiple regression analysis to automatically extract proper finger features so that the recognition model can be quickly and correctly constructed by a radial basis function neural network. (3) A coordinate-indexed virtual keyboard allows the users to directly access the letters on the keyboard at a practical speed. The system serves as a viable tool for natural and affordable communication for handicapped aphasiacs through continuous finger language input.
NASA Astrophysics Data System (ADS)
Wang, Hongcui; Kawahara, Tatsuya
CALL (Computer Assisted Language Learning) systems using ASR (Automatic Speech Recognition) for second language learning have received increasing interest recently. However, it remains a challenge to achieve high speech recognition performance, including accurate detection of erroneous utterances by non-native speakers. Conventionally, possible error patterns, based on linguistic knowledge, are added to the lexicon and language model, or to the ASR grammar network. However, this approach easily runs into a trade-off between coverage of errors and increased perplexity. To solve the problem, we propose a method based on a decision tree to learn effective prediction of errors made by non-native speakers. An experimental evaluation with a number of foreign students learning Japanese shows that the proposed method can effectively generate an ASR grammar network, given a target sentence, to achieve both better coverage of errors and smaller perplexity, resulting in significant improvement in ASR accuracy.
The Development of Word Recognition in a Second Language.
ERIC Educational Resources Information Center
Muljani, D.; Koda, Keiko; Moates, Danny R.
1998-01-01
A study investigated differences in English word recognition in native speakers of Indonesian (an alphabetic language) and Chinese (a logographic language) learning English as a Second Language. Results largely confirmed the hypothesis that an alphabetic first language would predict better word recognition in speakers of an alphabetic language,…
Parton, Becky Sue
2006-01-01
In recent years, research has progressed steadily in regard to the use of computers to recognize and render sign language. This paper reviews significant projects in the field beginning with finger-spelling hands such as "Ralph" (robotics), CyberGloves (virtual reality sensors to capture isolated and continuous signs), camera-based projects such as the CopyCat interactive American Sign Language game (computer vision), and sign recognition software (Hidden Markov Modeling and neural network systems). Avatars such as "Tessa" (Text and Sign Support Assistant; three-dimensional imaging) and spoken language to sign language translation systems such as Poland's project entitled "THETOS" (Text into Sign Language Automatic Translator, which operates in Polish; natural language processing) are addressed. The application of this research to education is also explored. The "ICICLE" (Interactive Computer Identification and Correction of Language Errors) project, for example, uses intelligent computer-aided instruction to build a tutorial system for deaf or hard-of-hearing children that analyzes their English writing and makes tailored lessons and recommendations. Finally, the article considers synthesized sign, which is being added to educational material and has the potential to be developed by students themselves.
Auditory Modeling for Noisy Speech Recognition.
2000-01-01
multiple platforms including PCs, workstations, and DSPs. A prototype version of the SOS process was tested on the Japanese Hiragana language with good...judgment among linguists. American English has 48 phonetic sounds in the ARPABET representation. Hiragana, the Japanese phonetic language, has only 20...Japanese Hiragana," H.L. Pfister, FL 95, 1995. "State Recognition for Noisy Dynamic Systems," H.L. Pfister, Tech 2005, Chicago, 1995. "Experiences
Language deficits in poor comprehenders: a case for the simple view of reading.
Catts, Hugh W; Adlof, Suzanne M; Ellis Weismer, Susan
2006-04-01
To examine concurrently and retrospectively the language abilities of children with specific reading comprehension deficits ("poor comprehenders") and compare them to typical readers and children with specific decoding deficits ("poor decoders"). In Study 1, the authors identified 57 poor comprehenders, 27 poor decoders, and 98 typical readers on the basis of 8th-grade reading achievement. These subgroups' performances on 8th-grade measures of language comprehension and phonological processing were investigated. In Study 2, the authors examined retrospectively subgroups' performances on measures of language comprehension and phonological processing in kindergarten, 2nd, and 4th grades. Word recognition and reading comprehension in 2nd and 4th grades were also considered. Study 1 showed that poor comprehenders had concurrent deficits in language comprehension but normal abilities in phonological processing. Poor decoders were characterized by the opposite pattern of language abilities. Study 2 results showed that subgroups had language (and word recognition) profiles in the earlier grades that were consistent with those observed in 8th grade. Subgroup differences in reading comprehension were inconsistent across grades but reflective of the changes in the components of reading comprehension over time. The results support the simple view of reading and the phonological deficit hypothesis. Furthermore, the findings indicate that a classification system that is based on the simple view has advantages over standard systems that focus only on word recognition and/or reading comprehension.
The development of cross-cultural recognition of vocal emotion during childhood and adolescence.
Chronaki, Georgia; Wigelsworth, Michael; Pell, Marc D; Kotz, Sonja A
2018-06-14
Humans have an innate set of emotions recognised universally. However, emotion recognition also depends on socio-cultural rules. Although adults recognise vocal emotions universally, they identify emotions more accurately in their native language. We examined developmental trajectories of universal vocal emotion recognition in children. Eighty native English speakers completed a vocal emotion recognition task in their native language (English) and foreign languages (Spanish, Chinese, and Arabic) expressing anger, happiness, sadness, fear, and neutrality. Emotion recognition was compared across 8-to-10, 11-to-13-year-olds, and adults. Measures of behavioural and emotional problems were also taken. Results showed that although emotion recognition was above chance for all languages, native English speaking children were more accurate in recognising vocal emotions in their native language. There was a larger improvement in recognising vocal emotion from the native language during adolescence. Vocal anger recognition did not improve with age for the non-native languages. This is the first study to demonstrate universality of vocal emotion recognition in children whilst supporting an "in-group advantage" for more accurate recognition in the native language. Findings highlight the role of experience in emotion recognition, have implications for child development in modern multicultural societies and address important theoretical questions about the nature of emotions.
Medical Named Entity Recognition for Indonesian Language Using Word Representations
NASA Astrophysics Data System (ADS)
Rahman, Arief
2018-03-01
Nowadays, Named Entity Recognition (NER) systems are used on medical texts to obtain important medical information, such as diseases, symptoms, and drugs. While most NER systems are applied to formal medical texts, informal ones like those from social media (also called semi-formal texts) are starting to get recognition as a gold mine for medical information. We propose a theoretical NER model for semi-formal medical texts in our medical knowledge management system by comparing two kinds of word representations: cluster-based word representation and distributed representation.
Spoken Grammar Practice and Feedback in an ASR-Based CALL System
ERIC Educational Resources Information Center
de Vries, Bart Penning; Cucchiarini, Catia; Bodnar, Stephen; Strik, Helmer; van Hout, Roeland
2015-01-01
Speaking practice is important for learners of a second language. Computer assisted language learning (CALL) systems can provide attractive opportunities for speaking practice when combined with automatic speech recognition (ASR) technology. In this paper, we present a CALL system that offers spoken practice of word order, an important aspect of…
ERIC Educational Resources Information Center
Barbour, Ross; Ostler, Catherine; Templeman, Elizabeth; West, Elizabeth
2007-01-01
The British Columbia (BC) English as a Second Language (ESL) Articulation Committee's Canadian Language Benchmarks project was precipitated by ESL instructors' desire to address transfer difficulties of ESL students within the BC transfer system and to respond to the recognition that the Canadian Language Benchmarks, a descriptive scale of ESL…
Multilingual Data Selection for Low Resource Speech Recognition
2016-09-12
Figure 1: Identification of language clusters using scores from an LID system training languages used in the Base and OP1 evaluation periods of the Babel...the posterior scores over frames. For a set of languages that are used to train the language identification (LID) network, pairs of languages that...which are combined during test time to produce 10 dimensional language 3854 Figure 3: Identification of language clusters using scores from individually
Leveraging Automatic Speech Recognition Errors to Detect Challenging Speech Segments in TED Talks
ERIC Educational Resources Information Center
Mirzaei, Maryam Sadat; Meshgi, Kourosh; Kawahara, Tatsuya
2016-01-01
This study investigates the use of Automatic Speech Recognition (ASR) systems to epitomize second language (L2) listeners' problems in perception of TED talks. ASR-generated transcripts of videos often involve recognition errors, which may indicate difficult segments for L2 listeners. This paper aims to discover the root-causes of the ASR errors…
ERIC Educational Resources Information Center
Kidd, Joanna C.; Shum, Kathy K.; Wong, Anita M.-Y.; Ho, Connie S.-H.
2017-01-01
Auditory processing and spoken word recognition difficulties have been observed in Specific Language Impairment (SLI), raising the possibility that auditory perceptual deficits disrupt word recognition and, in turn, phonological processing and oral language. In this study, fifty-seven kindergarten children with SLI and fifty-three language-typical…
Research on Spoken Dialogue Systems
NASA Technical Reports Server (NTRS)
Aist, Gregory; Hieronymus, James; Dowding, John; Hockey, Beth Ann; Rayner, Manny; Chatzichrisafis, Nikos; Farrell, Kim; Renders, Jean-Michel
2010-01-01
Research in the field of spoken dialogue systems has been performed with the goal of making such systems more robust and easier to use in demanding situations. The term "spoken dialogue systems" signifies unified software systems containing speech-recognition, speech-synthesis, dialogue management, and ancillary components that enable human users to communicate, using natural spoken language or nearly natural prescribed spoken language, with other software systems that provide information and/or services.
Foreign Language Analysis and Recognition (FLARe)
2016-10-08
10 7 Chinese CER...Rates (CERs) were obtained with each feature set: (1) 19.2%, (2) 17.3%, and (3) 15.3%. Based on these results, a GMM-HMM speech recognition system...These systems were evaluated on the HUB4 and HKUST test partitions. Table 7 shows the CER obtained on each test set. Whereas including the HKUST data
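For readers unfamiliar with the metric quoted in this snippet, a character error rate (CER) is commonly computed as the character-level edit distance between the reference and the recognizer output, divided by the reference length. The sketch below shows that standard definition; the function names and the example strings are illustrative and are not taken from the report.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))        # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        cur = [i]                           # deleting i reference characters
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # substitution or match
        prev = cur
    return prev[-1]

def character_error_rate(ref, hyp):
    """CER = character edit distance / number of reference characters."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)

# Illustrative strings: one substituted character out of six.
cer = character_error_rate("今天天气很好", "今天天汽很好")
```

Because insertions count as errors, the CER can exceed 100% when the hypothesis is much longer than the reference.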
Su, Ruiliang; Chen, Xiang; Cao, Shuai; Zhang, Xu
2016-01-14
Sign language recognition (SLR) has been widely used for communication amongst the hearing-impaired and non-verbal community. This paper proposes an accurate and robust SLR framework using an improved decision tree as the base classifier of random forests. This framework was used to recognize Chinese sign language subwords using recordings from a pair of portable devices worn on both arms consisting of accelerometers (ACC) and surface electromyography (sEMG) sensors. The experimental results demonstrated the validity of the proposed random forest-based method for recognition of Chinese sign language (CSL) subwords. With the proposed method, 98.25% average accuracy was obtained for the classification of a list of 121 frequently used CSL subwords. Moreover, the random forests method demonstrated a superior performance in resisting the impact of bad training samples. When the proportion of bad samples in the training set reached 50%, the recognition error rate of the random forest-based method was only 10.67%, while that of a single decision tree adopted in our previous work was almost 27.5%. Our study offers a practical way of realizing a robust and wearable EMG-ACC-based SLR system.
ERIC Educational Resources Information Center
Yao Sua, Tan; Hooi See, Teoh
2014-01-01
The Chinese language movement was launched by the Chinese educationists to demand the recognition of Chinese as an official language to legitimise the status of Chinese education in the national education system in Malaysia. It began in 1952 as a response to the British attempt to establish national primary schools teaching in English and Malay to…
Knowledge of a Second Language Influences Auditory Word Recognition in the Native Language
ERIC Educational Resources Information Center
Lagrou, Evelyne; Hartsuiker, Robert J.; Duyck, Wouter
2011-01-01
Many studies in bilingual visual word recognition have demonstrated that lexical access is not language selective. However, research on bilingual word recognition in the auditory modality has been scarce, and it has yielded mixed results with regard to the degree of this language nonselectivity. In the present study, we investigated whether…
Foreign Language Analysis and Recognition (FLARe) Initial Progress
2012-11-29
University Language Modeling ToolKit CoMMA Count Mediated Morphological Analysis CRUD Create, Read, Update & Delete CPAN Comprehensive Perl Archive...DATES COVERED (From - To) 1 October 2010 – 30 September 2012 4. TITLE AND SUBTITLE Foreign Language Analysis and Recognition (FLARe) Initial Progress...AFRL-RH-WP-TR-2012-0165 FOREIGN LANGUAGE ANALYSIS AND RECOGNITION (FLARE) INITIAL PROGRESS Brian M. Ore
EFL Learners' Production of Questions over Time: Linguistic, Usage-Based, and Developmental Features
ERIC Educational Resources Information Center
Nekrasova-Beker, Tatiana M.
2011-01-01
The recognition of second language (L2) development as a dynamic process has led to different claims about how language development unfolds, what represents a learner's linguistic system (i.e., interlanguage) at a certain point in time, and how that system changes over time (Verspoor, de Bot, & Lowie, 2011). Responding to de Bot and…
ERIC Educational Resources Information Center
Ryba, Ken; McIvor, Tom; Shakir, Maha; Paez, Di
2006-01-01
This study examined continuous automated speech recognition in the university lecture theatre. The participants were both native speakers of English (L1) and English as a second language students (L2) enrolled in an information systems course (Total N=160). After an initial training period, an L2 lecturer in information systems delivered three…
A novel probabilistic framework for event-based speech recognition
NASA Astrophysics Data System (ADS)
Juneja, Amit; Espy-Wilson, Carol
2003-10-01
One of the reasons for unsatisfactory performance of the state-of-the-art automatic speech recognition (ASR) systems is the inferior acoustic modeling of low-level acoustic-phonetic information in the speech signal. An acoustic-phonetic approach to ASR, on the other hand, explicitly targets linguistic information in the speech signal, but such a system for continuous speech recognition (CSR) is not known to exist. A probabilistic and statistical framework for CSR based on the idea of the representation of speech sounds by bundles of binary valued articulatory phonetic features is proposed. Multiple probabilistic sequences of linguistically motivated landmarks are obtained using binary classifiers of manner phonetic features (syllabic, sonorant and continuant) and the knowledge-based acoustic parameters (APs) that are acoustic correlates of those features. The landmarks are then used for the extraction of knowledge-based APs for source and place phonetic features and their binary classification. Probabilistic landmark sequences are constrained using manner class language models for isolated or connected word recognition. The proposed method could overcome the disadvantages encountered by the early acoustic-phonetic knowledge-based systems that led the ASR community to switch to systems highly dependent on statistical pattern analysis methods and probabilistic language or grammar models.
NASA Technical Reports Server (NTRS)
1973-01-01
The users manual for the word recognition computer program contains flow charts of the logical diagram, the memory map for templates, the speech analyzer card arrangement, minicomputer input/output routines, and assembly language program listings.
Recognition of Indian Sign Language in Live Video
NASA Astrophysics Data System (ADS)
Singha, Joyeeta; Das, Karen
2013-05-01
Sign Language Recognition has emerged as one of the important areas of research in Computer Vision. The difficulty faced by researchers is that the instances of signs vary in both motion and appearance. Thus, in this paper a novel approach for recognizing various alphabets of Indian Sign Language is proposed, in which continuous video sequences of the signs are considered. The proposed system comprises three stages: preprocessing, feature extraction and classification. The preprocessing stage includes skin filtering and histogram matching. Eigenvalues and eigenvectors are used in the feature extraction stage, and finally an eigenvalue-weighted Euclidean distance is used to recognize the sign. The system deals with bare hands, thus allowing the user to interact with it in a natural way. We considered 24 different alphabets in the video sequences and attained a success rate of 96.25%.
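The abstract does not spell out how the eigenvalue weighting enters the distance, so the following is only one plausible reading, offered as a sketch: each feature is a coordinate along an eigenvector, and squared differences are weighted by the normalized eigenvalue of that direction before nearest-template classification. The templates, eigenvalues and function names are hypothetical.

```python
import math

def weighted_distance(a, b, eigenvalues):
    """Euclidean distance with each squared difference scaled by a normalized eigenvalue."""
    total = sum(eigenvalues)
    return math.sqrt(sum((lam / total) * (x - y) ** 2
                         for lam, x, y in zip(eigenvalues, a, b)))

def classify(sample, templates, eigenvalues):
    """Return the label of the template closest to the sample."""
    return min(templates,
               key=lambda label: weighted_distance(sample, templates[label], eigenvalues))

# Hypothetical 2-D projections: the first direction carries 90% of the variance.
eigenvalues = [9.0, 1.0]
templates = {"A": [0.0, 0.6], "B": [1.0, 3.0]}
sample = [0.9, 0.5]

label = classify(sample, templates, eigenvalues)               # eigenvalue-weighted
unweighted_label = classify(sample, templates, [1.0, 1.0])     # plain Euclidean
```

In this toy case the weighted distance prefers template "B", which agrees with the sample along the dominant direction, whereas the plain Euclidean distance prefers "A". That difference is the point of weighting by eigenvalue.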
Ding, Huijun; He, Qing; Zhou, Yongjin; Dan, Guo; Cui, Song
2017-01-01
Motion-intent-based finger gesture recognition systems are crucial for many applications such as prosthesis control, sign language recognition, wearable rehabilitation systems, and human–computer interaction. In this article, a motion-intent-based finger gesture recognition system is designed to correctly identify the tapping of every finger for the first time. Two auto-event annotation algorithms are first applied and evaluated for detecting the finger tapping frame. Based on the truncated signals, the wavelet packet transform (WPT) coefficients are calculated and compressed as the features, followed by a feature selection method that improves performance by optimizing the feature set. Finally, three popular classifiers, naive Bayes (NBC), K-nearest neighbor (KNN), and support vector machine (SVM), are applied and evaluated. Recognition accuracy of up to 94% is achieved. The design and the architecture of the system are presented with full system characterization results. PMID:29167655
Optical character recognition of camera-captured images based on phase features
NASA Astrophysics Data System (ADS)
Diaz-Escobar, Julia; Kober, Vitaly
2015-09-01
Nowadays most digital information is obtained using mobile devices, especially smartphones. In particular, this creates an opportunity for optical character recognition in camera-captured images. For this reason many recognition applications have recently been developed, such as recognition of license plates, business cards, receipts and street signs, as well as document classification, augmented reality, language translation and so on. Camera-captured images are usually affected by geometric distortions, nonuniform illumination, shadow and noise, which make the recognition task difficult for existing systems. It is well known that the Fourier phase contains a lot of important information regardless of the Fourier magnitude. So, in this work we propose a phase-based recognition system exploiting phase-congruency features for illumination/scale invariance. The performance of the proposed system is tested in terms of misclassifications and false alarms with the help of computer simulation.
Data-driven approach to human motion modeling with Lua and gesture description language
NASA Astrophysics Data System (ADS)
Hachaj, Tomasz; Koptyra, Katarzyna; Ogiela, Marek R.
2017-03-01
The aim of this paper is to present a novel approach to human motion modelling and recognition that enables real-time MoCap signal evaluation. By recognition of motions (actions) we mean classification. The role of this approach is to propose a syntactic description procedure that can be easily understood, learnt and used in various motion modelling and recognition tasks in all MoCap systems, whether they are vision or wearable-sensor based. To do so we have prepared an extension of the Gesture Description Language (GDL) methodology that enables movement description and real-time recognition so that it can use not only positional coordinates of body joints but virtually any type of discretely measured MoCap output signal, such as accelerometer, magnetometer or gyroscope data. We have also prepared and evaluated a cross-platform implementation of this approach using the Lua scripting language and Java technology. This implementation is called Data Driven GDL (DD-GDL). In the tested scenarios the average execution speed is above 100 frames per second, which is the acquisition rate of many popular MoCap solutions.
Tone Attrition in Mandarin Speakers of Varying English Proficiency
Creel, Sarah C.
2017-01-01
Purpose: The purpose of this study was to determine whether the degree of dominance of Mandarin–English bilinguals' languages affects phonetic processing of tone content in their native language, Mandarin. Method: We tested 72 Mandarin–English bilingual college students with a range of language-dominance profiles in the 2 languages and ages of acquisition of English. Participants viewed 2 photographs at a time while hearing a familiar Mandarin word referring to 1 photograph. The names of the 2 photographs diverged in tone, vowels, or both. Word recognition was evaluated using clicking accuracy, reaction times, and an online recognition measure (gaze) and was compared in the 3 conditions. Results: Relative proficiency in English was correlated with reduced word recognition success in tone-disambiguated trials, but not in vowel-disambiguated trials, across all 3 dependent measures. This selective attrition for tone content emerged even though all bilinguals had learned Mandarin from birth. Lengthy experience with English thus weakened tone use. Conclusions: This finding has implications for the question of the extent to which bilinguals' 2 phonetic systems interact. It suggests that bilinguals may not process pitch information language-specifically and that processing strategies from the dominant language may affect phonetic processing in the nondominant language, even when the latter was learned natively. PMID:28124064
Indonesian Sign Language Number Recognition using SIFT Algorithm
NASA Astrophysics Data System (ADS)
Mahfudi, Isa; Sarosa, Moechammad; Andrie Asmara, Rosa; Azrino Gustalika, M.
2018-04-01
Indonesian Sign Language (ISL) is generally used by deaf individuals and people with communication difficulties to communicate. They use sign language as their primary language, which consists of two types of action: signs and finger spelling. However, not everyone understands sign language, which makes it a problem for them to communicate with hearing people. This problem is also a factor in their feeling isolated from social life. A solution is needed that can help them interact with hearing people. Much research offers a variety of methods for solving the problem of sign language recognition based on image processing. The SIFT (Scale Invariant Feature Transform) algorithm is one of the methods that can be used to identify an object. SIFT is claimed to be very resistant to scaling, rotation, illumination and noise. Using the SIFT algorithm for Indonesian sign language number recognition gives a recognition rate of 82% on a dataset of 100 sample images, consisting of 50 images for training and 50 images for testing. Changing the threshold value affects the recognition result. The best threshold value is 0.45, with a recognition rate of 94%.
The Role of Morphology in Word Recognition of Hebrew as a Templatic Language
ERIC Educational Resources Information Center
Oganyan, Marina
2017-01-01
Research on recognition of complex words has primarily focused on affixational complexity in concatenative languages. This dissertation investigates both templatic and affixational complexity in Hebrew, a templatic language, with particular focus on the role of the root and template morphemes in recognition. It also explores the role of morphology…
Foundations for a syntactic pattern recognition system for genomic DNA sequences
DOE Office of Scientific and Technical Information (OSTI.GOV)
Searles, D.B.
1993-03-01
The goal of the proposed work is the creation of a software system that will perform sophisticated pattern recognition and related functions at a level of abstraction and with expressive power beyond current general-purpose pattern-matching systems for biological sequences; and with a more uniform language, environment, and graphical user interface, and with greater flexibility, extensibility, embeddability, and ability to incorporate other algorithms, than current special-purpose analytic software.
Modern & Classical Languages: K-12 Program Evaluation 1988-89.
ERIC Educational Resources Information Center
Martinez, Margaret Perea
This evaluation of the modern and classical languages programs, K-12, in the Albuquerque (New Mexico) public school system provides general information on the program's history, philosophy, recognition, curriculum development, teachers, and activities. Specific information is offered on the different program components, namely, the elementary…
Artificial intelligence, expert systems, computer vision, and natural language processing
NASA Technical Reports Server (NTRS)
Gevarter, W. B.
1984-01-01
An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.
Orthographic effects in spoken word recognition: Evidence from Chinese.
Qu, Qingqing; Damian, Markus F
2017-06-01
Extensive evidence from alphabetic languages demonstrates a role of orthography in the processing of spoken words. Because alphabetic systems explicitly code speech sounds, such effects are perhaps not surprising. However, it is less clear whether orthographic codes are involuntarily accessed from spoken words in languages with non-alphabetic systems, in which the sound-spelling correspondence is largely arbitrary. We investigated the role of orthography via a semantic relatedness judgment task: native Mandarin speakers judged whether or not spoken word pairs were related in meaning. Word pairs were either semantically related, orthographically related, or unrelated. Results showed that relatedness judgments were made faster for word pairs that were semantically related than for unrelated word pairs. Critically, orthographic overlap on semantically unrelated word pairs induced a significant increase in response latencies. These findings indicate that orthographic information is involuntarily accessed in spoken-word recognition, even in a non-alphabetic language such as Chinese.
2010-03-01
allows the programmer to use the English language in an expressive manner while still maintaining the logical structure of a programming language ( Pressman ...and Choudhury Tanzeem. 2000. Face Recognition for Smart Environments, IEEE Computer, pp. 50–55. Pressman , Roger. 2010. Software Engineering A
NASA Astrophysics Data System (ADS)
Kardava, Irakli; Tadyszak, Krzysztof; Gulua, Nana; Jurga, Stefan
2017-02-01
For more flexible environmental perception by artificial intelligence, supporting software modules are needed that can automate the creation of a specific language syntax and carry out further analysis for relevant decisions based on semantic functions. With our proposed approach, it is possible to create pairs of formal rules for given sentences (in the case of natural languages) or statements (in the case of special languages) with the help of computer vision, speech recognition or an editable text conversion system, for further automatic improvement. In other words, we have developed an approach that can significantly improve the automation of the training process of artificial intelligence, which as a result will give it a higher level of self-developing skills, independent of us (the users). Based on our approach we have developed a demo version of the software, which includes the algorithm and software code for implementing all of the above-mentioned components (computer vision, speech recognition and editable text conversion system). The program can work in a multi-stream mode and simultaneously create a syntax based on information received from several sources.
Improving Tone Recognition with Nucleus Modeling and Sequential Learning
ERIC Educational Resources Information Center
Wang, Siwei
2010-01-01
Mandarin Chinese and many other tonal languages use tones, defined as specific pitch patterns, to distinguish otherwise ambiguous syllables. It has been shown that tones carry at least as much information as vowels in Mandarin Chinese [Surendran et al., 2005]. Surprisingly, though, many speech recognition systems for Mandarin Chinese have…
ERIC Educational Resources Information Center
McKee, Rachel Locker; Manning, Victoria
2015-01-01
Status planning through legislation made New Zealand Sign Language (NZSL) an official language in 2006. But this strong symbolic action did not create resources or mechanisms to further the aims of the act. In this article we discuss the extent to which legal recognition and ensuing language-planning activities by state and community have affected…
ERIC Educational Resources Information Center
Morford, Jill P.; Kroll, Judith F.; Piñar, Pilar; Wilkinson, Erin
2014-01-01
Recent evidence demonstrates that American Sign Language (ASL) signs are active during print word recognition in deaf bilinguals who are highly proficient in both ASL and English. In the present study, we investigate whether signs are active during print word recognition in two groups of unbalanced bilinguals: deaf ASL-dominant and hearing…
NOBLE - Flexible concept recognition for large-scale biomedical natural language processing.
Tseytlin, Eugene; Mitchell, Kevin; Legowski, Elizabeth; Corrigan, Julia; Chavan, Girish; Jacobson, Rebecca S
2016-01-14
Natural language processing (NLP) applications are increasingly important in biomedical data analysis, knowledge engineering, and decision support. Concept recognition is an important component task for NLP pipelines, and can be either general-purpose or domain-specific. We describe a novel, flexible, and general-purpose concept recognition component for NLP pipelines, and compare its speed and accuracy against five commonly used alternatives on both a biological and clinical corpus. NOBLE Coder implements a general algorithm for matching terms to concepts from an arbitrary vocabulary set. The system's matching options can be configured individually or in combination to yield specific system behavior for a variety of NLP tasks. The software is open source, freely available, and easily integrated into UIMA or GATE. We benchmarked speed and accuracy of the system against the CRAFT and ShARe corpora as reference standards and compared it to MMTx, MGrep, Concept Mapper, cTAKES Dictionary Lookup Annotator, and cTAKES Fast Dictionary Lookup Annotator. We describe key advantages of the NOBLE Coder system and associated tools, including its greedy algorithm, configurable matching strategies, and multiple terminology input formats. These features provide unique functionality when compared with existing alternatives, including state-of-the-art systems. On two benchmarking tasks, NOBLE's performance exceeded commonly used alternatives, performing almost as well as the most advanced systems. Error analysis revealed differences in error profiles among systems. NOBLE Coder is comparable to other widely used concept recognition systems in terms of accuracy and speed. Advantages of NOBLE Coder include its interactive terminology builder tool, ease of configuration, and adaptability to various domains and tasks. NOBLE provides a term-to-concept matching system suitable for general concept recognition in biomedical NLP pipelines.
A multilingual gold-standard corpus for biomedical concept recognition: the Mantra GSC
Clematide, Simon; Akhondi, Saber A; van Mulligen, Erik M; Rebholz-Schuhmann, Dietrich
2015-01-01
Objective: To create a multilingual gold-standard corpus for biomedical concept recognition. Materials and methods: We selected text units from different parallel corpora (Medline abstract titles, drug labels, biomedical patent claims) in English, French, German, Spanish, and Dutch. Three annotators per language independently annotated the biomedical concepts, based on a subset of the Unified Medical Language System and covering a wide range of semantic groups. To reduce the annotation workload, automatically generated preannotations were provided. Individual annotations were automatically harmonized and then adjudicated, and cross-language consistency checks were carried out to arrive at the final annotations. Results: The number of final annotations was 5530. Inter-annotator agreement scores indicate good agreement (median F-score 0.79), and are similar to those between individual annotators and the gold standard. The automatically generated harmonized annotation set for each language performed equally well as the best annotator for that language. Discussion: The use of automatic preannotations, harmonized annotations, and parallel corpora helped to keep the manual annotation efforts manageable. The inter-annotator agreement scores provide a reference standard for gauging the performance of automatic annotation techniques. Conclusion: To our knowledge, this is the first gold-standard corpus for biomedical concept recognition in languages other than English. Other distinguishing features are the wide variety of semantic groups that are being covered, and the diversity of text genres that were annotated. PMID:25948699
Kannada character recognition system using neural network
NASA Astrophysics Data System (ADS)
Kumar, Suresh D. S.; Kamalapuram, Srinivasa K.; Kumar, Ajay B. R.
2013-03-01
Handwriting recognition has been one of the active and challenging research areas in the field of pattern recognition. It has numerous applications, including reading aids for the blind, bank cheque processing and conversion of handwritten documents into structured text form. There is an insufficient body of work on Indian language character recognition, especially for the Kannada script, one of the 15 major scripts in India. In this paper an attempt is made to recognize handwritten Kannada characters using feed-forward neural networks. A handwritten Kannada character is resized to 20x30 pixels. The resized character is used for training the neural network. Once the training process is completed, the same character is given as input to the neural network with different numbers of neurons in the hidden layer, and the recognition accuracy rates for different Kannada characters are calculated and compared. The results show that the proposed system yields good recognition accuracy rates comparable to those of other handwritten character recognition systems.
The Design of Hand Gestures for Human-Computer Interaction: Lessons from Sign Language Interpreters.
Rempel, David; Camilleri, Matt J; Lee, David L
2015-10-01
The design and selection of 3D modeled hand gestures for human-computer interaction should follow principles of natural language combined with the need to optimize gesture contrast and recognition. The selection should also consider the discomfort and fatigue associated with distinct hand postures and motions, especially for common commands. Sign language interpreters have extensive and unique experience forming hand gestures and many suffer from hand pain while gesturing. Professional sign language interpreters (N=24) rated discomfort for hand gestures associated with 47 characters and words and 33 hand postures. Clear associations of discomfort with hand postures were identified. In a nominal logistic regression model, high discomfort was associated with gestures requiring a flexed wrist, discordant adjacent fingers, or extended fingers. These and other findings should be considered in the design of hand gestures to optimize the relationship between human cognitive and physical processes and computer gesture recognition systems for human-computer input.
Deaf-And-Mute Sign Language Generation System
NASA Astrophysics Data System (ADS)
Kawai, Hideo; Tamura, Shinichi
1984-08-01
We have developed a system which can recognize speech and generate the corresponding animation-like sign language sequence. The system is implemented on a popular personal computer, which has three video RAMs and a voice recognition board that can recognize only the registered voice of a specific speaker. Presently, forty sign language patterns and fifty finger spellings are stored on two floppy disks. Each sign pattern is composed of one to four sub-patterns: if the pattern is composed of one sub-pattern, it is displayed as a still pattern; otherwise, it is displayed as a motion pattern. This system will help communication between deaf-and-mute persons and hearing persons. To achieve high display speed, almost all programs are written in machine language.
Markert, H; Kaufmann, U; Kara Kayikci, Z; Palm, G
2009-03-01
Language understanding is a long-standing problem in computer science. However, the human brain is capable of processing complex languages with seemingly no difficulties. This paper shows a model for language understanding using biologically plausible neural networks composed of associative memories. The model is able to deal with ambiguities on the single word and grammatical level. The language system is embedded into a robot in order to demonstrate the correct semantical understanding of the input sentences by letting the robot perform corresponding actions. For that purpose, a simple neural action planning system has been combined with neural networks for visual object recognition and visual attention control mechanisms.
Post interaural neural net-based vowel recognition
NASA Astrophysics Data System (ADS)
Jouny, Ismail I.
2001-10-01
Interaural head related transfer functions are used to process speech signatures prior to neural net based recognition. Data representing the head related transfer function of a dummy has been collected at MIT and made available on the Internet. This data is used to pre-process vowel signatures to mimic the effects of human ear on speech perception. Signatures representing various vowels of the English language are then presented to a multi-layer perceptron trained using the back propagation algorithm for recognition purposes. The focus in this paper is to assess the effects of human interaural system on vowel recognition performance particularly when using a classification system that mimics the human brain such as a neural net.
Individual differences in language and working memory affect children's speech recognition in noise.
McCreery, Ryan W; Spratford, Meredith; Kirby, Benjamin; Brennan, Marc
2017-05-01
We examined how cognitive and linguistic skills affect speech recognition in noise for children with normal hearing. Children with better working memory and language abilities were expected to have better speech recognition in noise than peers with poorer skills in these domains. As part of a prospective, cross-sectional study, children with normal hearing completed speech recognition in noise for three types of stimuli: (1) monosyllabic words, (2) syntactically correct but semantically anomalous sentences and (3) semantically and syntactically anomalous word sequences. Measures of vocabulary, syntax and working memory were used to predict individual differences in speech recognition in noise. The participants were ninety-six children with normal hearing, who were between 5 and 12 years of age. Higher working memory was associated with better speech recognition in noise for all three stimulus types. Higher vocabulary abilities were associated with better recognition in noise for sentences and word sequences, but not for words. Working memory and language both influence children's speech recognition in noise, but the relationships vary across types of stimuli. These findings suggest that clinical assessment of speech recognition is likely to reflect underlying cognitive and linguistic abilities, in addition to a child's auditory skills, consistent with the Ease of Language Understanding model.
Automatic speech recognition (ASR) based approach for speech therapy of aphasic patients: A review
NASA Astrophysics Data System (ADS)
Jamal, Norezmi; Shanta, Shahnoor; Mahmud, Farhanahani; Sha'abani, MNAH
2017-09-01
This paper reviews the state of the art in automatic speech recognition (ASR) based approaches for speech therapy of aphasic patients. Aphasia is a condition in which the affected person suffers from a speech and language disorder resulting from a stroke or brain injury. Since there is a growing body of evidence indicating the possibility of improving the symptoms at an early stage, ASR-based solutions are increasingly being researched for speech and language therapy. ASR is a technology that converts human speech into transcript text by matching it against the system's library. This is particularly useful in speech rehabilitation therapies, as it provides accurate, real-time evaluation of speech input from an individual with a speech disorder. ASR-based approaches for speech therapy recognize the speech input from the aphasic patient and provide real-time feedback on their mistakes. However, the accuracy of ASR depends on many factors, such as phoneme recognition, speech continuity, speaker and environmental differences, as well as our depth of knowledge of human language understanding. Hence, the review examines recent developments in ASR technologies and their performance for individuals with speech and language disorders.
ERIC Educational Resources Information Center
Loucas, Tom; Riches, Nick; Baird, Gillian; Pickles, Andrew; Simonoff, Emily; Chandler, Susie; Charman, Tony
2013-01-01
Spoken word recognition, during gating, appears intact in specific language impairment (SLI). This study used gating to investigate the process in adolescents with autism spectrum disorders plus language impairment (ALI). Adolescents with ALI, SLI, and typical language development (TLD), matched on nonverbal IQ listened to gated words that varied…
Conclusiveness of natural languages and recognition of images
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wojcik, Z.M.
1983-01-01
The conclusiveness is investigated using recognition processes and one-one correspondence between expressions of a natural language and graphs representing events. The graphs, as conceived in psycholinguistics, are obtained as a result of perception processes. It is possible to generate and process the graphs automatically, using computers and then to convert the resulting graphs into expressions of a natural language. Correctness and conclusiveness of the graphs and sentences are investigated using the fundamental condition for events representation processes. Some consequences of the conclusiveness are discussed, e.g. undecidability of arithmetic, human brain asymmetry, correctness of statistical calculations and operations research. It is suggested that the group theory should be imposed on mathematical models of any real system. Proof of the fundamental condition is also presented. 14 references.
Complex Event Recognition Architecture
NASA Technical Reports Server (NTRS)
Fitzgerald, William A.; Firby, R. James
2009-01-01
Complex Event Recognition Architecture (CERA) is the name of a computational architecture, and software that implements the architecture, for recognizing complex event patterns that may be spread across multiple streams of input data. One of the main components of CERA is an intuitive event pattern language that simplifies what would otherwise be the complex, difficult tasks of creating logical descriptions of combinations of temporal events and defining rules for combining information from different sources over time. In this language, recognition patterns are defined in simple, declarative statements that combine point events from given input streams with those from other streams, using conjunction, disjunction, and negation. Patterns can be built on one another recursively to describe very rich, temporally extended combinations of events. Thereafter, a run-time matching algorithm in CERA efficiently matches these patterns against input data and signals when patterns are recognized. CERA can be used to monitor complex systems and to signal operators or initiate corrective actions when anomalous conditions are recognized. CERA can be run as a stand-alone monitoring system, or it can be integrated into a larger system to automatically trigger responses to changing environments or problematic situations.
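The idea of composing recognition patterns from conjunction, disjunction, and negation, and of building patterns on one another, can be illustrated with a minimal Python sketch. This is not CERA's pattern language or its matching algorithm: the function names (`event`, `all_of`, `any_of`, `negation`) and the event names are hypothetical, and the sketch ignores timing by treating the input as a plain set of event names observed within some window.

```python
from typing import Callable, FrozenSet

# A pattern is modelled as a predicate over the set of event names seen so far.
# Assumption: temporal ordering and stream identity are ignored in this sketch.
Pattern = Callable[[FrozenSet[str]], bool]


def event(name: str) -> Pattern:
    """Pattern that matches when a single named point event has been seen."""
    return lambda seen: name in seen


def all_of(*patterns: Pattern) -> Pattern:
    """Conjunction: every sub-pattern must match."""
    return lambda seen: all(p(seen) for p in patterns)


def any_of(*patterns: Pattern) -> Pattern:
    """Disjunction: at least one sub-pattern must match."""
    return lambda seen: any(p(seen) for p in patterns)


def negation(pattern: Pattern) -> Pattern:
    """Negation: matches when the sub-pattern does not."""
    return lambda seen: not pattern(seen)


# Patterns built recursively on other patterns (hypothetical event names).
overheating = all_of(event("temp_high"), negation(event("fan_on")))
alarm = any_of(overheating, event("smoke_detected"))

print(alarm(frozenset({"temp_high"})))            # True
print(alarm(frozenset({"temp_high", "fan_on"})))  # False
```

Representing patterns as composable predicates keeps each definition declarative; a real system would additionally need time windows and incremental matching over streams.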
Paats, A; Alumäe, T; Meister, E; Fridolin, I
2018-04-30
The aim of this study was to analyze retrospectively the influence of different acoustic and language models in order to determine the most important effects on the clinical performance of an Estonian language-based non-commercial radiology-oriented automatic speech recognition (ASR) system. An ASR system was developed for the Estonian language in the radiology domain by utilizing open-source software components (Kaldi toolkit, Thrax). The ASR system was trained with real radiology text reports and dictations collected during the development phases. The final version of the ASR system was tested by 11 radiologists who dictated 219 reports in total, in a spontaneous manner in a real clinical environment. The audio files collected in the final phase were used to measure the performance of different versions of the ASR system retrospectively. ASR system versions were evaluated by word error rate (WER) for each speaker and modality and by WER difference between the first and the last version of the ASR system. Total average WER over all material improved from 18.4% for the first version (v1) to 5.8% for the last version (v8), which corresponds to a relative improvement of 68.5%. WER improvement was strongly related to modality and radiologist. In summary, the performance of the final ASR system version was close to optimal, delivering similar results for all modalities and being independent of the user, the complexity of the radiology reports, user experience, and speech characteristics.
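The reported relative improvement can be checked from the two WER figures, since (18.4 − 5.8) / 18.4 ≈ 0.685. The Python sketch below reproduces that arithmetic and, for context, computes WER with the standard word-level edit-distance definition (substitutions, deletions and insertions divided by the number of reference words). The function names are illustrative, and the abstract does not state how the study's scoring was implemented.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed part of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_improvement(wer_first: float, wer_last: float) -> float:
    """Relative WER reduction, as a percentage of the first version's WER."""
    if wer_first <= 0:
        raise ValueError("wer_first must be positive")
    return 100.0 * (wer_first - wer_last) / wer_first


print(f"{relative_wer_improvement(18.4, 5.8):.1f}%")  # 68.5%
```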
Episodic memory retrieval in adolescents with and without developmental language disorder (DLD).
Lee, Joanna C
2018-03-01
Two reasons may explain the discrepant findings regarding declarative memory in developmental language disorder (DLD) in the literature. First, standardized tests are one of the primary tools used to assess declarative memory in previous studies. It is possible they are not sensitive enough to subtle memory impairment. Second, the system underlying declarative memory is complex, and thus results may vary depending on the types of encoding and retrieval processes measured (e.g., item specific or relational) and/or task demands (e.g., recall or recognition during memory retrieval). The aim was to adopt an experimental paradigm to examine episodic memory functioning in adolescents with and without DLD, with the focus on memory recognition of item-specific and relational information. Two groups of adolescents, one with DLD (n = 23; mean age = 16.73 years) and the other without (n = 23; mean age = 16.75 years), participated in the study. The Relational and Item-Specific Encoding (RISE) paradigm was used to assess the effect of different encoding processes on episodic memory retrieval in DLD. The advantage of using the RISE task is that both item-specific and relational encoding/retrieval can be examined within the same learning paradigm. Adolescents with DLD and those with typical language development showed comparable engagement during the encoding phase. The DLD group showed significantly poorer item recognition than the comparison group. Associative recognition was not significantly different between the two groups; however, there was a non-significant trend for it to be poorer in the DLD group than in the comparison group, suggesting a possible impairment in associative recognition in individuals with DLD, but to a lesser magnitude. These results indicate that adolescents with DLD have difficulty with episodic memory retrieval when stimuli are encoded and retrieved without support from contextual information.
Associative recognition is relatively less affected than item recognition in adolescents with DLD. © 2017 Royal College of Speech and Language Therapists.
Individual differences in online spoken word recognition: Implications for SLI
McMurray, Bob; Samelson, Vicki M.; Lee, Sung Hee; Tomblin, J. Bruce
2012-01-01
Thirty years of research has uncovered the broad principles that characterize spoken word processing across listeners. However, there have been few systematic investigations of individual differences. Such an investigation could help refine models of word recognition by indicating which processing parameters are likely to vary, and could also have important implications for work on language impairment. The present study begins to fill this gap by relating individual differences in overall language ability to variation in online word recognition processes. Using the visual world paradigm, we evaluated online spoken word recognition in adolescents who varied in both basic language abilities and non-verbal cognitive abilities. Eye movements to target, cohort and rhyme objects were monitored during spoken word recognition, as an index of lexical activation. Adolescents with poor language skills showed fewer looks to the target and more fixations to the cohort and rhyme competitors. These results were compared to a number of variants of the TRACE model (McClelland & Elman, 1986) that were constructed to test a range of theoretical approaches to language impairment: impairments at sensory and phonological levels; vocabulary size, and generalized slowing. None were strongly supported, and variation in lexical decay offered the best fit. Thus, basic word recognition processes like lexical decay may offer a new way to characterize processing differences in language impairment. PMID:19836014
Speech as a pilot input medium
NASA Technical Reports Server (NTRS)
Plummer, R. P.; Coler, C. R.
1977-01-01
The speech recognition system under development is a trainable pattern classifier based on a maximum-likelihood technique. An adjustable uncertainty threshold allows the rejection of borderline cases for which the probability of misclassification is high. The syntax of the spoken command language may be used as an aid to recognition, and the system adapts to changes in pronunciation if feedback from the user is available. Words must be separated by 0.25-second gaps. The system runs in real time on a mini-computer (PDP 11/10) and was tested on 120,000 speech samples from 10- and 100-word vocabularies. The results of these tests were 99.9% correct recognition for a vocabulary consisting of the ten digits, and 99.6% recognition for a 100-word vocabulary of flight commands, with a 5% rejection rate in each case. With no rejection, the recognition accuracies for the same vocabularies were 99.5% and 98.6% respectively.
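The combination of maximum-likelihood classification with an adjustable rejection threshold can be sketched in a few lines of Python. The sketch makes assumptions that are not in the abstract: each word is modelled by a one-dimensional Gaussian over a single hypothetical feature, all words are equally likely a priori, and a sample is rejected when the estimated probability of misclassification exceeds `max_error_prob`. The actual features and models of the described system are not specified here.

```python
import math
from typing import Dict, Optional, Tuple


def gaussian_log_likelihood(x: float, mean: float, var: float) -> float:
    """Log density of a 1-D Gaussian (assumed class model, for illustration only)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def classify_with_rejection(
    x: float,
    models: Dict[str, Tuple[float, float]],  # label -> (mean, variance)
    max_error_prob: float = 0.05,
) -> Optional[str]:
    """Return the maximum-likelihood label, or None for a borderline sample."""
    log_liks = {
        label: gaussian_log_likelihood(x, mean, var)
        for label, (mean, var) in models.items()
    }
    best = max(log_liks, key=log_liks.get)
    # Posterior of the best class under equal priors, computed stably
    # by subtracting the largest log-likelihood before exponentiating.
    peak = log_liks[best]
    posterior = 1.0 / sum(math.exp(ll - peak) for ll in log_liks.values())
    if 1.0 - posterior > max_error_prob:
        return None  # too uncertain: reject instead of guessing
    return best


# Hypothetical two-word vocabulary with well-separated feature means.
models = {"one": (0.0, 1.0), "two": (4.0, 1.0)}
print(classify_with_rejection(0.1, models))  # one
print(classify_with_rejection(2.0, models))  # None (equidistant, so rejected)
```

Raising `max_error_prob` rejects fewer samples at the cost of more misclassifications, which mirrors the trade-off between rejection rate and accuracy reported above.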
Multi-font printed Mongolian document recognition system
NASA Astrophysics Data System (ADS)
Peng, Liangrui; Liu, Changsong; Ding, Xiaoqing; Wang, Hua; Jin, Jianming
2009-01-01
Mongolian is one of the major ethnic languages in China. A large number of printed Mongolian documents need to be digitized for digital libraries and various applications. Traditional Mongolian script has a unique writing style and multi-font-type variations, which bring challenges to Mongolian OCR research. Because traditional Mongolian script has some special characteristics (for example, one character may be part of another character), we define the character set for recognition according to the segmented components, and the components are combined into characters by a rule-based post-processing module. For character recognition, a method based on visual directional features and multi-level classifiers is presented. For character segmentation, a scheme is used to find the segmentation point by analyzing the properties of projection and connected components. As Mongolian has different font types, which are categorized into two major groups, the segmentation parameter is adjusted for each group. A font-type classification method for the two font-type groups is introduced. For recognition of Mongolian text mixed with Chinese and English, language identification and relevant character recognition kernels are integrated. Experiments show that the presented methods are effective. The text recognition rate is 96.9% on test samples from practical documents with multiple font types and mixed scripts.
Word-level recognition of multifont Arabic text using a feature vector matching approach
NASA Astrophysics Data System (ADS)
Erlandson, Erik J.; Trenkle, John M.; Vogt, Robert C., III
1996-03-01
Many text recognition systems recognize text imagery at the character level and assemble words from the recognized characters. An alternative approach is to recognize text imagery at the word level, without analyzing individual characters. This approach avoids the problem of individual character segmentation, and can overcome local errors in character recognition. A word-level recognition system for machine-printed Arabic text has been implemented. Arabic is a script language, and is therefore difficult to segment at the character level. Character segmentation has been avoided by recognizing text imagery of complete words. The Arabic recognition system computes a vector of image-morphological features on a query word image. This vector is matched against a precomputed database of vectors from a lexicon of Arabic words. Vectors from the database with the highest match score are returned as hypotheses for the unknown image. Several feature vectors may be stored for each word in the database. Database feature vectors generated using multiple fonts and noise models allow the system to be tuned to its input stream. Used in conjunction with database pruning techniques, this Arabic recognition system has obtained promising word recognition rates on low-quality multifont text imagery.
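The matching step described above (a query feature vector compared against a database that may hold several vectors per lexicon word) might look something like the sketch below. The abstract does not say how the match score is computed, so cosine similarity is used here purely as a stand-in; the three-dimensional vectors and the lexicon entries are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def match_word(query, database, top_k=3):
    """Rank lexicon words by their best-matching stored vector."""
    scores = {
        word: max(cosine(query, vec) for vec in vectors)
        for word, vectors in database.items()
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]

# Hypothetical database: several vectors per word stand in for the
# multiple fonts and noise models mentioned in the abstract.
database = {
    "kitab": [[0.9, 0.1, 0.3], [0.8, 0.2, 0.4]],
    "qalam": [[0.1, 0.9, 0.2]],
    "bayt": [[0.3, 0.3, 0.9]],
}
hypotheses = match_word([0.85, 0.15, 0.35], database, top_k=2)
```

Taking the maximum over a word's stored vectors is one simple way to let extra font or noise variants help a word match without penalizing it for the variants that do not apply.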
Structuring Broadcast Audio for Information Access
NASA Astrophysics Data System (ADS)
Gauvain, Jean-Luc; Lamel, Lori
2003-12-01
One rapidly expanding application area for state-of-the-art speech recognition technology is the automatic processing of broadcast audiovisual data for information access. Since much of the linguistic information is found in the audio channel, speech recognition is a key enabling technology which, when combined with information retrieval techniques, can be used for searching large audiovisual document collections. Audio indexing must take into account the specificities of audio data such as needing to deal with the continuous data stream and an imperfect word transcription. Other important considerations are dealing with language specificities and facilitating language portability. At Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI), broadcast news transcription systems have been developed for seven languages: English, French, German, Mandarin, Portuguese, Spanish, and Arabic. The transcription systems have been integrated into prototype demonstrators for several application areas such as audio data mining, structuring audiovisual archives, selective dissemination of information, and topic tracking for media monitoring. As examples, this paper addresses the spoken document retrieval and topic tracking tasks.
Neural Insights into the Relation between Language and Communication
Willems, Roel M.; Varley, Rosemary
2010-01-01
The human capacity to communicate has been hypothesized to be causally dependent upon language. Intuitively this seems plausible since most communication relies on language. Moreover, intention recognition abilities (as a necessary prerequisite for communication) and language development seem to co-develop. Here we review evidence from neuroimaging as well as from neuropsychology to evaluate the relationship between communicative and linguistic abilities. Our review indicates that communicative abilities are best considered as neurally distinct from language abilities. This conclusion is based upon evidence showing that humans rely on different cortical systems when designing a communicative message for someone else as compared to when performing core linguistic tasks, as well as upon observations of individuals with severe language loss after extensive lesions to the language system, who are still able to perform tasks involving intention understanding. PMID:21151364
Handwritten recognition of Tamil vowels using deep learning
NASA Astrophysics Data System (ADS)
Ram Prashanth, N.; Siddarth, B.; Ganesh, Anirudh; Naveen Kumar, Vaegae
2017-11-01
We come across a large volume of handwritten texts in our daily lives, and handwritten character recognition has long been an important area of research in pattern recognition. The complexity of the task varies among languages, largely because of language-specific properties such as the similarity between characters, their distinct shapes, and the number of characters. There have been numerous works on character recognition for the English alphabet, with laudable success, but regional languages have been addressed less often and not with similar accuracy. In this paper, we explored the performance of Deep Belief Networks in the classification of handwritten Tamil vowels, and conclusively compared the results obtained. The proposed method has shown satisfactory recognition accuracy in light of the difficulties faced with regional languages, such as the similarity between characters and the minute nuances that differentiate them. We can further extend this to all the Tamil characters.
Mispronunciation Detection for Language Learning and Speech Recognition Adaptation
ERIC Educational Resources Information Center
Ge, Zhenhao
2013-01-01
The areas of "mispronunciation detection" (or "accent detection" more specifically) within the speech recognition community are receiving increased attention now. Two application areas, namely language learning and speech recognition adaptation, are largely driving this research interest and are the focal points of this work.…
Mirror neurons, birdsong, and human language: a hypothesis.
Levy, Florence
2011-01-01
The mirror system hypothesis and investigations of birdsong are reviewed in relation to the significance for the development of human symbolic and language capacity, in terms of three fundamental forms of cognitive reference: iconic, indexical, and symbolic. Mirror systems are initially iconic but can progress to indexical reference when produced without the need for concurrent stimuli. Developmental stages in birdsong are also explored with reference to juvenile subsong vs complex stereotyped adult syllables, as an analogy with human language development. While birdsong remains at an indexical reference stage, human language benefits from the capacity for symbolic reference. During a pre-linguistic "babbling" stage, recognition of native phonemic categories is established, allowing further development of subsequent prefrontal and linguistic circuits for sequential language capacity.
Mirror Neurons, Birdsong, and Human Language: A Hypothesis
Levy, Florence
2012-01-01
The mirror system hypothesis and investigations of birdsong are reviewed in relation to the significance for the development of human symbolic and language capacity, in terms of three fundamental forms of cognitive reference: iconic, indexical, and symbolic. Mirror systems are initially iconic but can progress to indexical reference when produced without the need for concurrent stimuli. Developmental stages in birdsong are also explored with reference to juvenile subsong vs complex stereotyped adult syllables, as an analogy with human language development. While birdsong remains at an indexical reference stage, human language benefits from the capacity for symbolic reference. During a pre-linguistic “babbling” stage, recognition of native phonemic categories is established, allowing further development of subsequent prefrontal and linguistic circuits for sequential language capacity. PMID:22287950
A multilingual gold-standard corpus for biomedical concept recognition: the Mantra GSC.
Kors, Jan A; Clematide, Simon; Akhondi, Saber A; van Mulligen, Erik M; Rebholz-Schuhmann, Dietrich
2015-09-01
To create a multilingual gold-standard corpus for biomedical concept recognition. We selected text units from different parallel corpora (Medline abstract titles, drug labels, biomedical patent claims) in English, French, German, Spanish, and Dutch. Three annotators per language independently annotated the biomedical concepts, based on a subset of the Unified Medical Language System and covering a wide range of semantic groups. To reduce the annotation workload, automatically generated preannotations were provided. Individual annotations were automatically harmonized and then adjudicated, and cross-language consistency checks were carried out to arrive at the final annotations. The number of final annotations was 5530. Inter-annotator agreement scores indicate good agreement (median F-score 0.79), and are similar to those between individual annotators and the gold standard. The automatically generated harmonized annotation set for each language performed equally well as the best annotator for that language. The use of automatic preannotations, harmonized annotations, and parallel corpora helped to keep the manual annotation efforts manageable. The inter-annotator agreement scores provide a reference standard for gauging the performance of automatic annotation techniques. To our knowledge, this is the first gold-standard corpus for biomedical concept recognition in languages other than English. Other distinguishing features are the wide variety of semantic groups that are being covered, and the diversity of text genres that were annotated. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
A Novel Approach towards Medical Entity Recognition in Chinese Clinical Text
Yu, Jian
2017-01-01
Medical entity recognition, a basic task in the language processing of clinical data, has been extensively studied in analyzing admission notes in alphabetic languages such as English. However, much less work has been done on nonstructural texts that are written in Chinese, or in the setting of differentiation of Chinese drug names between traditional Chinese medicine and Western medicine. Here, we propose a novel cascade-type Chinese medication entity recognition approach that aims at integrating the sentence category classifier from a support vector machine and the conditional random field-based medication entity recognition. We hypothesized that this approach could avoid the side effects of abundant negative samples and improve the performance of the named entity recognition from admission notes written in Chinese. Therefore, we applied this approach to a test set of 324 Chinese-written admission notes with manual annotation by medical experts. Our data demonstrated that this approach had a score of 94.2% in precision, 92.8% in recall, and 93.5% in F-measure for the recognition of traditional Chinese medicine drug names and 91.2% in precision, 92.6% in recall, and 91.7% F-measure for the recognition of Western medicine drug names. The differences in F-measure were significant compared with those in the baseline systems. PMID:29065612
Cross domains Arabic named entity recognition system
NASA Astrophysics Data System (ADS)
Al-Ahmari, S. Saad; Abdullatif Al-Johar, B.
2016-07-01
Named Entity Recognition (NER) plays an important role in many Natural Language Processing (NLP) applications, such as Information Extraction (IE), Question Answering (QA), Text Clustering, Text Summarization and Word Sense Disambiguation. This paper presents the development and implementation of a domain-independent system to recognize three types of Arabic named entities. The system is based on a set of domain-independent grammar rules along with an Arabic part-of-speech tagger, in addition to gazetteers and lists of trigger words. The experimental results show that the system performed as well as other systems, with better results in some cases on cross-domain corpora.
Verification Processes in Recognition Memory: The Role of Natural Language Mediators
ERIC Educational Resources Information Center
Marshall, Philip H.; Smith, Randolph A. S.
1977-01-01
The existence of verification processes in recognition memory was confirmed in the context of Adams' (Adams & Bray, 1970) closed-loop theory. Subjects' recognition was tested following a learning session. The expectation was that data would reveal consistent internal relationships supporting the position that natural language mediation plays…
Automatic translation among spoken languages
NASA Technical Reports Server (NTRS)
Walter, Sharon M.; Costigan, Kelly
1994-01-01
The Machine Aided Voice Translation (MAVT) system was developed in response to the shortage of experienced military field interrogators with both foreign language proficiency and interrogation skills. Combining speech recognition, machine translation, and speech generation technologies, the MAVT accepts an interrogator's spoken English question and translates it into spoken Spanish. The spoken Spanish response of the potential informant can then be translated into spoken English. Potential military and civilian applications for automatic spoken language translation technology are discussed in this paper.
ERIC Educational Resources Information Center
Nishitani, Mari; Matsuda, Toshiki
2011-01-01
Research on language anxiety has so far focused on the level of language anxiety. This study instead hypothesizes that the interpretation of anxiety and the recognition of failure have an impact on learning, and investigates how language anxiety and intrinsic motivation affect the use of learning strategies through the recognition of failure.…
NASA Technical Reports Server (NTRS)
Wolf, Jared J.
1977-01-01
The following research was discussed: (1) speech signal processing; (2) automatic speech recognition; (3) continuous speech understanding; (4) speaker recognition; (5) speech compression; (6) subjective and objective evaluation of speech communication system; (7) measurement of the intelligibility and quality of speech when degraded by noise or other masking stimuli; (8) speech synthesis; (9) instructional aids for second-language learning and for training of the deaf; and (10) investigation of speech correlates of psychological stress. Experimental psychology, control systems, and human factors engineering, which are often relevant to the proper design and operation of speech systems are described.
Speech to Text Translation for Malay Language
NASA Astrophysics Data System (ADS)
Al-khulaidi, Rami Ali; Akmeliawati, Rini
2017-11-01
The speech recognition system is a front-end and back-end process that receives an audio signal uttered by a speaker and converts it into a text transcription. Such a system can be used in several fields, including therapeutic technology, education, social robotics and computer entertainment. Our system is proposed for control tasks, in which speed of performance and response matters because the system must integrate with other control platforms, such as voice-controlled robots. This creates a need for flexible platforms that can be easily edited to fit the functionality of the surroundings, unlike other software programs, such as MATLAB and Phoenix, that require recorded audio and multiple training passes for every entry. In this paper, a speech recognition system for the Malay language is implemented using Microsoft Visual Studio C#. Ninety Malay phrases were tested by ten speakers of both genders in different contexts. The results show that the overall accuracy (calculated from the confusion matrix) is satisfactory, at 92.69%.
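For readers unfamiliar with how an overall accuracy figure is derived from a confusion matrix, the calculation is the sum of the diagonal (correct recognitions) divided by the sum of all entries. The short sketch below shows this; the 3x3 matrix is made-up illustrative data, not the results from this paper.

```python
def overall_accuracy(confusion):
    """Overall accuracy: diagonal (correct) counts over all counts.

    confusion[i][j] is the number of times phrase i was recognized as j.
    """
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return correct / total

# Illustrative counts only (three phrases, 30 utterances each).
confusion = [
    [28, 1, 1],
    [2, 27, 1],
    [0, 3, 27],
]
accuracy = overall_accuracy(confusion)
```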
Static sign language recognition using 1D descriptors and neural networks
NASA Astrophysics Data System (ADS)
Solís, José F.; Toxqui, Carina; Padilla, Alfonso; Santiago, César
2012-10-01
A framework for static sign language recognition is presented in this work, using descriptors that represent 2D images as 1D data together with artificial neural networks. The 1D descriptors were computed by two methods: the first consists of a correlation rotational operator, and the second is based on contour analysis of the hand shape. One of the main problems in sign language recognition is segmentation; most papers report using a special color for gloves or background for hand shape analysis. To avoid the use of gloves or special clothing, a thermal imaging camera was used to capture images. Static signs were taken from the digits 1 to 9 of American Sign Language; a multilayer perceptron reached 100% recognition with cross-validation.
Integrating Computer-Assisted Language Learning in Saudi Schools: A Change Model
ERIC Educational Resources Information Center
Alresheed, Saleh; Leask, Marilyn; Raiker, Andrea
2015-01-01
Computer-assisted language learning (CALL) technology and pedagogy have gained recognition globally for their success in supporting second language acquisition (SLA). In Saudi Arabia, the government aims to provide most educational institutions with computers and networking for integrating CALL into classrooms. However, the recognition of CALL's…
Foreign Language Analysis and Recognition (FLARE) Progress
2015-02-01
Copies may be obtained from the Defense Technical Information Center (DTIC) (http://www.dtic.mil). AFRL-RH-WP-TR-2015-0007 HAS BEEN REVIEWED AND IS... retrieval (IR). 15. SUBJECT TERMS Automatic speech recognition (ASR), information retrieval (IR). 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF...to the Haystack Multilingual Multimedia Information Extraction and Retrieval (MMIER) system that was initially developed under a prior work unit
Semantic and phonological schema influence spoken word learning and overnight consolidation.
Havas, Viktória; Taylor, Jsh; Vaquero, Lucía; de Diego-Balaguer, Ruth; Rodríguez-Fornells, Antoni; Davis, Matthew H
2018-06-01
We studied the initial acquisition and overnight consolidation of new spoken words that resemble words in the native language (L1) or in an unfamiliar, non-native language (L2). Spanish-speaking participants learned the spoken forms of novel words in their native language (Spanish) or in a different language (Hungarian), which were paired with pictures of familiar or unfamiliar objects, or no picture. We thereby assessed, in a factorial way, the impact of existing knowledge (schema) on word learning by manipulating both semantic (familiar vs unfamiliar objects) and phonological (L1- vs L2-like novel words) familiarity. Participants were trained and tested with a 12-hr intervening period that included overnight sleep or daytime awake. Our results showed (1) benefits of sleep to recognition memory that were greater for words with L2-like phonology and (2) that learned associations with familiar but not unfamiliar pictures enhanced recognition memory for novel words. Implications for complementary systems accounts of word learning are discussed.
The Design of Hand Gestures for Human-Computer Interaction: Lessons from Sign Language Interpreters
Rempel, David; Camilleri, Matt J.; Lee, David L.
2015-01-01
The design and selection of 3D modeled hand gestures for human-computer interaction should follow principles of natural language combined with the need to optimize gesture contrast and recognition. The selection should also consider the discomfort and fatigue associated with distinct hand postures and motions, especially for common commands. Sign language interpreters have extensive and unique experience forming hand gestures and many suffer from hand pain while gesturing. Professional sign language interpreters (N=24) rated discomfort for hand gestures associated with 47 characters and words and 33 hand postures. Clear associations of discomfort with hand postures were identified. In a nominal logistic regression model, high discomfort was associated with gestures requiring a flexed wrist, discordant adjacent fingers, or extended fingers. These and other findings should be considered in the design of hand gestures to optimize the relationship between human cognitive and physical processes and computer gesture recognition systems for human-computer input. PMID:26028955
Gradient language dominance affects talker learning.
Bregman, Micah R; Creel, Sarah C
2014-01-01
Traditional conceptions of spoken language assume that speech recognition and talker identification are computed separately. Neuropsychological and neuroimaging studies imply some separation between the two faculties, but recent perceptual studies suggest better talker recognition in familiar languages than unfamiliar languages. A familiar-language benefit in talker recognition potentially implies strong ties between the two domains. However, little is known about the nature of this language familiarity effect. The current study investigated the relationship between speech and talker processing by assessing bilingual and monolingual listeners' ability to learn voices as a function of language familiarity and age of acquisition. Two effects emerged. First, bilinguals learned to recognize talkers in their first language (Korean) more rapidly than they learned to recognize talkers in their second language (English), while English-speaking participants showed the opposite pattern (learning English talkers faster than Korean talkers). Second, bilinguals' learning rate for talkers in their second language (English) correlated with age of English acquisition. Taken together, these results suggest that language background materially affects talker encoding, implying a tight relationship between speech and talker representations. Copyright © 2013 Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
Akamatsu, Nobuhiko
1999-01-01
Uses case alteration (cAse ALteRaTiOn) to investigate effects of first-language (L1) orthographic characteristics on word recognition in English as a second language (ESL). Finds magnitude of case alteration effect for naming tasks was significantly larger for ESL participants whose L1 was not alphabetic. Suggests that L1 orthographic features…
Chen, S C; Shao, C L; Liang, C K; Lin, S W; Huang, T H; Hsieh, M C; Yang, C H; Luo, C H; Wuo, C M
2004-01-01
In this paper, we present a text input system for the seriously disabled that uses lip image recognition based on LabVIEW. The system can be divided into a software subsystem and a hardware subsystem. In the software subsystem, we adopted image processing techniques to recognize whether the mouth is open or closed, depending on the relative distance between the upper lip and the lower lip. In the hardware subsystem, the parallel port built into the PC is used to transmit the recognized mouth status to the Morse-code text input system. Integrating the software subsystem with the hardware subsystem, we implement a text input system using lip image recognition programmed in the LabVIEW language. We hope the system can help the seriously disabled to communicate with other people more easily.
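To make the idea of driving a Morse-code text input from mouth status concrete, here is a rough Python sketch (the original system is written in LabVIEW). The keying convention is an assumption made for illustration: a short mouth opening is read as a dot, a long one as a dash, and a long closure ends a letter. The `dash_min` and `letter_gap_min` thresholds are hypothetical, since the abstract gives no timing details.

```python
MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E", "..-.": "F",
    "--.": "G", "....": "H", "..": "I", ".---": "J", "-.-": "K", ".-..": "L",
    "--": "M", "-.": "N", "---": "O", ".--.": "P", "--.-": "Q", ".-.": "R",
    "...": "S", "-": "T", "..-": "U", "...-": "V", ".--": "W", "-..-": "X",
    "-.--": "Y", "--..": "Z",
}

def decode(events, dash_min=0.6, letter_gap_min=1.0):
    """Decode (state, duration_seconds) mouth events into text.

    Assumed convention: "open" events key a dot or dash by duration;
    a "closed" event of at least letter_gap_min ends the current letter.
    """
    text, symbol = [], ""
    for state, duration in events:
        if state == "open":
            symbol += "-" if duration >= dash_min else "."
        elif duration >= letter_gap_min and symbol:
            text.append(MORSE.get(symbol, "?"))
            symbol = ""
    if symbol:  # flush the last letter if the stream ends without a gap
        text.append(MORSE.get(symbol, "?"))
    return "".join(text)

events = [
    ("open", 0.2), ("closed", 0.2), ("open", 0.2), ("closed", 0.2),
    ("open", 0.2), ("closed", 0.2), ("open", 0.2), ("closed", 1.5),  # H
    ("open", 0.2), ("closed", 0.2), ("open", 0.2),                   # I
]
message = decode(events)
```

Unknown dot-dash patterns are mapped to `?` rather than raising an error, on the assumption that a text input aid should keep running when a keying mistake occurs.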
Macedonia, Manuela; Mueller, Karsten
2016-01-01
Vocabulary learning in a second language is enhanced if learners enrich the learning experience with self-performed iconic gestures. This learning strategy is called enactment. Here we explore how enacted words are functionally represented in the brain and which brain regions contribute to enhance retention. After an enactment training lasting 4 days, participants performed a word recognition task in the functional Magnetic Resonance Imaging (fMRI) scanner. Data analysis suggests the participation of different and partially intertwined networks that are engaged in higher cognitive processes, i.e., enhanced attention and word recognition. Also, an experience-related network seems to map word representation. Besides core language regions, this latter network includes sensory and motor cortices, the basal ganglia, and the cerebellum. On the basis of its complexity and the involvement of the motor system, this sensorimotor network might explain superior retention for enactment. PMID:27445918
Modelling Errors in Automatic Speech Recognition for Dysarthric Speakers
NASA Astrophysics Data System (ADS)
Caballero Morales, Santiago Omar; Cox, Stephen J.
2009-12-01
Dysarthria is a motor speech disorder characterized by weakness, paralysis, or poor coordination of the muscles responsible for speech. Although automatic speech recognition (ASR) systems have been developed for disordered speech, factors such as low intelligibility and limited phonemic repertoire decrease speech recognition accuracy, making conventional speaker adaptation algorithms perform poorly on dysarthric speakers. In this work, rather than adapting the acoustic models, we model the errors made by the speaker and attempt to correct them. For this task, two techniques have been developed: (1) a set of "metamodels" that incorporate a model of the speaker's phonetic confusion matrix into the ASR process; (2) a cascade of weighted finite-state transducers at the confusion matrix, word, and language levels. Both techniques attempt to correct the errors made at the phonetic level and make use of a language model to find the best estimate of the correct word sequence. Our experiments show that both techniques outperform standard adaptation techniques.
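The general idea of correcting phonetic errors with a speaker-specific confusion matrix plus a language model can be illustrated with a deliberately simplified sketch. It is not the authors' metamodels or their transducer cascade: it assumes isolated words, a unigram word prior in place of a full language model, and substitution errors only (no insertions or deletions). All phone labels, probabilities, and lexicon entries are invented for the example.

```python
import math

def best_word(observed, lexicon, confusion, floor=1e-6):
    """Pick the word w maximizing P(w) * prod_i P(observed_i | intended_i).

    confusion[intended][heard] is the probability that the speaker's
    intended phone is recognized as `heard`; unseen pairs get `floor`.
    """
    best, best_score = None, -math.inf
    for word, (phones, prior) in lexicon.items():
        if len(phones) != len(observed):
            continue  # simplification: substitutions only
        score = math.log(prior)
        for intended, heard in zip(phones, observed):
            score += math.log(confusion.get(intended, {}).get(heard, floor))
        if score > best_score:
            best, best_score = word, score
    return best

# Hypothetical speaker confusion matrix and three-word lexicon.
confusion = {
    "b": {"b": 0.6, "p": 0.4},
    "p": {"p": 0.9, "b": 0.1},
    "ae": {"ae": 1.0},
    "t": {"t": 0.7, "d": 0.3},
    "d": {"d": 0.9, "t": 0.1},
}
lexicon = {
    "bat": (["b", "ae", "t"], 0.5),
    "pat": (["p", "ae", "t"], 0.3),
    "bad": (["b", "ae", "d"], 0.2),
}
```

With these numbers, the recognized phone string `p ae d` matches no lexicon entry exactly, and the scores work out to 0.06 for "bat", 0.081 for "pat", and 0.072 for "bad", so the sketch returns "pat".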
Han, Feifei
2017-01-01
While some first language (L1) reading models suggest that inefficient word recognition and small working memory tend to inhibit higher-level comprehension processes, the Compensatory Encoding Model maintains that slow word recognition and small working memory do not normally hinder reading comprehension, as readers are able to use metacognitive strategies to compensate for inefficient word recognition and working memory limitations as long as they process a reading task without time constraints. Although empirical evidence has accumulated in support of the Compensatory Encoding Model in L1 reading, there is a lack of research testing the model in foreign language (FL) reading. This research empirically tested the Compensatory Encoding Model in English reading among Chinese college English language learners (ELLs). Two studies were conducted. Study one tested whether the reading condition (with or without a time constraint) affects the relationship between word recognition, working memory, and reading comprehension. Students were tested on a computerized English word recognition test, a computerized Operation Span task, and reading comprehension in time-constrained and non-time-constrained reading. The correlation and regression analyses showed that the association between word recognition, working memory, and reading comprehension was much stronger in the time-constrained than in the non-time-constrained reading condition. Study two examined whether FL readers were able to use metacognitive reading strategies to compensate for inefficient word recognition and working memory limitations in non-time-constrained reading. The participants were tested on the same computerized English word recognition test and Operation Span test. They were required to think aloud while reading and to complete the comprehension questions. The think-aloud protocols were coded for concurrent use of reading strategies, classified into language-oriented strategies, content-oriented strategies, re-reading, pausing, and meta-comment. The correlation analyses showed that word recognition and working memory were significantly related only to the frequency of language-oriented strategies, re-reading, and pausing, and not to reading comprehension. Viewed jointly, the results of the two studies, complementing each other, supported the applicability of the Compensatory Encoding Model in FL reading with Chinese college ELLs. PMID:28522984
ERIC Educational Resources Information Center
Cordier, Deborah
2009-01-01
A renewed focus on foreign language (FL) learning and speech for communication has resulted in computer-assisted language learning (CALL) software developed with Automatic Speech Recognition (ASR). ASR features for FL pronunciation (Lafford, 2004) are functional components of CALL designs used for FL teaching and learning. The ASR features…
Promotion in Times of Endangerment: The Sign Language Act in Finland
ERIC Educational Resources Information Center
De Meulder, Maartje
2017-01-01
The development of sign language recognition legislation is a relatively recent phenomenon in the field of language policy. So far only few authors have documented signing communities' aspirations for recognition legislation, how they work with their governments to achieve legislation which most reflects these goals, and whether and why outcomes…
A human mirror neuron system for language: Perspectives from signed languages of the deaf.
Knapp, Heather Patterson; Corina, David P
2010-01-01
Language is proposed to have developed atop the human analog of the macaque mirror neuron system for action perception and production [Arbib M.A. 2005. From monkey-like action recognition to human language: An evolutionary framework for neurolinguistics (with commentaries and author's response). Behavioral and Brain Sciences, 28, 105-167; Arbib M.A. (2008). From grasp to language: Embodied concepts and the challenge of abstraction. Journal de Physiologie Paris 102, 4-20]. Signed languages of the deaf are fully-expressive, natural human languages that are perceived visually and produced manually. We suggest that if a unitary mirror neuron system mediates the observation and production of both language and non-linguistic action, three predictions can be made: (1) damage to the human mirror neuron system should non-selectively disrupt both sign language and non-linguistic action processing; (2) within the domain of sign language, a given mirror neuron locus should mediate both perception and production; and (3) the action-based tuning curves of individual mirror neurons should support the highly circumscribed set of motions that form the "vocabulary of action" for signed languages. In this review we evaluate data from the sign language and mirror neuron literatures and find that these predictions are only partially upheld. 2009 Elsevier Inc. All rights reserved.
Advances in natural language processing.
Hirschberg, Julia; Manning, Christopher D
2015-07-17
Natural language processing employs computational techniques for the purpose of learning, understanding, and producing human language content. Early computational approaches to language research focused on automating the analysis of the linguistic structure of language and developing basic technologies such as machine translation, speech recognition, and speech synthesis. Today's researchers refine and make use of such tools in real-world applications, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services. We describe successes and challenges in this rapidly advancing area. Copyright © 2015, American Association for the Advancement of Science.
ERIC Educational Resources Information Center
de Zeeuw, Marlies; Verhoeven, Ludo; Schreuder, Robert
2012-01-01
This study examined to what extent young second language (L2) learners showed morphological family size effects in L2 word recognition and whether the effects were grade-level related. Turkish-Dutch bilingual children (L2) and Dutch (first language, L1) children from second, fourth, and sixth grade performed a Dutch lexical decision task on words…
2011-01-01
Training databases for LRE2007 and LRE2009 systems: CF CallFriend; CH CallHome; F Fisher English Part 1 and 2; F Fisher Levantine Arabic; F HKUST Mandarin...
Strategies for distant speech recognition in reverberant environments
NASA Astrophysics Data System (ADS)
Delcroix, Marc; Yoshioka, Takuya; Ogawa, Atsunori; Kubo, Yotaro; Fujimoto, Masakiyo; Ito, Nobutaka; Kinoshita, Keisuke; Espi, Miquel; Araki, Shoko; Hori, Takaaki; Nakatani, Tomohiro
2015-12-01
Reverberation and noise are known to severely affect the automatic speech recognition (ASR) performance of speech recorded by distant microphones. Therefore, we must deal with reverberation if we are to realize high-performance hands-free speech recognition. In this paper, we review a recognition system that we developed at our laboratory to deal with reverberant speech. The system consists of a speech enhancement (SE) front-end that employs long-term linear prediction-based dereverberation followed by noise reduction. We combine our SE front-end with an ASR back-end that uses neural networks for acoustic and language modeling. The proposed system achieved top scores on the ASR task of the REVERB challenge. This paper describes the different technologies used in our system and presents detailed experimental results that justify our implementation choices and may provide hints for designing distant ASR systems.
Describing, using 'recognition cones'. [parallel-series model with English-like computer program]
NASA Technical Reports Server (NTRS)
Uhr, L.
1973-01-01
A parallel-serial 'recognition cone' model is examined, taking into account the model's ability to describe scenes of objects. An actual program is presented in an English-like language. The concept of a 'description' is discussed together with possible types of descriptive information. Questions regarding the level and the variety of detail are considered along with approaches for improving the serial representations of parallel systems.
A Component-Based Vocabulary-Extensible Sign Language Gesture Recognition Framework.
Wei, Shengjing; Chen, Xiang; Yang, Xidong; Cao, Shuai; Zhang, Xu
2016-04-19
Sign language recognition (SLR) can provide a helpful tool for communication between the deaf and the external world. This paper proposes a component-based, vocabulary-extensible SLR framework using data from surface electromyographic (sEMG) sensors, accelerometers (ACC), and gyroscopes (GYRO). In this framework, a sign word is considered to be a combination of five common sign components (hand shape, axis, orientation, rotation, and trajectory), and sign classification is based on the recognition of these five components. Specifically, the proposed SLR framework consists of two major parts. The first part obtains the component-based form of sign gestures and establishes the code table of the target sign gesture set using data from a reference subject. In the second part, which is designed for new users, component classifiers are trained using a training set suggested by the reference subject, and unknown gestures are classified with a code matching method. Five subjects participated in this study, and recognition experiments with training sets of different sizes were carried out on a target gesture set consisting of 110 frequently used Chinese Sign Language (CSL) sign words. The experimental results demonstrated that the proposed framework can realize large-scale gesture set recognition with a small-scale training set. With the smallest training sets (containing about one-third of the gestures in the target gesture set) suggested by two reference subjects, average recognition accuracies of (82.6 ± 13.2)% and (79.7 ± 13.4)% were obtained for the 110 words, respectively, and the average recognition accuracy climbed to (88 ± 13.7)% and (86.3 ± 13.7)% when the training set included 50~60 gestures (about half of the target gesture set). The proposed framework can significantly reduce the user's training burden in large-scale gesture recognition, which will facilitate the implementation of a practical SLR system.
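The code matching step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the matching rule, so the version below assumes a simple count of mismatched components (a Hamming distance) against the code table. The table contents, the integer labels, and the function name `match_code` are hypothetical.

```python
# Hypothetical code table: each sign word maps to five component labels
# (hand shape, axis, orientation, rotation, trajectory). All values are invented.
CODE_TABLE = {
    "hello": (1, 0, 2, 0, 3),
    "thanks": (1, 1, 2, 0, 1),
    "friend": (4, 0, 0, 1, 3),
}


def match_code(predicted, code_table=CODE_TABLE):
    """Return the word whose code differs from `predicted` in the fewest components.

    Assumption: nearest code by mismatch count; the paper's actual rule may differ.
    """
    def mismatches(code):
        return sum(p != c for p, c in zip(predicted, code))

    return min(code_table, key=lambda word: mismatches(code_table[word]))


# One component classifier (orientation) is wrong, yet the word is still recovered.
print(match_code((4, 0, 2, 1, 3)))
```

Under this assumption, a new word only needs a new row in the code table; the five component classifiers themselves are not retrained, which is the sense in which the vocabulary is extensible.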
Towards a Universal Model of Reading
Frost, Ram
2013-01-01
In the last decade, reading research has seen a paradigmatic shift. A new wave of computational models of orthographic processing that offer various forms of noisy position or context-sensitive coding have revolutionized the field of visual word recognition. The influx of such models stems mainly from consistent findings, coming mostly from European languages, regarding an apparent insensitivity of skilled readers to letter order. Underlying the current revolution is the theoretical assumption that the insensitivity of readers to letter order reflects the special way in which the human brain encodes the position of letters in printed words. The present paper discusses the theoretical shortcomings and misconceptions of this approach to visual word recognition. A systematic review of data obtained from a variety of languages demonstrates that letter-order insensitivity is not a general property of the cognitive system, nor is it a property of the brain in encoding letters. Rather, it is a variant and idiosyncratic characteristic of some languages, mostly European, reflecting a strategy of optimizing encoding resources, given the specific structure of words. Since the main goal of reading research is to develop theories that describe the fundamental and invariant phenomena of reading across orthographies, an alternative approach to model visual word recognition is offered. The dimensions of a possible universal model of reading, which outlines the common cognitive operations involved in orthographic processing in all writing systems, are discussed. PMID:22929057
Towards a universal model of reading.
Frost, Ram
2012-10-01
In the last decade, reading research has seen a paradigmatic shift. A new wave of computational models of orthographic processing that offer various forms of noisy position or context-sensitive coding have revolutionized the field of visual word recognition. The influx of such models stems mainly from consistent findings, coming mostly from European languages, regarding an apparent insensitivity of skilled readers to letter order. Underlying the current revolution is the theoretical assumption that the insensitivity of readers to letter order reflects the special way in which the human brain encodes the position of letters in printed words. The present article discusses the theoretical shortcomings and misconceptions of this approach to visual word recognition. A systematic review of data obtained from a variety of languages demonstrates that letter-order insensitivity is neither a general property of the cognitive system nor a property of the brain in encoding letters. Rather, it is a variant and idiosyncratic characteristic of some languages, mostly European, reflecting a strategy of optimizing encoding resources, given the specific structure of words. Since the main goal of reading research is to develop theories that describe the fundamental and invariant phenomena of reading across orthographies, an alternative approach to model visual word recognition is offered. The dimensions of a possible universal model of reading, which outlines the common cognitive operations involved in orthographic processing in all writing systems, are discussed.
ERIC Educational Resources Information Center
Sidgi, Lina Fathi Sidig; Shaari, Ahmad Jelani
2017-01-01
Technology such as computer-assisted language learning (CALL) is used in teaching and learning in the foreign language classrooms where it is most needed. One promising emerging technology that supports language learning is automatic speech recognition (ASR). Integrating such technology, especially in the instruction of pronunciation…
ERIC Educational Resources Information Center
Shafiro, Valeriy; Kharkhurin, Anatoliy V.
2009-01-01
Does native language phonology influence visual word processing in a second language? This question was investigated in two experiments with two groups of Russian-English bilinguals, differing in their English experience, and a monolingual English control group. Experiment 1 tested visual word recognition following semantic…
Novel dynamic Bayesian networks for facial action element recognition and understanding
NASA Astrophysics Data System (ADS)
Zhao, Wei; Park, Jeong-Seon; Choi, Dong-You; Lee, Sang-Woong
2011-12-01
In daily life, language is an important tool of communication between people. Besides language, facial action can also provide a great amount of information. Therefore, facial action recognition has become a popular research topic in the field of human-computer interaction (HCI). However, facial action recognition is quite a challenging task due to its complexity. In a literal sense, there are thousands of facial muscular movements, many of which have very subtle differences. Moreover, muscular movements always occur simultaneously when the pose is changed. To address this problem, we first build a fully automatic facial points detection system based on a local Gabor filter bank and principal component analysis. Then, novel dynamic Bayesian networks are proposed to perform facial action recognition using the junction tree algorithm over a limited number of feature points. In order to evaluate the proposed method, we have used the Korean face database for model training. For testing, we used the CUbiC FacePix, facial expressions and emotion database, Japanese female facial expression database, and our own database. Our experimental results clearly demonstrate the feasibility of the proposed approach.
Gender affects body language reading.
Sokolov, Arseny A; Krüger, Samuel; Enck, Paul; Krägeloh-Mann, Ingeborg; Pavlova, Marina A
2011-01-01
Body motion is a rich source of information for social cognition. However, gender effects in body language reading are largely unknown. Here we investigated whether, and, if so, how recognition of emotional expressions revealed by body motion is gender dependent. To this end, females and males were presented with point-light displays portraying knocking at a door performed with different emotional expressions. The findings show that gender affects accuracy rather than speed of body language reading. This effect, however, is modulated by emotional content of actions: males surpass females in recognition accuracy for happy actions, whereas females tend to excel in recognition of hostile angry knocking. The advantage of women in recognition accuracy for neutral actions suggests that females are better tuned to the lack of emotional content in body actions. The study provides novel insights into understanding of gender effects in body language reading, and helps to shed light on gender vulnerability to neuropsychiatric and neurodevelopmental impairments in visual social cognition.
1997-09-01
first PC-based, very large vocabulary dictation system with a continuous natural language free flow approach to speech recognition. (This system allows...indicating the likelihood that a particular stored HMM reference model is the best match for the input. This approach is called the Baum-Welch...InfoCentral, and Envoy 1.0; and Lotus Development Corp.’s SmartSuite 3, Approach 3.0, and Organizer. 2. IBM At a press conference in New York in June 1997, IBM
Watch what you say, your computer might be listening: A review of automated speech recognition
NASA Technical Reports Server (NTRS)
Degennaro, Stephen V.
1991-01-01
Spoken language is the most convenient and natural means by which people interact with each other and is, therefore, a promising candidate for human-machine interactions. Speech also offers an additional channel for hands-busy applications, complementing the use of motor output channels for control. Current speech recognition systems vary considerably across a number of important characteristics, including vocabulary size, speaking mode, training requirements for new speakers, robustness to acoustic environments, and accuracy. Algorithmically, these systems range from rule-based techniques through more probabilistic or self-learning approaches such as hidden Markov modeling and neural networks. This tutorial begins with a brief summary of the relevant features of current speech recognition systems and the strengths and weaknesses of the various algorithmic approaches.
Cross-cultural effect on the brain revisited: universal structures plus writing system variation.
Bolger, Donald J; Perfetti, Charles A; Schneider, Walter
2005-05-01
Recognizing printed words requires the mapping of graphic forms, which vary with writing systems, to linguistic forms, which vary with languages. Using a newly developed meta-analytic approach, aggregated Gaussian-estimated sources (AGES; Chein et al. [2002]: Psychol Behav 77:635-639), we examined the neuroimaging results for word reading within and across writing systems and languages. To find commonalities, we compiled 25 studies in English and other Western European languages that use an alphabetic writing system, 9 studies of native Chinese reading, 5 studies of Japanese Kana (syllabic) reading, and 4 studies of Kanji (morpho-syllabic) reading. Using the AGES approach, we created meta-images within each writing system, isolated reliable foci of activation, and compared findings across writing systems and languages. The results suggest that these writing systems utilize a common network of regions in word processing. Writing systems engage largely the same systems in terms of gross cortical regions, but localization within those regions suggests differences across writing systems. In particular, the region known as the visual word form area (VWFA) shows strikingly consistent localization across tasks and across writing systems. This region in the left mid-fusiform gyrus is critical to word recognition across writing systems and languages.
Recognition of emotion from body language among patients with unipolar depression
Loi, Felice; Vaidya, Jatin G.; Paradiso, Sergio
2013-01-01
Major depression may be associated with abnormal perception of emotions and impairment in social adaptation. Emotion recognition from body language and its possible implications for social adjustment have not been examined in patients with depression. Three groups of participants (51 with depression; 68 with history of depression in remission; and 69 never-depressed healthy volunteers) were compared on static and dynamic tasks of emotion recognition from body language. Psychosocial adjustment was assessed using the Social Adjustment Scale Self-Report (SAS-SR). Participants with current depression showed reduced recognition accuracy for happy stimuli across tasks relative to remission and comparison participants. Participants with depression tended to show poorer psychosocial adaptation relative to remission and comparison groups. Correlations between perception accuracy of happiness and scores on the SAS-SR were largely not significant. These results indicate that depression is associated with reduced ability to appraise positive stimuli of emotional body language but emotion recognition performance is not tied to social adjustment. These alterations do not appear to be present in participants in remission, suggesting state-like qualities. PMID:23608159
On the Development of Speech Resources for the Mixtec Language
2013-01-01
The Mixtec language is one of the main native languages in Mexico. In general, due to urbanization, discrimination, and limited attempts to promote the culture, the native languages are disappearing. Most of the information available about the Mixtec language is in written form, as in dictionaries which, although they include examples of how to pronounce Mixtec words, are not as reliable as listening to the correct pronunciation from a native speaker. Formal acoustic resources, such as speech corpora, are almost non-existent for Mixtec, and no speech technologies are known to have been developed for it. This paper presents the development of the following resources for the Mixtec language: (1) a speech database of traditional narratives of the Mixtec culture spoken by a native speaker (labelled at the phonetic and orthographic levels by means of spectral analysis) and (2) a native speaker-adaptive automatic speech recognition (ASR) system (trained with the speech database) integrated with a Mixtec-to-Spanish/Spanish-to-Mixtec text translator. The speech database, although small and limited to a single variant, was reliable enough to build the multiuser speech application, which presented a mean recognition/translation performance of up to 94.36% in experiments with non-native speakers (the target users). PMID:23710134
Parallel language activation and cognitive control during spoken word recognition in bilinguals
Blumenfeld, Henrike K.; Marian, Viorica
2013-01-01
Accounts of bilingual cognitive advantages suggest an associative link between cross-linguistic competition and inhibitory control. We investigate this link by examining English-Spanish bilinguals’ parallel language activation during auditory word recognition and nonlinguistic Stroop performance. Thirty-one English-Spanish bilinguals and 30 English monolinguals participated in an eye-tracking study. Participants heard words in English (e.g., comb) and identified corresponding pictures from a display that included pictures of a Spanish competitor (e.g., conejo, English rabbit). Bilinguals with higher Spanish proficiency showed more parallel language activation and smaller Stroop effects than bilinguals with lower Spanish proficiency. Across all bilinguals, stronger parallel language activation between 300–500ms after word onset was associated with smaller Stroop effects; between 633–767ms, reduced parallel language activation was associated with smaller Stroop effects. Results suggest that bilinguals who perform well on the Stroop task show increased cross-linguistic competitor activation during early stages of word recognition and decreased competitor activation during later stages of word recognition. Findings support the hypothesis that cross-linguistic competition impacts domain-general inhibition. PMID:24244842
Integrating language models into classifiers for BCI communication: a review
NASA Astrophysics Data System (ADS)
Speier, W.; Arnold, C.; Pouratian, N.
2016-06-01
Objective. The present review systematically examines the integration of language models to improve classifier performance in brain-computer interface (BCI) communication systems. Approach. The domain of natural language has been studied extensively in linguistics and has been used in the natural language processing field in applications including information extraction, machine translation, and speech recognition. While these methods have been used for years in traditional augmentative and assistive communication devices, information about the output domain has largely been ignored in BCI communication systems. Over the last few years, BCI communication systems have started to leverage this information through the inclusion of language models. Main results. Although this movement began only recently, studies have already shown the potential of language integration in BCI communication and it has become a growing field in BCI research. BCI communication systems using language models in their classifiers have progressed down several parallel paths, including: word completion; signal classification; integration of process models; dynamic stopping; unsupervised learning; error correction; and evaluation. Significance. Each of these methods has shown significant progress, but they have largely been addressed separately. Combining these methods could use the full potential of language models, yielding further performance improvements. This integration should be a priority as the field works to create a BCI system that meets the needs of the amyotrophic lateral sclerosis population.
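One common way to bring a language model into a classifier, sketched here under stated assumptions, is to treat the language model's next-character probabilities as a prior and the signal classifier's outputs as likelihoods, then combine them with Bayes' rule. The review does not prescribe this particular scheme; the character set, the probability values, and the function name `posterior` are hypothetical.

```python
def posterior(likelihoods, lm_prior):
    """Combine classifier likelihoods P(signal | char) with a language-model
    prior P(char | history) and normalize to get P(char | signal, history).

    Assumption: plain Bayes' rule over a small, fixed character set.
    """
    unnormalized = {c: likelihoods[c] * lm_prior.get(c, 0.0) for c in likelihoods}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("language model assigns zero probability to every candidate")
    return {c: value / total for c, value in unnormalized.items()}


# Invented numbers: the signal classifier alone slightly prefers "q",
# but the language model considers "e" far more likely in this context.
likelihoods = {"e": 0.4, "q": 0.5, "z": 0.1}
lm_prior = {"e": 0.7, "q": 0.1, "z": 0.2}

post = posterior(likelihoods, lm_prior)
print(max(post, key=post.get))
```

In this toy case the prior overturns the classifier's raw preference, which is the basic mechanism behind the signal classification and error correction paths listed in the abstract.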
Integrating language models into classifiers for BCI communication: a review.
Speier, W; Arnold, C; Pouratian, N
2016-06-01
The present review systematically examines the integration of language models to improve classifier performance in brain-computer interface (BCI) communication systems. The domain of natural language has been studied extensively in linguistics and has been used in the natural language processing field in applications including information extraction, machine translation, and speech recognition. While these methods have been used for years in traditional augmentative and assistive communication devices, information about the output domain has largely been ignored in BCI communication systems. Over the last few years, BCI communication systems have started to leverage this information through the inclusion of language models. Although this movement began only recently, studies have already shown the potential of language integration in BCI communication and it has become a growing field in BCI research. BCI communication systems using language models in their classifiers have progressed down several parallel paths, including: word completion; signal classification; integration of process models; dynamic stopping; unsupervised learning; error correction; and evaluation. Each of these methods has shown significant progress, but they have largely been addressed separately. Combining these methods could use the full potential of language models, yielding further performance improvements. This integration should be a priority as the field works to create a BCI system that meets the needs of the amyotrophic lateral sclerosis population.
Multi-modal gesture recognition using integrated model of motion, audio and video
NASA Astrophysics Data System (ADS)
Goutsu, Yusuke; Kobayashi, Takaki; Obara, Junya; Kusajima, Ikuo; Takeichi, Kazunari; Takano, Wataru; Nakamura, Yoshihiko
2015-07-01
Gesture recognition is used in many practical applications such as human-robot interaction, medical rehabilitation and sign language. With increasing motion sensor development, multiple data sources have become available, which has led to the rise of multi-modal gesture recognition. Since our previous approach to gesture recognition depended on a unimodal system, it had difficulty classifying similar motion patterns. To solve this problem, we propose a novel approach that integrates motion, audio and video models, using a dataset captured by Kinect. The proposed system recognizes observed gestures using the three models; their recognition results are integrated by the proposed framework, and the integrated output becomes the final result. The motion and audio models are learned using Hidden Markov Models, and the video model is learned using a Random Forest classifier. In the experiments testing the performance of the proposed system, the motion and audio models most suitable for gesture recognition are chosen by varying feature vectors and learning methods. Additionally, the unimodal and multi-modal models are compared with respect to recognition accuracy. All experiments are conducted on the dataset provided by the organizers of the Multi-Modal Gesture Recognition Challenge (MMGRC) workshop. The comparison shows that the multi-modal model composed of the three models achieves the highest recognition rate. This improvement in recognition accuracy indicates that the three models are complementary. The proposed system provides application technology for understanding human actions in daily life more precisely.
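The integration of the three models' recognition results can be pictured as late fusion of per-class scores. The abstract does not specify the fusion rule, so the sketch below assumes a weighted sum; the gesture labels, scores, weights, and the function name `fuse_scores` are all hypothetical.

```python
def fuse_scores(model_scores, weights):
    """Late fusion: weighted sum of per-class scores from several models.

    Assumption: each model emits a score per gesture label on a comparable scale.
    Returns the best label and the fused score table.
    """
    fused = {}
    for model_name, scores in model_scores.items():
        for label, score in scores.items():
            fused[label] = fused.get(label, 0.0) + weights[model_name] * score
    return max(fused, key=fused.get), fused


# Invented scores: the audio model alone would pick "point",
# but the motion and video models together outvote it.
model_scores = {
    "motion": {"wave": 0.6, "point": 0.4},
    "audio": {"wave": 0.3, "point": 0.7},
    "video": {"wave": 0.7, "point": 0.3},
}
weights = {"motion": 1.0, "audio": 1.0, "video": 1.0}

label, fused = fuse_scores(model_scores, weights)
print(label)
```

The example shows the complementary effect the abstract reports: one modality's mistake is corrected when the other two agree.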
Translation lexicon acquisition from bilingual dictionaries
NASA Astrophysics Data System (ADS)
Doermann, David S.; Ma, Huanfeng; Karagol-Ayan, Burcu; Oard, Douglas W.
2001-12-01
Bilingual dictionaries hold great potential as a source of lexical resources for training automated systems for optical character recognition, machine translation and cross-language information retrieval. In this work we describe a system for extracting term lexicons from printed copies of bilingual dictionaries. We describe our approach to page and definition segmentation and entry parsing. We have used the approach to parse a number of dictionaries and demonstrate the results for retrieval using a French-English Dictionary to generate a translation lexicon and a corpus of English queries applied to French documents to evaluate cross-language IR.
Shi, Lu-Feng; Koenig, Laura L
2016-09-01
Nonnative listeners have difficulty recognizing English words due to underdeveloped acoustic-phonetic and/or lexical skills. The present study used Boothroyd and Nittrouer's (1988) j factor to tease apart these two components of word recognition. Participants included 15 native English and 29 native Russian listeners. Fourteen and 15 of the Russian listeners reported English (ED) and Russian (RD) to be their dominant language, respectively. Listeners were presented 119 consonant-vowel-consonant real and nonsense words in speech-spectrum noise at +6 dB SNR. Responses were scored for word and phoneme recognition, the logarithmic quotient of which yielded j. Word and phoneme recognition was comparable between native and ED listeners but poorer in RD listeners. Analysis of j indicated less effective use of lexical information in RD than in native and ED listeners. Lexical processing was strongly correlated with the length of residence in the United States. Language background is important for nonnative word recognition. Lexical skills can be regarded as nativelike in ED nonnative listeners. Compromised word recognition in ED listeners is unlikely to be a result of poor lexical processing. Performance should be interpreted with caution for listeners dominant in their first language, whose word recognition is affected by both lexical and acoustic-phonetic factors.
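The "logarithmic quotient" mentioned in this abstract follows from the j-factor relation p_word = p_phoneme ** j, which gives j = log(p_word) / log(p_phoneme). A minimal sketch of that computation is shown below; the function name `j_factor` and the example proportions are illustrative and are not data from the study.

```python
import math


def j_factor(p_word, p_phoneme):
    """Return j such that p_word == p_phoneme ** j.

    Both inputs are proportions correct, strictly between 0 and 1.
    """
    if not (0.0 < p_word < 1.0 and 0.0 < p_phoneme < 1.0):
        raise ValueError("proportions must lie strictly between 0 and 1")
    return math.log(p_word) / math.log(p_phoneme)


# Invented proportions for a three-phoneme (CVC) item set.
# If phonemes were recognized independently, p_word would equal p_phoneme ** 3.
print(round(j_factor(0.125, 0.5), 3))
```

For consonant-vowel-consonant items, a j close to 3 suggests the three phonemes are being recognized independently, while a smaller j suggests that lexical knowledge lets the listener recover whole words from fewer correctly perceived phonemes.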
Speech Perception in Noise by Children With Cochlear Implants
Caldwell, Amanda; Nittrouer, Susan
2013-01-01
Purpose Common wisdom suggests that listening in noise poses disproportionately greater difficulty for listeners with cochlear implants (CIs) than for peers with normal hearing (NH). The purpose of this study was to examine phonological, language, and cognitive skills that might help explain speech-in-noise abilities for children with CIs. Method Three groups of kindergartners (NH, hearing aid wearers, and CI users) were tested on speech recognition in quiet and noise and on tasks thought to underlie the abilities that fit into the domains of phonological awareness, general language, and cognitive skills. These last measures were used as predictor variables in regression analyses with speech-in-noise scores as dependent variables. Results Compared to children with NH, children with CIs did not perform as well on speech recognition in noise or on most other measures, including recognition in quiet. Two surprising results were that (a) noise effects were consistent across groups and (b) scores on other measures did not explain any group differences in speech recognition. Conclusions Limitations of implant processing take their primary toll on recognition in quiet and account for poor speech recognition and language/phonological deficits in children with CIs. Implications are that teachers/clinicians need to teach language/phonology directly and maximize signal-to-noise levels in the classroom. PMID:22744138
Arabic sign language recognition based on HOG descriptor
NASA Astrophysics Data System (ADS)
Ben Jmaa, Ahmed; Mahdi, Walid; Ben Jemaa, Yousra; Ben Hamadou, Abdelmajid
2017-02-01
We present in this paper a new approach for Arabic sign language (ArSL) alphabet recognition using hand gesture analysis. This analysis consists of extracting histogram of oriented gradients (HOG) features from a hand image and then using them to train SVM models, which are used to recognize the ArSL alphabet in real time from hand gestures captured by a Microsoft Kinect camera. Our approach involves three steps: (i) hand detection and localization using a Microsoft Kinect camera, (ii) hand segmentation and (iii) feature extraction for Arabic alphabet recognition. On each input image, first obtained using a depth sensor, we apply our method based on hand anatomy to segment the hand and eliminate all erroneous pixels. This approach is invariant to scale, rotation and translation of the hand. Experimental results show the effectiveness of our new approach: the proposed ArSL system recognizes the ArSL alphabet with an accuracy of 90.12%.
Performance of Language-Coordinated Collective Systems: A Study of Wine Recognition and Description
Zubek, Julian; Denkiewicz, Michał; Dębska, Agnieszka; Radkowska, Alicja; Komorowska-Mach, Joanna; Litwin, Piotr; Stępień, Magdalena; Kucińska, Adrianna; Sitarska, Ewa; Komorowska, Krystyna; Fusaroli, Riccardo; Tylén, Kristian; Rączaszek-Leonardi, Joanna
2016-01-01
Most of our perceptions of and engagements with the world are shaped by our immersion in social interactions, cultural traditions, tools and linguistic categories. In this study we experimentally investigate the impact of two types of language-based coordination on the recognition and description of complex sensory stimuli: that of red wine. Participants were asked to taste, remember and successively recognize samples of wines within a larger set in a two-by-two experimental design: (1) either individually or in pairs, and (2) with or without the support of a sommelier card, a cultural linguistic tool designed for wine description. Both effectiveness of recognition and the kinds of errors in the four conditions were analyzed. While our experimental manipulations did not impact recognition accuracy, bias-variance decomposition of error revealed non-trivial differences in how participants solved the task. Pairs generally displayed reduced bias and increased variance compared to individuals; however, the variance dropped significantly when they used the sommelier card. The variance-reducing effect of the sommelier card was observed only in pairs; individuals did not seem to benefit from the cultural linguistic tool. Analysis of descriptions generated with the aid of sommelier cards shows that pairs were more coherent and discriminative than individuals. The findings are discussed in terms of global properties and dynamics of collective systems when constrained by different types of cultural practices. PMID:27729875
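The bias-variance decomposition mentioned in this abstract can be illustrated with the textbook identity for squared error: mean squared error equals squared bias plus variance. The abstract does not say which formulation the authors applied to their recognition data, so the sketch below assumes numeric responses and squared error; the function name `bias_variance` and the sample values are hypothetical.

```python
from statistics import mean, pvariance


def bias_variance(estimates, truth):
    """Split the mean squared error of repeated estimates into bias^2 and variance.

    Assumption: squared-error loss on numeric responses (textbook formulation).
    """
    center = mean(estimates)
    bias_squared = (center - truth) ** 2
    variance = pvariance(estimates)  # population variance around the mean response
    mse = mean([(e - truth) ** 2 for e in estimates])
    return bias_squared, variance, mse


# Invented responses scattered around a true value of 3.0.
bias_squared, variance, mse = bias_variance([2.0, 4.0, 6.0], truth=3.0)
print(abs(bias_squared + variance - mse) < 1e-9)
```

Read this way, two groups can reach the same overall accuracy while differing in how their error splits between a systematic offset (bias) and response scatter (variance), which is the kind of difference the abstract reports between pairs and individuals.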
Soskey, Laura; Holcomb, Phillip J; Midgley, Katherine J
2016-09-01
How do the neural mechanisms involved in word recognition evolve over the course of word learning in adult learners of a new second language? The current study sought to closely track language effects, which are differences in electrophysiological indices of word processing between one's native and second languages, in beginning university learners over the course of a single semester of learning. Monolingual L1 English-speakers enrolled in introductory Spanish were first trained on a list of 228 Spanish words chosen from the vocabulary to be learned in class. Behavioral data from the training session and the following experimental sessions spaced over the course of the semester showed expected learning effects. In the three laboratory sessions participants read words in three lists (English, Spanish and mixed) while performing a go/no-go lexical decision task in which event-related potentials (ERPs) were recorded. As observed in previous studies, there were ERP language effects with larger N400s to native than second language words. Importantly, this difference declined over the course of L2 learning with N400 amplitude increasing for new second language words. These results suggest that, even over a single semester of learning, new second language words are rapidly incorporated into the word recognition system and begin to take on lexical and semantic properties similar to native language words. Moreover, the results suggest that electrophysiological measures can be used as sensitive measures for tracking the acquisition of new linguistic knowledge. Copyright © 2016 Elsevier B.V. All rights reserved.
Dynamic and Contextual Information in HMM Modeling for Handwritten Word Recognition.
Bianne-Bernard, Anne-Laure; Menasri, Farès; Al-Hajj Mohamad, Rami; Mokbel, Chafic; Kermorvant, Christopher; Likforman-Sulem, Laurence
2011-10-01
This study aims at building an efficient word recognition system resulting from the combination of three handwriting recognizers. The main component of this combined system is an HMM-based recognizer which considers dynamic and contextual information for a better modeling of writing units. For modeling the contextual units, a state-tying process based on decision tree clustering is introduced. Decision trees are built according to a set of expert-based questions on how characters are written. Questions are divided into global questions, yielding larger clusters, and precise questions, yielding smaller ones. Such clustering enables us to reduce the total number of models and Gaussian densities by 10. We then apply this modeling to the recognition of handwritten words. Experiments are conducted on three publicly available databases based on Latin or Arabic languages: Rimes, IAM, and OpenHart. The results obtained show that contextual information embedded with dynamic modeling significantly improves recognition.
The Heinz Electronic Library Interactive On-line System (HELIOS): An Update.
ERIC Educational Resources Information Center
Galloway, Edward A.; Michalek, Gabrielle V.
1998-01-01
Describes a project at Carnegie Mellon University libraries to convert the congressional papers of the late Senator John Heinz to digital format and to create an online system to search and retrieve these papers. Highlights include scanning, optical character recognition, and a search engine utilizing natural language processing. (Author/LRW)
Effectiveness of Feedback for Enhancing English Pronunciation in an ASR-Based CALL System
ERIC Educational Resources Information Center
Wang, Y.-H.; Young, S. S.-C.
2015-01-01
This paper presents a study on implementing the ASR-based CALL (computer-assisted language learning based upon automatic speech recognition) system embedded with both formative and summative feedback approaches and using implicit and explicit strategies to enhance adult and young learners' English pronunciation. Two groups of learners including 18…
Speech-Associated Gestures, Broca's Area, and the Human Mirror System
ERIC Educational Resources Information Center
Skipper, Jeremy I.; Goldin-Meadow, Susan; Nusbaum, Howard C.; Small, Steven L.
2007-01-01
Speech-associated gestures are hand and arm movements that not only convey semantic information to listeners but are themselves actions. Broca's area has been assumed to play an important role both in semantic retrieval or selection (as part of a language comprehension system) and in action recognition (as part of a "mirror" or…
Lexical access in sign language: a computational model.
Caselli, Naomi K; Cohen-Goldberg, Ariel M
2014-01-01
Psycholinguistic theories have predominantly been built upon data from spoken language, which leaves open the question: How many of the conclusions truly reflect language-general principles as opposed to modality-specific ones? We take a step toward answering this question in the domain of lexical access in recognition by asking whether a single cognitive architecture might explain diverse behavioral patterns in signed and spoken language. Chen and Mirman (2012) presented a computational model of word processing that unified opposite effects of neighborhood density in speech production, perception, and written word recognition. Neighborhood density effects in sign language also vary depending on whether the neighbors share the same handshape or location. We present a spreading activation architecture that borrows the principles proposed by Chen and Mirman (2012), and show that if this architecture is elaborated to incorporate relatively minor facts about either (1) the time course of sign perception or (2) the frequency of sub-lexical units in sign languages, it produces data that match the experimental findings from sign languages. This work serves as a proof of concept that a single cognitive architecture could underlie both sign and word recognition.
ERIC Educational Resources Information Center
Ashrapova, Alsu; Alendeeva, Svetlana
2014-01-01
This article is the result of a study of the influence of English and German on the Russian language during the English learning based on lexical borrowings in the field of economics. This paper discusses the use and recognition of borrowings from the English and German languages by Russian native speakers. The use of lexical borrowings from…
Deep bottleneck features for spoken language identification.
Jiang, Bing; Song, Yan; Wei, Si; Liu, Jun-Hua; McLoughlin, Ian Vince; Dai, Li-Rong
2014-01-01
A key problem in spoken language identification (LID) is to design effective representations which are specific to language information. For example, in recent years, representations based on both phonotactic and acoustic features have proven their effectiveness for LID. Although advances in machine learning have led to significant improvements, LID performance is still lacking, especially for short duration speech utterances. With the hypothesis that language information is weak and represented only latently in speech, and is largely dependent on the statistical properties of the speech content, existing representations may be insufficient. Furthermore they may be susceptible to the variations caused by different speakers, specific content of the speech segments, and background noise. To address this, we propose using Deep Bottleneck Features (DBF) for spoken LID, motivated by the success of Deep Neural Networks (DNN) in speech recognition. We show that DBFs can form a low-dimensional compact representation of the original inputs with a powerful descriptive and discriminative capability. To evaluate the effectiveness of this, we design two acoustic models, termed DBF-TV and parallel DBF-TV (PDBF-TV), using a DBF based i-vector representation for each speech utterance. Results on NIST language recognition evaluation 2009 (LRE09) show significant improvements over state-of-the-art systems. By fusing the output of phonotactic and acoustic approaches, we achieve an EER of 1.08%, 1.89% and 7.01% for 30 s, 10 s and 3 s test utterances respectively. Furthermore, various DBF configurations have been extensively evaluated, and an optimal system proposed.
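The equal error rate (EER) quoted in this abstract is the operating point at which the miss rate equals the false-alarm rate. The sketch below shows one simple way to approximate it from detection scores; the function name `equal_error_rate` and the toy scores are hypothetical and do not reproduce the NIST LRE09 scoring tools.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over every observed score."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for threshold in thresholds:
        # Miss rate: target trials scored below the threshold (rejected).
        p_miss = sum(s < threshold for s in target_scores) / len(target_scores)
        # False-alarm rate: non-target trials scored at or above the threshold.
        p_fa = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(p_miss - p_fa)
        if gap < best_gap:
            best_gap, eer = gap, (p_miss + p_fa) / 2
    return eer

# Hypothetical scores: higher means "more likely the target language".
targets = [2.1, 1.7, 0.9, 0.4, 1.2]
nontargets = [-1.0, 0.5, -0.3, 0.1, 1.0]
eer = equal_error_rate(targets, nontargets)  # 0.2, i.e. 20% for this toy data
```

With only a handful of trials the two error curves cross exactly at a threshold of 0.9. On real evaluation data the crossing is usually interpolated between neighbouring thresholds.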
NASA Astrophysics Data System (ADS)
Hachaj, Tomasz; Ogiela, Marek R.
2014-09-01
Gesture Description Language (GDL) is a classifier that enables syntactic description and real-time recognition of full-body gestures and movements. Gestures are described in a dedicated computer language named Gesture Description Language script (GDLs). In this paper we introduce new GDLs formalisms that enable recognition of selected classes of movement trajectories. The second novelty is a new unsupervised learning method with which it is possible to generate GDLs descriptions automatically. We have initially evaluated both proposed extensions of GDL and obtained very promising results. Both the novel methodology and the evaluation results are described in this paper.
Nilakantan, Aneesha S; Voss, Joel L; Weintraub, Sandra; Mesulam, M-Marsel; Rogalski, Emily J
2017-06-01
Primary progressive aphasia (PPA) is clinically defined by an initial loss of language function and preservation of other cognitive abilities, including episodic memory. While PPA primarily affects the left-lateralized perisylvian language network, some clinical neuropsychological tests suggest concurrent initial memory loss. The goal of this study was to test recognition memory of objects and words in the visual and auditory modality to separate language-processing impairments from retentive memory in PPA. Individuals with non-semantic PPA had longer reaction times and higher false alarms for auditory word stimuli compared to visual object stimuli. Moreover, false alarms for auditory word recognition memory were related to cortical thickness within the left inferior frontal gyrus and left temporal pole, while false alarms for visual object recognition memory was related to cortical thickness within the right-temporal pole. This pattern of results suggests that specific vulnerability in processing verbal stimuli can hinder episodic memory in PPA, and provides evidence for differential contributions of the left and right temporal poles in word and object recognition memory. Copyright © 2017 Elsevier Ltd. All rights reserved.
Korean letter handwritten recognition using deep convolutional neural network on android platform
NASA Astrophysics Data System (ADS)
Purnamawati, S.; Rachmawati, D.; Lumanauw, G.; Rahmat, R. F.; Taqyuddin, R.
2018-03-01
Currently, the popularity of Korean culture attracts many people to learn about Korea, particularly its language. To acquire the Korean language, every learner needs to be able to read Korean non-Latin characters. A digital approach is needed to make the Korean learning process easier. This study uses a Deep Convolutional Neural Network (DCNN). The DCNN performs recognition on the image based on a pre-trained model such as the Inception-v3 model. Subsequently, a re-training process using the transfer learning technique with the trained and re-trained values of the model is carried out in order to develop a new model with better performance and without any specific systematic errors. The testing accuracy obtained in this research is 86.9%.
The adaptation of GDL motion recognition system to sport and rehabilitation techniques analysis.
Hachaj, Tomasz; Ogiela, Marek R
2016-06-01
The main novelty of this paper is the adaptation of the Gesture Description Language (GDL) methodology to sport and rehabilitation data analysis and classification. We show that the Lua language can be successfully used to adapt the GDL classifier to those tasks. The newly applied scripting language allows easy extension and integration of the classifier with other software technologies and applications. The obtained execution speed allows the methodology to be used in real-time motion capture data processing, where the capturing frequency ranges from 100 Hz to even 500 Hz depending on the number of features or classes to be calculated and recognized. Because of this, the proposed methodology can be used with high-end motion capture systems. We anticipate that this novel, efficient and effective method will greatly help both sport trainers and physiotherapists in their practice. The proposed approach can be directly applied to kinematic analysis of motion capture data (evaluation of motion without regard to the forces that cause that motion). The ability to apply pattern recognition methods to GDL descriptions can be utilized in virtual reality environments and used for sport training or rehabilitation treatment.
Urbain, Jay
2015-12-01
We present the design and analyze the performance of a multi-stage natural language processing system employing named entity recognition, Bayesian statistics, and rule logic to identify and characterize heart disease risk factor events in diabetic patients over time. The system was originally developed for the 2014 i2b2 Challenges in Natural Language in Clinical Data. The system's strengths included a high level of accuracy for identifying named entities associated with heart disease risk factor events. The system's primary weakness was due to inaccuracies when characterizing the attributes of some events. Examples include determining the relative time of an event with respect to the record date, determining whether an event is attributable to the patient's history or the patient's family history, and differentiating between current and prior smoking status. We believe these inaccuracies were due in large part to the lack of an effective approach for integrating context into our event detection model. To address these inaccuracies, we explore the addition of a distributional semantic model for characterizing contextual evidence of heart disease risk factor events. Using this semantic model, we raise our initial 2014 i2b2 Challenges in Natural Language of Clinical data F1 score of 0.838 to 0.890 and increase precision by 10.3% without use of any lexicons that might bias our results. Copyright © 2015 Elsevier Inc. All rights reserved.
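For readers unfamiliar with the F1 score reported in this abstract, the sketch below computes precision, recall and F1 from event counts. The function name `precision_recall_f1` and the counts are hypothetical; they are not taken from the i2b2 evaluation.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) from raw detection counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts: 80 correct events, 20 spurious events, 10 missed events.
precision, recall, f1 = precision_recall_f1(80, 20, 10)
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two component scores, which is why a gain in precision alone can raise F1 noticeably.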
Quadcopter Control Using Speech Recognition
NASA Astrophysics Data System (ADS)
Malik, H.; Darma, S.; Soekirno, S.
2018-04-01
This research reports a comparison of the success rates of speech recognition systems that used two types of databases, an existing database and a new database, and that were implemented on a quadcopter for motion control. The speech recognition system used the Mel frequency cepstral coefficient (MFCC) method for feature extraction, and the extracted features were used to train a recursive neural network (RNN). MFCC is one of the most widely used feature extraction methods for speech recognition, with a reported success rate of 80% - 95%. The existing database was used to measure the success rate of the RNN method. The new database was created in the Indonesian language, and its success rate was then compared with the results from the existing database. Sound input from the microphone was processed on a DSP module with the MFCC method to obtain the characteristic values. The characteristic values were then used to train the RNN, whose output was a command. The command became a control input to the single board computer (SBC), which produced the movement of the quadcopter. On the SBC, we used the Robot Operating System (ROS) as the kernel (operating system).
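The MFCC front end referred to in this abstract relies on a mel-scaled filter bank. The sketch below shows one commonly used Hz-to-mel mapping and how filter centre frequencies can be spaced evenly on the mel scale. The constants 2595 and 700 are a widespread convention, not something stated in the abstract, and the function names and the 0 to 8000 Hz range are assumptions for illustration.

```python
import math

def hz_to_mel(freq_hz):
    """Convert frequency in Hz to mels using a common logarithmic formula."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_center_frequencies(low_hz, high_hz, n_filters):
    """Centre frequencies (Hz) of n_filters spaced evenly on the mel scale."""
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (n_filters + 1)
    return [mel_to_hz(low_mel + step * (i + 1)) for i in range(n_filters)]

# Hypothetical configuration: 10 filters between 0 Hz and 8000 Hz.
centers = mel_center_frequencies(0.0, 8000.0, 10)
```

The centres are dense at low frequencies and sparse at high frequencies, which mirrors the finer resolution of human hearing in the lower part of the spectrum.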
Choudhury, Naseem; Leppanen, Paavo H.T.; Leevers, Hilary J.; Benasich, April A.
2007-01-01
An infant’s ability to process auditory signals presented in rapid succession (i.e. rapid auditory processing abilities [RAP]) has been shown to predict differences in language outcomes in toddlers and preschool children. Early deficits in RAP abilities may serve as a behavioral marker for language-based learning disabilities. The purpose of this study is to determine if performance on infant information processing measures designed to tap RAP and global processing skills differ as a function of family history of specific language impairment (SLI) and/or the particular demand characteristics of the paradigm used. Seventeen 6- to 9-month-old infants from families with a history of specific language impairment (FH+) and 29 control infants (FH−) participated in this study. Infants’ performance on two different RAP paradigms (head-turn procedure [HT] and auditory-visual habituation/recognition memory [AVH/RM]) and on a global processing task (visual habituation/recognition memory [VH/RM]) was assessed at 6 and 9 months. Toddler language and cognitive skills were evaluated at 12 and 16 months. A number of significant group differences were seen: FH+ infants showed significantly poorer discrimination of fast rate stimuli on both RAP tasks, took longer to habituate on both habituation/recognition memory measures, and had lower novelty preference scores on the visual habituation/recognition memory task. Infants’ performance on the two RAP measures provided independent but converging contributions to outcome. Thus, different mechanisms appear to underlie performance on operantly conditioned tasks as compared to habituation/recognition memory paradigms. Further, infant RAP processing abilities predicted to 12- and 16-month language scores above and beyond family history of SLI. The results of this study provide additional support for the validity of infant RAP abilities as a behavioral marker for later language outcome. 
Finally, this is the first study to use a battery of infant tasks to demonstrate multi-modal processing deficits in infants at risk for SLI. PMID:17286846
Holistic neural coding of Chinese character forms in bilateral ventral visual system.
Mo, Ce; Yu, Mengxia; Seger, Carol; Mo, Lei
2015-02-01
How are Chinese characters recognized and represented in the brain of skilled readers? Functional MRI fast adaptation technique was used to address this question. We found that neural adaptation effects were limited to identical characters in bilateral ventral visual system while no activation reduction was observed for partially overlapping characters regardless of the spatial location of the shared sub-character components, suggesting highly selective neuronal tuning to whole characters. The consistent neural profile across the entire ventral visual cortex indicates that Chinese characters are represented as mutually distinctive wholes rather than combinations of sub-character components, which presents a salient contrast to the left-lateralized, simple-to-complex neural representations of alphabetic words. Our findings thus revealed the cultural modulation effect on both local neuronal activity patterns and functional anatomical regions associated with written symbol recognition. Moreover, the cross-language discrepancy in written symbol recognition mechanism might stem from the language-specific early-stage learning experience. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
Parallel language activation and inhibitory control in bimodal bilinguals.
Giezen, Marcel R; Blumenfeld, Henrike K; Shook, Anthony; Marian, Viorica; Emmorey, Karen
2015-08-01
Findings from recent studies suggest that spoken-language bilinguals engage nonlinguistic inhibitory control mechanisms to resolve cross-linguistic competition during auditory word recognition. Bilingual advantages in inhibitory control might stem from the need to resolve perceptual competition between similar-sounding words both within and between their two languages. If so, these advantages should be lessened or eliminated when there is no perceptual competition between two languages. The present study investigated the extent of inhibitory control recruitment during bilingual language comprehension by examining associations between language co-activation and nonlinguistic inhibitory control abilities in bimodal bilinguals, whose two languages do not perceptually compete. Cross-linguistic distractor activation was identified in the visual world paradigm, and correlated significantly with performance on a nonlinguistic spatial Stroop task within a group of 27 hearing ASL-English bilinguals. Smaller Stroop effects (indexing more efficient inhibition) were associated with reduced co-activation of ASL signs during the early stages of auditory word recognition. These results suggest that inhibitory control in auditory word recognition is not limited to resolving perceptual linguistic competition in phonological input, but is also used to moderate competition that originates at the lexico-semantic level. Copyright © 2015 Elsevier B.V. All rights reserved.
Lexical access in sign language: a computational model
Caselli, Naomi K.; Cohen-Goldberg, Ariel M.
2014-01-01
Psycholinguistic theories have predominantly been built upon data from spoken language, which leaves open the question: How many of the conclusions truly reflect language-general principles as opposed to modality-specific ones? We take a step toward answering this question in the domain of lexical access in recognition by asking whether a single cognitive architecture might explain diverse behavioral patterns in signed and spoken language. Chen and Mirman (2012) presented a computational model of word processing that unified opposite effects of neighborhood density in speech production, perception, and written word recognition. Neighborhood density effects in sign language also vary depending on whether the neighbors share the same handshape or location. We present a spreading activation architecture that borrows the principles proposed by Chen and Mirman (2012), and show that if this architecture is elaborated to incorporate relatively minor facts about either (1) the time course of sign perception or (2) the frequency of sub-lexical units in sign languages, it produces data that match the experimental findings from sign languages. This work serves as a proof of concept that a single cognitive architecture could underlie both sign and word recognition. PMID:24860539
Towards a Transcription System of Sign Language for 3D Virtual Agents
NASA Astrophysics Data System (ADS)
Do Amaral, Wanessa Machado; de Martino, José Mario
Accessibility is a growing concern in computer science. Since virtual information is mostly presented visually, it may seem that access for deaf people is not an issue. However, for prelingually deaf individuals, those who became deaf before acquiring and formally learning a language, written information is often less accessible than information presented in signing. Further, for this community, signing is the language of choice, and reading text in a spoken language is akin to using a foreign language. Sign language uses gestures and facial expressions and is widely used by deaf communities. To enable efficient production of signed content in virtual environments, it is necessary to make written records of signs. Transcription systems have been developed to describe sign languages in written form, but these systems have limitations. Since they were not originally designed with computer animation in mind, the recognition and reproduction of signs in these systems is, in general, an easy task only for those who know the system deeply. The aim of this work is to develop a transcription system to provide signed content in virtual environments. To animate a virtual avatar so that it articulates close to reality, a transcription system requires sufficiently explicit information, such as movement speed, sign concatenation, the sequence of each hold-and-movement and facial expressions. Although many important studies in sign languages have been published, the transcription problem remains a challenge. Thus, a notation to describe, store and play signed content in virtual environments offers a multidisciplinary study and research tool, which may help linguistic studies to understand the structure and grammar of sign languages.
ERIC Educational Resources Information Center
Duyck, Wouter; Van Assche, Eva; Drieghe, Denis; Hartsuiker, Robert J.
2007-01-01
Recent research on bilingualism has shown that lexical access in visual word recognition by bilinguals is not selective with respect to language. In the present study, the authors investigated language-independent lexical access in bilinguals reading sentences, which constitutes a strong unilingual linguistic context. In the first experiment,…
Automatization and Orthographic Development in Second Language Visual Word Recognition
ERIC Educational Resources Information Center
Kida, Shusaku
2016-01-01
The present study investigated second language (L2) learners' acquisition of automatic word recognition and the development of L2 orthographic representation in the mental lexicon. Participants in the study were Japanese university students enrolled in a compulsory course involving a weekly 30-minute sustained silent reading (SSR) activity with…
Face Recognition Is Shaped by the Use of Sign Language
ERIC Educational Resources Information Center
Stoll, Chloé; Palluel-Germain, Richard; Caldara, Roberto; Lao, Junpeng; Dye, Matthew W. G.; Aptel, Florent; Pascalis, Olivier
2018-01-01
Previous research has suggested that early deaf signers differ in face processing. Which aspects of face processing are changed and the role that sign language may have played in that change are however unclear. Here, we compared face categorization (human/non-human) and human face recognition performance in early profoundly deaf signers, hearing…
L2 Gender Facilitation and Inhibition in Spoken Word Recognition
ERIC Educational Resources Information Center
Behney, Jennifer N.
2011-01-01
This dissertation investigates the role of grammatical gender facilitation and inhibition in second language (L2) learners' spoken word recognition. Native speakers of languages that have grammatical gender are sensitive to gender marking when hearing and recognizing a word. Gender facilitation refers to when a given noun that is preceded by an…
Breaking the language barrier: machine assisted diagnosis using the medical speech translator.
Starlander, Marianne; Bouillon, Pierrette; Rayner, Manny; Chatzichrisafis, Nikos; Hockey, Beth Ann; Isahara, Hitoshi; Kanzaki, Kyoko; Nakao, Yukie; Santaholma, Marianne
2005-01-01
In this paper, we describe and evaluate an Open Source medical speech translation system (MedSLT) intended for safety-critical applications. The aim of this system is to eliminate language barriers in emergency situations. It translates spoken questions from English into French, Japanese and Finnish in three medical subdomains (headache, chest pain and abdominal pain), using a vocabulary of about 250-400 words per sub-domain. The architecture is a compromise between fixed-phrase translation on one hand and complex linguistically-based systems on the other. Recognition is guided by a Context Free Grammar Language Model compiled from a general unification grammar, automatically specialised for the domain. We present an evaluation of this initial prototype that shows the advantages of this grammar-based approach for this particular translation task in terms of both reliability and use.
Scene Text Recognition using Similarity and a Lexicon with Sparse Belief Propagation
Weinman, Jerod J.; Learned-Miller, Erik; Hanson, Allen R.
2010-01-01
Scene text recognition (STR) is the recognition of text anywhere in the environment, such as signs and store fronts. Relative to document recognition, it is challenging because of font variability, minimal language context, and uncontrolled conditions. Much information available to solve this problem is frequently ignored or used sequentially. Similarity between character images is often overlooked as useful information. Because of language priors, a recognizer may assign different labels to identical characters. Directly comparing characters to each other, rather than only a model, helps ensure that similar instances receive the same label. Lexicons improve recognition accuracy but are used post hoc. We introduce a probabilistic model for STR that integrates similarity, language properties, and lexical decision. Inference is accelerated with sparse belief propagation, a bottom-up method for shortening messages by reducing the dependency between weakly supported hypotheses. By fusing information sources in one model, we eliminate unrecoverable errors that result from sequential processing, improving accuracy. In experimental results recognizing text from images of signs in outdoor scenes, incorporating similarity reduces character recognition error by 19%, the lexicon reduces word recognition error by 35%, and sparse belief propagation reduces the lexicon words considered by 99.9% with a 12X speedup and no loss in accuracy. PMID:19696446
Optical character recognition of handwritten Arabic using hidden Markov models
NASA Astrophysics Data System (ADS)
Aulama, Mohannad M.; Natsheh, Asem M.; Abandah, Gheith A.; Olama, Mohammed M.
2011-04-01
The problem of optical character recognition (OCR) of handwritten Arabic has not received a satisfactory solution yet. In this paper, an Arabic OCR algorithm is developed based on Hidden Markov Models (HMMs) combined with the Viterbi algorithm, which results in an improved and more robust recognition of characters at the sub-word level. Integrating the HMMs represents another step of the overall OCR trends being currently researched in the literature. The proposed approach exploits the structure of characters in the Arabic language in addition to their extracted features to achieve improved recognition rates. Useful statistical information of the Arabic language is initially extracted and then used to estimate the probabilistic parameters of the mathematical HMM. A new custom implementation of the HMM is developed in this study, where the transition matrix is built based on the collected large corpus, and the emission matrix is built based on the results obtained via the extracted character features. The recognition process is triggered using the Viterbi algorithm which employs the most probable sequence of sub-words. The model was implemented to recognize the sub-word unit of Arabic text raising the recognition rate from being linked to the worst recognition rate for any character to the overall structure of the Arabic language. Numerical results show that there is a potentially large recognition improvement by using the proposed algorithms.
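The decoding step described in this abstract, finding the most probable state sequence of an HMM with the Viterbi algorithm, can be sketched as follows. This is a generic textbook version, not the authors' custom implementation: the state labels `A` and `B`, the observation symbols `x`, `y`, `z` and all probabilities are hypothetical stand-ins for sub-word units and extracted character features.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most probable state path, its probability) for a discrete HMM."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            # Pick the predecessor r that maximizes the path probability into s.
            column[s] = max(
                (prev[r][0] * trans_p[r][s] * emit_p[s][obs], r) for r in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]

# Hypothetical two-state model with three observation symbols.
states = ("A", "B")
start_p = {"A": 0.6, "B": 0.4}
trans_p = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit_p = {"A": {"x": 0.5, "y": 0.4, "z": 0.1}, "B": {"x": 0.1, "y": 0.3, "z": 0.6}}
path, prob = viterbi(["x", "y", "z"], states, start_p, trans_p, emit_p)
```

In the setting of the abstract, the transition table would be estimated from a text corpus and the emission table from character-feature classification results. For long sequences, log probabilities are normally used to avoid numerical underflow.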
Optical character recognition of handwritten Arabic using hidden Markov models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aulama, Mohannad M.; Natsheh, Asem M.; Abandah, Gheith A.
2011-01-01
The problem of optical character recognition (OCR) of handwritten Arabic has not received a satisfactory solution yet. In this paper, an Arabic OCR algorithm is developed based on Hidden Markov Models (HMMs) combined with the Viterbi algorithm, which results in an improved and more robust recognition of characters at the sub-word level. Integrating the HMMs represents another step of the overall OCR trends being currently researched in the literature. The proposed approach exploits the structure of characters in the Arabic language in addition to their extracted features to achieve improved recognition rates. Useful statistical information of the Arabic language is initially extracted and then used to estimate the probabilistic parameters of the mathematical HMM. A new custom implementation of the HMM is developed in this study, where the transition matrix is built based on the collected large corpus, and the emission matrix is built based on the results obtained via the extracted character features. The recognition process is triggered using the Viterbi algorithm which employs the most probable sequence of sub-words. The model was implemented to recognize the sub-word unit of Arabic text raising the recognition rate from being linked to the worst recognition rate for any character to the overall structure of the Arabic language. Numerical results show that there is a potentially large recognition improvement by using the proposed algorithms.
Patient empowerment by increasing the understanding of medical language for lay users.
Topac, V; Stoicu-Tivadar, V
2013-01-01
Patient empowerment is important in order to increase the quality of medical care and the quality of life of patients. An important obstacle to empowering patients is the language barrier that lay patients encounter when accessing medical information. The objective was to design and develop a service that helps increase the understanding of medical language for lay persons. The service identifies and explains medical terminology in a given text by annotating the terms in the original text with their definitions. It is based on an original terminology interpretation engine that uses a fuzzy matching dictionary. The service was implemented in two projects: a) in the server of a tele-care system (TELEASIS), with the purpose of adapting medical text assigned by medical personnel for the assisted patients; b) in a dedicated web site that can adapt the medical language from raw text or from existing web pages. The output of the service was evaluated by a group of persons, and the results indicate that such a system can increase the understanding of medical texts. Several design decisions were derived from the evaluation and are being considered for future development. Other tests measuring accuracy and time performance for the fuzzy terminology recognition have been performed. Test results revealed good performance for accuracy and excellent results regarding time performance. The current version of the service increases the accessibility of medical language by explaining terminology with good accuracy, while allowing the user to easily identify errors, in order to reduce the risk of incorrect terminology recognition.
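The idea of annotating terms through a fuzzy matching dictionary can be sketched as follows. This is only an illustration, not the engine described in the abstract: the two-entry `GLOSSARY`, the `annotate` helper, and the 0.85 similarity cutoff are all assumptions, and the matching uses Python's standard `difflib` rather than the authors' method.

```python
import difflib

# Hypothetical two-entry glossary; a real service would use a full medical dictionary.
GLOSSARY = {
    "hypertension": "high blood pressure",
    "tachycardia": "abnormally fast heart rate",
}


def annotate(text, glossary=GLOSSARY, cutoff=0.85):
    """Append a bracketed definition after every word that fuzzily matches a glossary term."""
    annotated = []
    for word in text.split():
        key = word.strip(".,;:()").lower()
        match = difflib.get_close_matches(key, list(glossary), n=1, cutoff=cutoff)
        if match:
            annotated.append(f"{word} [{match[0]}: {glossary[match[0]]}]")
        else:
            annotated.append(word)
    return " ".join(annotated)


# The misspelling "hypertention" is still matched to "hypertension".
print(annotate("Patient has hypertention."))
```

Showing the matched dictionary term next to the definition lets a reader spot a wrong match, which is in the spirit of the abstract's remark about allowing users to identify errors.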
Mexican sign language recognition using normalized moments and artificial neural networks
NASA Astrophysics Data System (ADS)
Solís-V., J.-Francisco; Toxqui-Quitl, Carina; Martínez-Martínez, David; H.-G., Margarita
2014-09-01
This work presents a framework designed for Mexican Sign Language (MSL) recognition. A data set of 24 static MSL signs, each in 5 different versions, was captured using a digital camera under incoherent light conditions. Digital image processing was used to segment the hand gestures; a uniform background was selected to avoid the need for gloved hands or special markers. Feature extraction was performed by calculating normalized geometric moments of the gray-scale sign images. An artificial neural network then performs the recognition, evaluated with 10-fold cross-validation in Weka; the best result was a recognition rate of 95.83%.
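The abstract does not give the moment formula, so the sketch below assumes the textbook definition of normalized central moments, eta_pq = mu_pq / m00^(1 + (p+q)/2), computed on a gray-scale image stored as a list of rows. The function name and the tiny test images are illustrative only.

```python
def normalized_central_moment(image, p, q):
    """Normalized central moment eta_pq of a 2-D list of non-negative gray levels."""
    m00 = sum(sum(row) for row in image)
    if m00 == 0:
        raise ValueError("image has no mass")
    # Intensity-weighted centroid.
    x_bar = sum(x * v for row in image for x, v in enumerate(row)) / m00
    y_bar = sum(y * v for y, row in enumerate(image) for v in row) / m00
    # Central moment about the centroid.
    mu_pq = sum(
        (x - x_bar) ** p * (y - y_bar) ** q * v
        for y, row in enumerate(image)
        for x, v in enumerate(row)
    )
    return mu_pq / m00 ** (1 + (p + q) / 2)


# The same small pattern at two positions inside a larger frame.
IMAGE_A = [
    [1, 2, 0, 0],
    [3, 4, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
IMAGE_B = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 3, 4],
]

# Central moments are measured about the centroid, so a shift leaves them unchanged.
print(normalized_central_moment(IMAGE_A, 2, 0), normalized_central_moment(IMAGE_B, 2, 0))
```

Translation invariance is what makes such moments usable as features when the hand is not at a fixed position in the frame; a feature vector would typically collect several (p, q) orders.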
Baethge, Christopher
2013-03-26
Whereas the most influential journals in psychiatry are English language journals, periodicals published in other languages serve an important purpose for local communities of clinicians and researchers. This study aimed at analyzing the scientific production and the recognition of non-English general psychiatry journals. In a cohort study, the 2009 volume of ten journals from Brazil (1), German language countries (5), France (2), Italy (1), and Poland (1) was searched for original articles. Patterns of citations to these articles during 2010 and 2011 as documented in Web of Science were analyzed. The journals published 199 original articles (range: 4-46), mostly observational studies. Half of the papers were cited in the following two years. There were 246 citations received, or an average of 1.25 cites per article (range: 0.25-4.04). Many of these citations came from the local community, that is, from the same authors and journals. Citations by other periodicals and other authors accounted for 36% [95%-CI: 30%-42%], citations in English sources for 33% [28%-39%] of all quotations. There was considerable heterogeneity with regard to citations received among the ten journals investigated. Non-English language general psychiatry journals contribute substantially to the body of research. However, recognition, and in particular recognition by the international research community is moderate.
Shi, Lu-Feng; Morozova, Natalia
2012-08-01
Word recognition is a basic component in a comprehensive hearing evaluation, but data are lacking for listeners speaking two languages. This study obtained such data for Russian natives in the US and analysed the data using the perceptual assimilation model (PAM) and speech learning model (SLM). Listeners were randomly presented 200 NU-6 words in quiet. Listeners responded verbally and in writing. Performance was scored on words and phonemes (word-initial consonants, vowels, and word-final consonants). Seven normal-hearing, adult monolingual English natives (NM), 16 English-dominant (ED), and 15 Russian-dominant (RD) Russian natives participated. ED and RD listeners differed significantly in their language background. Consistent with the SLM, NM outperformed ED listeners and ED outperformed RD listeners, whether responses were scored on words or phonemes. NM and ED listeners shared similar phoneme error patterns, whereas RD listeners' errors had unique patterns that could be largely understood via the PAM. RD listeners had particular difficulty differentiating vowel contrasts /i-I/, /æ-ε/, and /ɑ-Λ/, word-initial consonant contrasts /p-h/ and /b-f/, and word-final contrasts /f-v/. Both first-language phonology and second-language learning history affect word and phoneme recognition. Current findings may help clinicians differentiate word recognition errors due to language background from hearing pathologies.
Language comprehension warps the mirror neuron system
Zarr, Noah; Ferguson, Ryan; Glenberg, Arthur M.
2013-01-01
Is the mirror neuron system (MNS) used in language understanding? According to embodied accounts of language comprehension, understanding sentences describing actions makes use of neural mechanisms of action control, including the MNS. Consequently, repeatedly comprehending sentences describing similar actions should induce adaptation of the MNS thereby warping its use in other cognitive processes such as action recognition and prediction. To test this prediction, participants read blocks of multiple sentences where each sentence in the block described transfer of objects in a direction away or toward the reader. Following each block, adaptation was measured by having participants predict the end-point of videotaped actions. The adapting sentences disrupted prediction of actions in the same direction, but (a) only for videos of biological motion, and (b) only when the effector implied by the language (e.g., the hand) matched the videos. These findings are signatures of the MNS. PMID:24381553
Modern Languages and Antiracism.
ERIC Educational Resources Information Center
O'Shaughnessy, Martin
1988-01-01
Discusses a school language department's antiracist/multicultural policy for modern languages. The policy stresses the need for a multicultural curriculum, exploration of racism, acceptance of all languages, recognition of specialized knowledge, and positive images of people from ethnic minority groups. (CB)
Human and animal sounds influence recognition of body language.
Van den Stock, Jan; Grèzes, Julie; de Gelder, Beatrice
2008-11-25
In naturalistic settings emotional events have multiple correlates and are simultaneously perceived by several sensory systems. Recent studies have shown that recognition of facial expressions is biased towards the emotion expressed by a simultaneously presented emotional expression in the voice even if attention is directed to the face only. So far, no study examined whether this phenomenon also applies to whole body expressions, although there is no obvious reason why this crossmodal influence would be specific for faces. Here we investigated whether perception of emotions expressed in whole body movements is influenced by affective information provided by human and by animal vocalizations. Participants were instructed to attend to the action displayed by the body and to categorize the expressed emotion. The results indicate that recognition of body language is biased towards the emotion expressed by the simultaneously presented auditory information, whether it consists of human or of animal sounds. Our results show that a crossmodal influence from auditory to visual emotional information obtains for whole body video images with the facial expression blanked and includes human as well as animal sounds.
Concept recognition for extracting protein interaction relations from biomedical text
Baumgartner, William A; Lu, Zhiyong; Johnson, Helen L; Caporaso, J Gregory; Paquette, Jesse; Lindemann, Anna; White, Elizabeth K; Medvedeva, Olga; Cohen, K Bretonnel; Hunter, Lawrence
2008-01-01
Background: Reliable information extraction applications have been a long sought goal of the biomedical text mining community, a goal that if reached would provide valuable tools to benchside biologists in their increasingly difficult task of assimilating the knowledge contained in the biomedical literature. We present an integrated approach to concept recognition in biomedical text. Concept recognition provides key information that has been largely missing from previous biomedical information extraction efforts, namely direct links to well defined knowledge resources that explicitly cement the concept's semantics. The BioCreative II tasks discussed in this special issue have provided a unique opportunity to demonstrate the effectiveness of concept recognition in the field of biomedical language processing. Results: Through the modular construction of a protein interaction relation extraction system, we present several use cases of concept recognition in biomedical text, and relate these use cases to potential uses by the benchside biologist. Conclusion: Current information extraction technologies are approaching performance standards at which concept recognition can begin to deliver high quality data to the benchside biologist. Our system is available as part of the BioCreative Meta-Server project and on the internet. PMID:18834500
The Affordance of Speech Recognition Technology for EFL Learning in an Elementary School Setting
ERIC Educational Resources Information Center
Liaw, Meei-Ling
2014-01-01
This study examined the use of speech recognition (SR) technology to support a group of elementary school children's learning of English as a foreign language (EFL). SR technology has been used in various language learning contexts. Its application to EFL teaching and learning is still relatively recent, but a solid understanding of its…
ERIC Educational Resources Information Center
Nassaji, Hossein
2014-01-01
This article examines current research on the role and importance of lower-level processes in second language (L2) reading. The focus is on word recognition and its subcomponent processes, including various phonological and orthographic processes. Issues related to syntactic and semantic processes and their relationship with word recognition are…
Semantic Ambiguity Effects in L2 Word Recognition
ERIC Educational Resources Information Center
Ishida, Tomomi
2018-01-01
The present study examined the ambiguity effects in second language (L2) word recognition. Previous studies on first language (L1) lexical processing have observed that ambiguous words are recognized faster and more accurately than unambiguous words on lexical decision tasks. In this research, L1 and L2 speakers of English were asked whether a…
Using Automatic Speech Recognition Technology with Elicited Oral Response Testing
ERIC Educational Resources Information Center
Cox, Troy L.; Davies, Randall S.
2012-01-01
This study examined the use of automatic speech recognition (ASR) scored elicited oral response (EOR) tests to assess the speaking ability of English language learners. It also examined the relationship between ASR-scored EOR and other language proficiency measures and the ability of the ASR to rate speakers without bias to gender or native…
Morphological Processing during Visual Word Recognition in Hebrew as a First and a Second Language
ERIC Educational Resources Information Center
Norman, Tal; Degani, Tamar; Peleg, Orna
2017-01-01
The present study examined whether sublexical morphological processing takes place during visual word-recognition in Hebrew, and whether morphological decomposition of written words depends on lexical activation of the complete word. Furthermore, it examined whether morphological processing is similar when reading Hebrew as a first language (L1)…
Segment-based acoustic models for continuous speech recognition
NASA Astrophysics Data System (ADS)
Ostendorf, Mari; Rohlicek, J. R.
1993-07-01
This research aims to develop new and more accurate stochastic models for speaker-independent continuous speech recognition, by extending previous work in segment-based modeling and by introducing a new hierarchical approach to representing intra-utterance statistical dependencies. These techniques, which are more costly than traditional approaches because of the large search space associated with higher order models, are made feasible through rescoring a set of HMM-generated N-best sentence hypotheses. We expect these different modeling techniques to result in improved recognition performance over that achieved by current systems, which handle only frame-based observations and assume that these observations are independent given an underlying state sequence. In the fourth quarter of the project, we have completed the following: (1) ported our recognition system to the Wall Street Journal task, a standard task in the ARPA community; (2) developed an initial dependency-tree model of intra-utterance observation correlation; and (3) implemented baseline language model estimation software. Our initial results on the Wall Street Journal task are quite good and represent significantly improved performance over most HMM systems reporting on the Nov. 1992 5k vocabulary test set.
Perceptual uncertainty is a property of the cognitive system.
Perea, Manuel; Carreiras, Manuel
2012-10-01
We qualify Frost's proposals regarding letter-position coding in visual word recognition and the universal model of reading. First, we show that perceptual uncertainty regarding letter position is not tied to European languages; instead, it is a general property of the cognitive system. Second, we argue that a universal model of reading should incorporate a developmental view of the reading process.
Speech-associated gestures, Broca’s area, and the human mirror system
Skipper, Jeremy I.; Goldin-Meadow, Susan; Nusbaum, Howard C.; Small, Steven L
2009-01-01
Speech-associated gestures are hand and arm movements that not only convey semantic information to listeners but are themselves actions. Broca’s area has been assumed to play an important role both in semantic retrieval or selection (as part of a language comprehension system) and in action recognition (as part of a “mirror” or “observation–execution matching” system). We asked whether the role that Broca’s area plays in processing speech-associated gestures is consistent with the semantic retrieval/selection account (predicting relatively weak interactions between Broca’s area and other cortical areas because the meaningful information that speech-associated gestures convey reduces semantic ambiguity and thus reduces the need for semantic retrieval/selection) or the action recognition account (predicting strong interactions between Broca’s area and other cortical areas because speech-associated gestures are goal-directed actions that are “mirrored”). We compared the functional connectivity of Broca’s area with other cortical areas when participants listened to stories while watching meaningful speech-associated gestures, speech-irrelevant self-grooming hand movements, or no hand movements. A network analysis of neuroimaging data showed that interactions involving Broca’s area and other cortical areas were weakest when spoken language was accompanied by meaningful speech-associated gestures, and strongest when spoken language was accompanied by self-grooming hand movements or by no hand movements at all. Results are discussed with respect to the role that the human mirror system plays in processing speech-associated movements. PMID:17533001
Modeling the Perceptual Learning of Novel Dialect Features
ERIC Educational Resources Information Center
Tatman, Rachael
2017-01-01
All language use reflects the user's social identity in systematic ways. While humans can easily adapt to this sociolinguistic variation, automatic speech recognition (ASR) systems continue to struggle with it. This dissertation makes three main contributions. The first is to provide evidence that modern state-of-the-art commercial ASR systems…
TELLTALE: Experiments in a Dynamic Hypertext Environment for Degraded and Multilingual Data.
ERIC Educational Resources Information Center
Pearce, Claudia; Nicholas, Charles
1996-01-01
Presents experimentation results for the TELLTALE system, a dynamic hypertext environment that provides full-text search from a hypertext-style user interface for text corpora that may be garbled by OCR (optical character recognition) or transmission errors, and that may contain languages other than English. (Author/LRW)
Named Entity Recognition in a Hungarian NL Based QA System
NASA Astrophysics Data System (ADS)
Tikkl, Domonkos; Szidarovszky, P. Ferenc; Kardkovacs, Zsolt T.; Magyar, Gábor
In the WoW project, our purpose is to create a complex search interface with the following features: search in the deep web content of contracted partners' databases, processing of Hungarian natural language (NL) questions and their transformation into SQL queries for database access, and image search supported by a visual thesaurus that describes the visual content of images in a structured form (also in Hungarian). This paper primarily focuses on one particular problem of the question processing task: entity recognition. Before going into details, we give a short overview of the project's aims.
From grasp to language: embodied concepts and the challenge of abstraction.
Arbib, Michael A
2008-01-01
The discovery of mirror neurons in the macaque monkey and the discovery of a homologous "mirror system for grasping" in Broca's area in the human brain has revived the gestural origins theory of the evolution of the human capability for language, enriching it with the suggestion that mirror neurons provide the neurological core for this evolution. However, this notion of "mirror neuron support for the transition from grasp to language" has been worked out in very different ways in the Mirror System Hypothesis model [Arbib, M.A., 2005a. From monkey-like action recognition to human language: an evolutionary framework for neurolinguistics (with commentaries and author's response). Behavioral and Brain Sciences 28, 105-167; Rizzolatti, G., Arbib, M.A., 1998. Language within our grasp. Trends in Neuroscience 21(5), 188-194] and the Embodied Concept model [Gallese, V., Lakoff, G., 2005. The brain's concepts: the role of the sensory-motor system in reason and language. Cognitive Neuropsychology 22, 455-479]. The present paper provides a critique of the latter to enrich analysis of the former, developing the role of schema theory [Arbib, M.A., 1981. Perceptual structures and distributed motor control. In: Brooks, V.B. (Ed.), Handbook of Physiology--The Nervous System II. Motor Control. American Physiological Society, pp. 1449-1480].
Denmark, Tanya; Atkinson, Joanna; Campbell, Ruth; Swettenham, John
2014-10-01
Facial expressions in sign language carry a variety of communicative features. While emotion can modulate a spoken utterance through changes in intonation, duration and intensity, in sign language specific facial expressions presented concurrently with a manual sign perform this function. When deaf adult signers cannot see facial features, their ability to judge emotion in a signed utterance is impaired (Reilly et al. in Sign Lang Stud 75:113-118, 1992). We examined the role of the face in the comprehension of emotion in sign language in a group of typically developing (TD) deaf children and in a group of deaf children with autism spectrum disorder (ASD). We replicated Reilly et al.'s (Sign Lang Stud 75:113-118, 1992) adult results in the TD deaf signing children, confirming the importance of the face in understanding emotion in sign language. The ASD group performed more poorly on the emotion recognition task than the TD children. The deaf children with ASD showed a deficit in emotion recognition during sign language processing analogous to the deficit in vocal emotion recognition that has been observed in hearing children with ASD.
Francis, Wendy S; Gutiérrez, Marisela
2012-04-01
The effects of bilingual proficiency on recognition memory were examined in an experiment with Spanish-English bilinguals. Participants learned lists of words in English and Spanish under shallow- and deep-encoding conditions. Overall, hit rates were higher, discrimination greater, and response times shorter in the nondominant language, consistent with effects previously observed for lower frequency words. Levels-of-processing effects in hit rates, discrimination, and response time were stronger in the dominant language. Specifically, with shallow encoding, the advantage for the nondominant language was larger than with deep encoding. The results support the idea that memory performance in the nondominant language is impacted by both the greater demand for cognitive resources and the lower familiarity of the words.
Thai Language Sentence Similarity Computation Based on Syntactic Structure and Semantic Vector
NASA Astrophysics Data System (ADS)
Wang, Hongbin; Feng, Yinhan; Cheng, Liang
2018-03-01
Sentence similarity computation plays an increasingly important role in text mining, Web page retrieval, machine translation, speech recognition and question answering systems. Thai is a resource-scarce language; unlike Chinese, it has no resources comparable to HowNet and CiLin. Research on Thai sentence similarity therefore faces some challenges. To address the problem of Thai sentence similarity computation, this paper proposes a novel method based on syntactic structure and semantic vectors. The method first uses Part-of-Speech (POS) dependency to calculate the syntactic structure similarity of two sentences, and then uses word vectors to calculate their semantic similarity. Finally, the two measures are combined to calculate the similarity of two Thai sentences. The proposed method thus considers not only semantics but also the syntactic structure of the sentence. The experimental results show that this method is feasible for Thai sentence similarity computation.
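The abstract does not state how the syntactic and semantic scores are combined, so the sketch below makes two explicit assumptions: the semantic score is the cosine similarity of averaged word vectors, and the final score is a weighted sum controlled by a hypothetical `alpha` parameter. The syntactic score is taken as a precomputed input in [0, 1].

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_vector(vectors):
    """Component-wise average of a non-empty list of word vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def combined_similarity(syntactic_sim, vectors_a, vectors_b, alpha=0.5):
    """Weighted sum of a given syntactic score and a word-vector semantic score.

    alpha is an assumed mixing weight, not a value taken from the paper.
    """
    semantic_sim = cosine(mean_vector(vectors_a), mean_vector(vectors_b))
    return alpha * syntactic_sim + (1 - alpha) * semantic_sim


# Toy 2-dimensional "word vectors" for two sentences.
print(combined_similarity(0.8, [[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]], alpha=0.5))
```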
Iris unwrapping using the Bresenham circle algorithm for real-time iris recognition
NASA Astrophysics Data System (ADS)
Carothers, Matthew T.; Ngo, Hau T.; Rakvic, Ryan N.; Broussard, Randy P.
2015-02-01
An efficient parallel architecture design for the iris unwrapping process in a real-time iris recognition system using the Bresenham Circle Algorithm is presented in this paper. Based on the characteristics of the model parameters, this algorithm was chosen over the widely used polar conversion technique as the iris unwrapping model. The architecture design is parallelized to increase the throughput of the system and is suitable for processing an input image of 320 × 240 pixels in real time using Field Programmable Gate Array (FPGA) technology. Quartus software is used to implement, verify, and analyze the design's performance using the VHSIC Hardware Description Language. The system's predicted processing time is faster than that of the modern iris unwrapping technique used today.
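For reference, the integer-only circle rasterization that the abstract names can be sketched in software as follows. This shows only the classic Bresenham (midpoint) circle algorithm; how the paper's FPGA design orders and samples these pixels to unwrap the iris is not described in the abstract and is not reproduced here.

```python
def bresenham_circle(cx, cy, radius):
    """Return the set of integer pixel coordinates on a circle, using integer arithmetic only."""
    points = set()
    x, y = 0, radius
    d = 3 - 2 * radius  # decision variable
    while x <= y:
        # Mirror the computed octant point into all eight octants.
        for dx, dy in ((x, y), (y, x), (-x, y), (-y, x),
                       (x, -y), (y, -x), (-x, -y), (-y, -x)):
            points.add((cx + dx, cy + dy))
        if d < 0:
            d += 4 * x + 6
        else:
            d += 4 * (x - y) + 10
            y -= 1
        x += 1
    return points


# Pixels of a radius-3 circle centred at the origin.
print(sorted(bresenham_circle(0, 0, 3)))
```

Because the loop uses only additions, comparisons, and shifts by constants, it maps naturally onto hardware, which is a plausible reason for preferring it to a polar conversion that needs trigonometric evaluation.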
ERIC Educational Resources Information Center
Hoover, Jill R.
2018-01-01
Purpose: The purpose of the current study was to determine the effect of neighborhood density and syntactic class on word recognition in children with specific language impairment (SLI) and typical development (TD). Method: Fifteen children with SLI ("M" age = 6;5 [years;months]) and 15 with TD ("M" age = 6;4) completed a…
ERIC Educational Resources Information Center
Wu, Shiyu; Ma, Zheng
2017-01-01
Previous research has indicated that, in viewing a visual word, the activated phonological representation in turn activates its homophone, causing semantic interference. Using this mechanism of phonological mediation, this study investigated native-language phonological interference in visual recognition of Chinese two-character compounds by early…
ERIC Educational Resources Information Center
Heimann, Mikael; Strid, Karin; Smith, Lars; Tjus, Tomas; Ulvund, Stein Erik; Meltzoff, Andrew N.
2006-01-01
The relationship between recall memory, visual recognition memory, social communication, and the emergence of language skills was measured in a longitudinal study. Thirty typically developing Swedish children were tested at 6, 9 and 14 months. The result showed that, in combination, visual recognition memory at 6 months, deferred imitation at 9…
ERIC Educational Resources Information Center
Rispens, Judith; Baker, Anne; Duinmeijer, Iris
2015-01-01
Purpose: The effects of neighborhood density (ND) and lexical frequency on word recognition and the effects of phonotactic probability (PP) on nonword repetition (NWR) were examined to gain insight into processing at the lexical and sublexical levels in typically developing (TD) children and children with developmental language problems. Method:…
Computer-Mediated Input, Output and Feedback in the Development of L2 Word Recognition from Speech
ERIC Educational Resources Information Center
Matthews, Joshua; Cheng, Junyu; O'Toole, John Mitchell
2015-01-01
This paper reports on the impact of computer-mediated input, output and feedback on the development of second language (L2) word recognition from speech (WRS). A quasi-experimental pre-test/treatment/post-test research design was used involving three intact tertiary level English as a Second Language (ESL) classes. Classes were either assigned to…
Optimizing estimation of hemispheric dominance for language using magnetic source imaging
Passaro, Antony D.; Rezaie, Roozbeh; Moser, Dana C.; Li, Zhimin; Dias, Nadeeka; Papanicolaou, Andrew C.
2011-01-01
The efficacy of magnetoencephalography (MEG) as an alternative to invasive methods for investigating the cortical representation of language has been explored in several studies. Recently, studies comparing MEG to the gold standard Wada procedure have found inconsistent and often less-than accurate estimates of laterality across various MEG studies. Here we attempted to address this issue among normal right-handed adults (N=12) by supplementing a well-established MEG protocol involving word recognition and the single dipole method with a sentence comprehension task and a beamformer approach localizing neural oscillations. Beamformer analysis of word recognition and sentence comprehension tasks revealed a desynchronization in the 10–18 Hz range, localized to the temporo-parietal cortices. Inspection of individual profiles of localized desynchronization (10–18 Hz) revealed left hemispheric dominance in 91.7% and 83.3% of individuals during the word recognition and sentence comprehension tasks, respectively. In contrast, single dipole analysis yielded lower estimates, such that activity in temporal language regions was left-lateralized in 66.7% and 58.3% of individuals during word recognition and sentence comprehension, respectively. The results obtained from the word recognition task and localization of oscillatory activity using a beamformer appear to be in line with general estimates of left hemispheric dominance for language in normal right-handed individuals. Furthermore, the current findings support the growing notion that changes in neural oscillations underlie critical components of linguistic processing. PMID:21890118
Online Collaborative Communities of Learning for Pre-Service Teachers of Languages
ERIC Educational Resources Information Center
Morgan, Anne-Marie
2015-01-01
University programs for preparing preservice teachers of languages for teaching in schools generally involve generic pedagogy, methodology, curriculum, programming and issues foci, that provide a bridge between the study of languages (or recognition of existing language proficiency) and the teaching of languages. There is much territory to cover…
Caballero-Morales, Santiago-Omar
2013-01-01
An approach for the recognition of emotions in speech is presented. The target language is Mexican Spanish, and for this purpose a speech database was created. The approach consists in the phoneme acoustic modelling of emotion-specific vowels. For this, a standard phoneme-based Automatic Speech Recognition (ASR) system was built with Hidden Markov Models (HMMs), where different phoneme HMMs were built for the consonants and emotion-specific vowels associated with four emotional states (anger, happiness, neutral, sadness). Then, estimation of the emotional state from a spoken sentence is performed by counting the number of emotion-specific vowels found in the ASR's output for the sentence. With this approach, accuracy of 87–100% was achieved for the recognition of emotional state of Mexican Spanish speech. PMID:23935410
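The final decision rule in this abstract, counting emotion-specific vowels in the recognizer output, can be sketched as below. The label format (`a_anger`, `e_neutral`, and so on) is a hypothetical convention chosen for the example, and ties between emotions are resolved arbitrarily here; the abstract does not say how the original system handles either.

```python
from collections import Counter

EMOTIONS = ("anger", "happiness", "neutral", "sadness")


def estimate_emotion(phonemes):
    """Return the emotion whose tagged vowels occur most often, or None if none occur.

    Vowel labels are assumed to look like '<vowel>_<emotion>'; untagged labels
    (consonants) are ignored.
    """
    counts = Counter()
    for label in phonemes:
        _, separator, tag = label.partition("_")
        if separator and tag in EMOTIONS:
            counts[tag] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical recognizer output for one sentence.
print(estimate_emotion(["k", "a_anger", "s", "a_anger", "e_neutral"]))
```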
Hierarchically Structured Non-Intrusive Sign Language Recognition. Chapter 2
NASA Technical Reports Server (NTRS)
Zieren, Jorg; Kraiss, Karl-Friedrich
2007-01-01
This work presents a hierarchically structured approach to the nonintrusive recognition of sign language from a monocular frontal view. Robustness is achieved through sophisticated localization and tracking methods, including a combined EM/CAMSHIFT overlap resolution procedure and the parallel pursuit of multiple hypotheses about hand position and movement. This allows handling of ambiguities and automatically corrects tracking errors. A biomechanical skeleton model and dynamic motion prediction using Kalman filters represent high-level knowledge. Classification is performed by Hidden Markov Models. 152 signs from German sign language were recognized with an accuracy of 97.6%.
2013-01-01
Background: Whereas the most influential journals in psychiatry are English language journals, periodicals published in other languages serve an important purpose for local communities of clinicians and researchers. This study aimed at analyzing the scientific production and the recognition of non-English general psychiatry journals. Methods: In a cohort study, the 2009 volume of ten journals from Brazil (1), German language countries (5), France (2), Italy (1), and Poland (1) was searched for original articles. Patterns of citations to these articles during 2010 and 2011 as documented in Web of Science were analyzed. Results: The journals published 199 original articles (range: 4–46), mostly observational studies. Half of the papers were cited in the following two years. There were 246 citations received, or an average of 1.25 cites per article (range: 0.25-4.04). Many of these citations came from the local community, that is, from the same authors and journals. Citations by other periodicals and other authors accounted for 36% [95%-CI: 30%-42%], citations in English sources for 33% [28%-39%] of all quotations. There was considerable heterogeneity with regard to citations received among the ten journals investigated. Conclusion: Non-English language general psychiatry journals contribute substantially to the body of research. However, recognition, and in particular recognition by the international research community is moderate. PMID:23531084
The Mechanical Recognition of Speech: Prospects for Use in the Teaching of Languages.
ERIC Educational Resources Information Center
Pulliam, Robert
1970-01-01
This paper begins with a brief account of the development of automatic speech recognition (ASR) and then proceeds to an examination of ASR systems typical of the kind now in operation. It is stressed that such systems, although highly developed, do not recognize speech in the same sense as the human being does, and that they cannot deal with a…
Advocate: A Distributed Architecture for Speech-to-Speech Translation
2009-01-01
tecture, are either wrapped natural-language processing ( NLP ) components or objects developed from scratch using the architecture’s API. GATE is...framework, we put together a demonstration Arabic -to- English speech translation system using both internally developed ( Arabic speech recognition and MT...conditions of our Arabic S2S demonstration system described earlier. Once again, the data size was varied and eighty identical requests were
University of Colorado Dialog Systems for Travel and Navigation
2001-01-01
understanding technologies using the DARPA Hub Architecture. Users are able to converse with an automated travel agent over the phone to retrieve up-to-date...travel information such as flight schedules, pricing, along with hotel and rental car availability. The CU Communicator has been under development...implementation of the DARPA Communicator task [3]. The system combines continuous speech recognition, natural language understanding and flexible dialogue
A neuropsychological perspective on the link between language and praxis in modern humans
Roby-Brami, Agnes; Hermsdörfer, Joachim; Roy, Alice C.; Jacobs, Stéphane
2012-01-01
Hypotheses about the emergence of human cognitive abilities postulate strong evolutionary links between language and praxis, including the possibility that language was originally gestural. The present review considers functional and neuroanatomical links between language and praxis in brain-damaged patients with aphasia and/or apraxia. The neural systems supporting these functions are predominantly located in the left hemisphere. There are many parallels between action and language for recognition, imitation and gestural communication suggesting that they rely partially on large, common networks, differentially recruited depending on the nature of the task. However, this relationship is not unequivocal and the production and understanding of gestural communication are dependent on the context in apraxic patients and remains to be clarified in aphasic patients. The phonological, semantic and syntactic levels of language seem to share some common cognitive resources with the praxic system. In conclusion, neuropsychological observations do not allow support or rejection of the hypothesis that gestural communication may have constituted an evolutionary link between tool use and language. Rather they suggest that the complexity of human behaviour is based on large interconnected networks and on the evolution of specific properties within strategic areas of the left cerebral hemisphere. PMID:22106433
Zazo, Ruben; Lozano-Diez, Alicia; Gonzalez-Dominguez, Javier; T. Toledano, Doroteo; Gonzalez-Rodriguez, Joaquin
2016-01-01
Long Short Term Memory (LSTM) Recurrent Neural Networks (RNNs) have recently outperformed other state-of-the-art approaches, such as i-vector and Deep Neural Networks (DNNs), in automatic Language Identification (LID), particularly when dealing with very short utterances (∼3s). In this contribution we present an open-source, end-to-end, LSTM RNN system running on limited computational resources (a single GPU) that outperforms a reference i-vector system on a subset of the NIST Language Recognition Evaluation (8 target languages, 3s task) by up to 26%. This result is in line with previously published research using proprietary LSTM implementations and huge computational resources, which made those earlier results hard to reproduce. Further, we extend those previous experiments by modeling unseen languages (out of set, OOS, modeling), which is crucial in real applications. Results show that an LSTM RNN with OOS modeling is able to detect these languages and generalizes robustly to unseen OOS languages. Finally, we also analyze the effect of even more limited test data (from 2.25s to 0.1s), showing that with as little as 0.5s an accuracy of over 50% can be achieved. PMID:26824467
Biological origins of color categorization.
Skelton, Alice E; Catchpole, Gemma; Abbott, Joshua T; Bosten, Jenny M; Franklin, Anna
2017-05-23
The biological basis of the commonality in color lexicons across languages has been hotly debated for decades. Prior evidence that infants categorize color could provide support for the hypothesis that color categorization systems are not purely constructed by communication and culture. Here, we investigate the relationship between infants' categorization of color and the commonality across color lexicons, and the potential biological origin of infant color categories. We systematically mapped infants' categorical recognition memory for hue onto a stimulus array used previously to document the color lexicons of 110 nonindustrialized languages. Following familiarization to a given hue, infants' response to a novel hue indicated that their recognition memory parses the hue continuum into red, yellow, green, blue, and purple categories. Infants' categorical distinctions aligned with common distinctions in color lexicons and are organized around hues that are commonly central to lexical categories across languages. The boundaries between infants' categorical distinctions also aligned, relative to the adaptation point, with the cardinal axes that describe the early stages of color representation in retinogeniculate pathways, indicating that infant color categorization may be partly organized by biological mechanisms of color vision. The findings suggest that color categorization in language and thought is partially biologically constrained and have implications for broader debate on how biology, culture, and communication interact in human cognition.
Kim, Seongjung; Kim, Jongman; Ahn, Soonjae; Kim, Youngho
2018-04-18
Deaf people use sign or finger languages for communication, but these methods of communication are very specialized. For this reason, the deaf can suffer from social inequalities and financial losses due to their communication restrictions. In this study, we developed a finger language recognition algorithm based on an ensemble artificial neural network (E-ANN) using an armband system with 8-channel electromyography (EMG) sensors. The developed algorithm was composed of signal acquisition, filtering, segmentation, feature extraction and an E-ANN-based classifier that was evaluated with the Korean finger language (14 consonants, 17 vowels and 7 numbers) in 17 subjects. The E-ANN was categorized according to the number of classifiers (1 to 10) and the size of the training data (50 to 1500). The accuracy of the E-ANN-based classifier was obtained by 5-fold cross validation and compared with an artificial neural network (ANN)-based classifier. As the number of classifiers (1 to 8) and the size of the training data (50 to 300) increased, the average accuracy of the E-ANN-based classifier increased and the standard deviation decreased. The optimal E-ANN was composed of eight classifiers with a training data size of 300, and the accuracy of the E-ANN was significantly higher than that of the general ANN.
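The ensemble idea in the abstract above can be sketched as follows. This is a minimal illustration that assumes the classifiers' outputs are combined by majority voting; the abstract does not state how the E-ANN combines its member networks, and the names `majority_vote` and `ensemble_predict` are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label among the classifiers' predictions.

    Ties are broken in favor of the label that appears first in the list.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def ensemble_predict(classifiers, sample):
    """Apply every classifier (any callable) to the sample and vote."""
    return majority_vote([clf(sample) for clf in classifiers])
```

Under this assumption, adding classifiers reduces the influence of any single network's errors, which is consistent with the reported drop in standard deviation as the ensemble grows.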
Biological origins of color categorization
Catchpole, Gemma; Abbott, Joshua T.; Bosten, Jenny M.; Franklin, Anna
2017-01-01
The biological basis of the commonality in color lexicons across languages has been hotly debated for decades. Prior evidence that infants categorize color could provide support for the hypothesis that color categorization systems are not purely constructed by communication and culture. Here, we investigate the relationship between infants’ categorization of color and the commonality across color lexicons, and the potential biological origin of infant color categories. We systematically mapped infants’ categorical recognition memory for hue onto a stimulus array used previously to document the color lexicons of 110 nonindustrialized languages. Following familiarization to a given hue, infants’ response to a novel hue indicated that their recognition memory parses the hue continuum into red, yellow, green, blue, and purple categories. Infants’ categorical distinctions aligned with common distinctions in color lexicons and are organized around hues that are commonly central to lexical categories across languages. The boundaries between infants’ categorical distinctions also aligned, relative to the adaptation point, with the cardinal axes that describe the early stages of color representation in retinogeniculate pathways, indicating that infant color categorization may be partly organized by biological mechanisms of color vision. The findings suggest that color categorization in language and thought is partially biologically constrained and have implications for broader debate on how biology, culture, and communication interact in human cognition. PMID:28484022
Practical vision based degraded text recognition system
NASA Astrophysics Data System (ADS)
Mohammad, Khader; Agaian, Sos; Saleh, Hani
2011-02-01
Rapid growth and progress in the medical, industrial, security and technology fields means more and more consideration for the use of camera-based optical character recognition (OCR). Applying OCR to scanned documents is quite mature, and there are many commercial and research products available on this topic. These products achieve acceptable recognition accuracy and reasonable processing times, especially with trained software and constrained text characteristics. Even though the application space for OCR is huge, it is quite challenging to design a single system that is capable of performing automatic OCR for text embedded in an image irrespective of the application. Challenges for OCR systems include images taken under natural real-world conditions, surface curvature, text orientation, font, size, lighting conditions, and noise. These and many other conditions make it extremely difficult to achieve reasonable character recognition. Performance for conventional OCR systems drops dramatically as the degradation level of the text image quality increases. In this paper, a new recognition method is proposed to recognize solid or dotted line degraded characters. The degraded text string is localized and segmented using a new algorithm. The new method was implemented and tested using a development framework system that is capable of performing OCR on camera-captured images. The framework allows parameter tuning of the image-processing algorithm based on a training set of camera-captured text images. Novel methods were used for enhancement, text localization and segmentation, which enables building a custom system that is capable of performing automatic OCR and can be used for different applications.
The developed framework system includes new image enhancement, filtering, and segmentation techniques, which enabled higher recognition accuracies, faster processing time, and lower energy consumption compared with the best state-of-the-art published techniques. The system successfully produced impressive OCR accuracies (90% to 93%) using customized systems generated by our development framework in two industrial OCR applications: water bottle label text recognition and concrete slab plate text recognition. The system was also trained for the Arabic language alphabet, and demonstrated extremely high recognition accuracy (99%) for Arabic license name plate text recognition with processing times of 10 seconds. The accuracy and run times of the system were compared to conventional and many state-of-the-art methods; the proposed system shows excellent results.
The Suitability of Cloud-Based Speech Recognition Engines for Language Learning
ERIC Educational Resources Information Center
Daniels, Paul; Iwago, Koji
2017-01-01
As online automatic speech recognition (ASR) engines become more accurate and more widely implemented with CALL software, it becomes important to evaluate the effectiveness and the accuracy of these recognition engines using authentic speech samples. This study investigates two of the most prominent cloud-based speech recognition engines--Apple's…
Recognition of face identity and emotion in expressive specific language impairment.
Merkenschlager, A; Amorosa, H; Kiefl, H; Martinius, J
2012-01-01
To study face and emotion recognition in children with mostly expressive specific language impairment (SLI-E). A test movie to study perception and recognition of faces and mimic-gestural expression was applied to 24 children diagnosed as suffering from SLI-E and an age-matched control group of normally developing children. Compared to a normal control group, the SLI-E children scored significantly worse in both the face and expression recognition tasks with a preponderant effect on emotion recognition. The performance of the SLI-E group could not be explained by reduced attention during the test session. We conclude that SLI-E is associated with a deficiency in decoding non-verbal emotional facial and gestural information, which might lead to profound and persistent problems in social interaction and development. Copyright © 2012 S. Karger AG, Basel.
ERIC Educational Resources Information Center
Fengler, Ineke; Delfau, Pia-Céline; Röder, Brigitte
2018-01-01
It is yet unclear whether congenitally deaf cochlear implant (CD CI) users' visual and multisensory emotion perception is influenced by their history in sign language acquisition. We hypothesized that early-signing CD CI users, relative to late-signing CD CI users and hearing, non-signing controls, show better facial expression recognition and…
ERIC Educational Resources Information Center
McBride-Chang, Catherine; Lam, Fanny; Lam, Catherine; Doo, Sylvia; Wong, Simpson W. L.; Chow, Yvonne Y. Y.
2008-01-01
Background: This study sought to identify cognitive abilities that might distinguish Hong Kong Chinese kindergarten children at risk for dyslexia through either language delay or familial history of dyslexia from children who were not at risk and to examine how these abilities were associated with Chinese word recognition. The cognitive skills of…
A Prerequisite to L1 Homophone Effects in L2 Spoken-Word Recognition
ERIC Educational Resources Information Center
Nakai, Satsuki; Lindsay, Shane; Ota, Mitsuhiko
2015-01-01
When both members of a phonemic contrast in L2 (second language) are perceptually mapped to a single phoneme in one's L1 (first language), L2 words containing a member of that contrast can spuriously activate L2 words in spoken-word recognition. For example, upon hearing cattle, Dutch speakers of English are reported to experience activation…
A Set of Handwriting Features for Use in Automated Writer Identification.
Miller, John J; Patterson, Robert Bradley; Gantz, Donald T; Saunders, Christopher P; Walch, Mark A; Buscaglia, JoAnn
2017-05-01
A writer's biometric identity can be characterized through the distribution of physical feature measurements ("writer's profile"); a graph-based system that facilitates the quantification of these features is described. To accomplish this quantification, handwriting is segmented into basic graphical forms ("graphemes"), which are "skeletonized" to yield the graphical topology of the handwritten segment. The graph-based matching algorithm compares the graphemes first by their graphical topology and then by their geometric features. Graphs derived from known writers can be compared against graphs extracted from unknown writings. The process is computationally intensive and relies heavily upon statistical pattern recognition algorithms. This article focuses on the quantification of these physical features and the construction of the associated pattern recognition methods for using the features to discriminate among writers. The graph-based system described in this article has been implemented in a highly accurate and approximately language-independent biometric recognition system of writers of cursive documents. © 2017 American Academy of Forensic Sciences.
Recognition of speaker-dependent continuous speech with KEAL
NASA Astrophysics Data System (ADS)
Mercier, G.; Bigorgne, D.; Miclet, L.; Le Guennec, L.; Querre, M.
1989-04-01
A description of the speaker-dependent continuous speech recognition system KEAL is given. An unknown utterance is recognized by means of the following procedures: acoustic analysis, phonetic segmentation and identification, word and sentence analysis. The combination of feature-based, speaker-independent coarse phonetic segmentation with speaker-dependent statistical classification techniques is one of the main design features of the acoustic-phonetic decoder. The lexical access component is essentially based on a statistical dynamic programming technique which aims at matching a phonemic lexical entry containing various phonological forms, against a phonetic lattice. Sentence recognition is achieved by use of a context-free grammar and a parsing algorithm derived from Earley's parser. A speaker adaptation module allows some of the system parameters to be adjusted by matching known utterances with their acoustical representation. The task to be performed, described by its vocabulary and its grammar, is given as a parameter of the system. Continuously spoken sentences extracted from a 'pseudo-Logo' language are analyzed and results are presented.
ERIC Educational Resources Information Center
Parks, Elizabeth
2015-01-01
Linguistic ideologies that are left unquestioned and unexplored, especially as reflected and produced in marginalized language communities, can contribute to inequality made real in decisions about languages and the people who use them. One of the primary bodies of knowledge guiding international language policy is the International Organization…
Corina, David P.; Grosvald, Michael
2011-01-01
In this paper, we compare responses of deaf signers and hearing non-signers engaged in a categorization task of signs and non-linguistic human actions. We examine the time it takes to make such categorizations under conditions of 180-degree stimulus inversion and as a function of repetition priming, in an effort to understand whether the processing of sign language forms draws upon special processing mechanisms or makes use of mechanisms used in recognition of non-linguistic human actions. Our data show that deaf signers were much faster in the categorization of both linguistic and non-linguistic actions, and relative to hearing non-signers, show evidence that they were more sensitive to the configural properties of signs. Our study suggests that sign expertise may lead to modifications of a general-purpose human action recognition system rather than evoking a qualitatively different mode of processing, and supports the contention that signed languages make use of perceptual systems through which humans understand or parse human actions and gestures more generally. PMID:22153323
Sign Perception and Recognition in Non-Native Signers of ASL
Morford, Jill P.; Carlson, Martina L.
2011-01-01
Past research has established that delayed first language exposure is associated with comprehension difficulties in non-native signers of American Sign Language (ASL) relative to native signers. The goal of the current study was to investigate potential explanations of this disparity: do non-native signers have difficulty with all aspects of comprehension, or are their comprehension difficulties restricted to some aspects of processing? We compared the performance of deaf non-native, hearing L2, and deaf native signers on a handshape and location monitoring and a sign recognition task. The results indicate that deaf non-native signers are as rapid and accurate on the monitoring task as native signers, with differences in the pattern of relative performance across handshape and location parameters. By contrast, non-native signers differ significantly from native signers during sign recognition. Hearing L2 signers, who performed almost as well as the two groups of deaf signers on the monitoring task, resembled the deaf native signers more than the deaf non-native signers on the sign recognition task. The combined results indicate that delayed exposure to a signed language leads to an overreliance on handshape during sign recognition. PMID:21686080
Ahmad, Riaz; Naz, Saeeda; Afzal, Muhammad Zeshan; Amin, Sayed Hassan; Breuel, Thomas
2015-01-01
The presence of a large number of unique shapes called ligatures in cursive languages, along with variations due to scaling, orientation and location, provides one of the most challenging pattern recognition problems. Recognition of the large number of ligatures is often a complicated task in oriental languages such as Pashto, Urdu, Persian and Arabic. Research on cursive script recognition often ignores the fact that scaling, orientation, location and font variations are common in printed cursive text. Therefore, these variations are not included in image databases and in experimental evaluations. This research uncovers challenges faced by Arabic cursive script recognition in a holistic framework by considering Pashto as a test case, because the Pashto language has a larger alphabet set than Arabic, Persian and Urdu. A database containing 8000 images of 1000 unique ligatures having scaling, orientation and location variations is introduced. In this article, a feature space based on scale invariant feature transform (SIFT) along with a segmentation framework has been proposed for overcoming the above-mentioned challenges. The experimental results show a significantly improved performance of the proposed scheme over traditional feature extraction techniques such as principal component analysis (PCA). PMID:26368566
Cheng, Juan; Chen, Xun; Liu, Aiping; Peng, Hu
2015-01-01
Sign language recognition (SLR) is an important communication tool between the deaf and the external world. It is highly necessary to develop a worldwide continuous and large-vocabulary-scale SLR system for practical usage. In this paper, we propose a novel phonology- and radical-coded Chinese SLR framework to demonstrate the feasibility of continuous SLR using accelerometer (ACC) and surface electromyography (sEMG) sensors. The continuous Chinese characters, consisting of coded sign gestures, are first segmented into active segments using EMG signals by means of a moving average algorithm. Then, features of each component are extracted from both ACC and sEMG signals of active segments (i.e., palm orientation represented by the mean and variance of ACC signals, hand movement represented by the fixed-point ACC sequence, and hand shape represented by both the mean absolute value (MAV) and autoregressive model coefficients (ARs)). Afterwards, palm orientation is first classified, distinguishing “Palm Downward” sign gestures from “Palm Inward” ones. Only the “Palm Inward” gestures are sent for further hand movement and hand shape recognition by the dynamic time warping (DTW) algorithm and hidden Markov models (HMM) respectively. Finally, component recognition results are integrated to identify one certain coded gesture. Experimental results demonstrate that the proposed SLR framework with a vocabulary scale of 223 characters can achieve an averaged recognition accuracy of 96.01% ± 0.83% for coded gesture recognition tasks and 92.73% ± 1.47% for character recognition tasks. In addition, the results demonstrate that sEMG signals are rather consistent for a given hand shape independent of hand movements. 
Hence, the number of training samples will not be significantly increased when the vocabulary scale increases, since not only is the number of completely new proposed coded gestures constant and limited, but also the transition movement which connects successive signs needs no training samples to model, even when the same coded gesture is performed in different characters. This work opens up a possible new way to realize a practical Chinese SLR system. PMID:26389907
Cheng, Juan; Chen, Xun; Liu, Aiping; Peng, Hu
2015-09-15
Sign language recognition (SLR) is an important communication tool between the deaf and the external world. It is highly necessary to develop a worldwide continuous and large-vocabulary-scale SLR system for practical usage. In this paper, we propose a novel phonology- and radical-coded Chinese SLR framework to demonstrate the feasibility of continuous SLR using accelerometer (ACC) and surface electromyography (sEMG) sensors. The continuous Chinese characters, consisting of coded sign gestures, are first segmented into active segments using EMG signals by means of a moving average algorithm. Then, features of each component are extracted from both ACC and sEMG signals of active segments (i.e., palm orientation represented by the mean and variance of ACC signals, hand movement represented by the fixed-point ACC sequence, and hand shape represented by both the mean absolute value (MAV) and autoregressive model coefficients (ARs)). Afterwards, palm orientation is first classified, distinguishing "Palm Downward" sign gestures from "Palm Inward" ones. Only the "Palm Inward" gestures are sent for further hand movement and hand shape recognition by the dynamic time warping (DTW) algorithm and hidden Markov models (HMM) respectively. Finally, component recognition results are integrated to identify one certain coded gesture. Experimental results demonstrate that the proposed SLR framework with a vocabulary scale of 223 characters can achieve an averaged recognition accuracy of 96.01% ± 0.83% for coded gesture recognition tasks and 92.73% ± 1.47% for character recognition tasks. In addition, the results demonstrate that sEMG signals are rather consistent for a given hand shape independent of hand movements. 
Hence, the number of training samples will not be significantly increased when the vocabulary scale increases, since not only is the number of completely new proposed coded gestures constant and limited, but also the transition movement which connects successive signs needs no training samples to model, even when the same coded gesture is performed in different characters. This work opens up a possible new way to realize a practical Chinese SLR system.
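The segmentation and hand-shape feature steps described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: a trailing moving average of the rectified signal and a fixed threshold stand in for the paper's moving average algorithm, whose window length and decision rule are not given in the abstract, and the function names are hypothetical.

```python
def moving_average(signal, window):
    """Trailing moving average of the rectified (absolute-value) signal."""
    out = []
    acc = 0.0
    for i, x in enumerate(signal):
        acc += abs(x)
        if i >= window:
            acc -= abs(signal[i - window])  # drop the sample leaving the window
        out.append(acc / min(i + 1, window))
    return out


def active_segments(signal, window, threshold):
    """Return (start, end) index pairs, end exclusive, where the smoothed
    envelope stays above the (assumed) fixed threshold."""
    envelope = moving_average(signal, window)
    segments = []
    start = None
    for i, value in enumerate(envelope):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(envelope)))
    return segments


def mean_absolute_value(segment):
    """MAV feature of one non-empty segment of sEMG samples."""
    return sum(abs(x) for x in segment) / len(segment)
```

For a single channel, one would call `active_segments` on the raw sEMG samples and then compute `mean_absolute_value` on each returned slice; a multi-channel system would repeat this per channel or on a summed envelope.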
2012-01-01
Background We introduce the linguistic annotation of a corpus of 97 full-text biomedical publications, known as the Colorado Richly Annotated Full Text (CRAFT) corpus. We further assess the performance of existing tools for performing sentence splitting, tokenization, syntactic parsing, and named entity recognition on this corpus. Results Many biomedical natural language processing systems demonstrated large differences between their previously published results and their performance on the CRAFT corpus when tested with the publicly available models or rule sets. Trainable systems differed widely with respect to their ability to build high-performing models based on this data. Conclusions The finding that some systems were able to train high-performing models based on this corpus is additional evidence, beyond high inter-annotator agreement, that the quality of the CRAFT corpus is high. The overall poor performance of various systems indicates that considerable work needs to be done to enable natural language processing systems to work well when the input is full-text journal articles. The CRAFT corpus provides a valuable resource to the biomedical natural language processing community for evaluation and training of new models for biomedical full text publications. PMID:22901054
McCreery, Ryan W; Walker, Elizabeth A; Spratford, Meredith; Oleson, Jacob; Bentler, Ruth; Holte, Lenore; Roush, Patricia
2015-01-01
Progress has been made in recent years in the provision of amplification and early intervention for children who are hard of hearing. However, children who use hearing aids (HAs) may have inconsistent access to their auditory environment due to limitations in speech audibility through their HAs or limited HA use. The effects of variability in children's auditory experience on parent-reported auditory skills questionnaires and on speech recognition in quiet and in noise were examined for a large group of children who were followed as part of the Outcomes of Children with Hearing Loss study. Parent ratings on auditory development questionnaires and children's speech recognition were assessed for 306 children who are hard of hearing. Children ranged in age from 12 months to 9 years. Three questionnaires involving parent ratings of auditory skill development and behavior were used, including the LittlEARS Auditory Questionnaire, Parents Evaluation of Oral/Aural Performance in Children rating scale, and an adaptation of the Speech, Spatial, and Qualities of Hearing scale. Speech recognition in quiet was assessed using the Open- and Closed-Set Test, Early Speech Perception test, Lexical Neighborhood Test, and Phonetically Balanced Kindergarten word lists. Speech recognition in noise was assessed using the Computer-Assisted Speech Perception Assessment. Children who are hard of hearing were compared with peers with normal hearing matched for age, maternal educational level, and nonverbal intelligence. The effects of aided audibility, HA use, and language ability on parent responses to auditory development questionnaires and on children's speech recognition were also examined. Children who are hard of hearing had poorer performance than peers with normal hearing on parent ratings of auditory skills and had poorer speech recognition. Significant individual variability among children who are hard of hearing was observed. 
Children with greater aided audibility through their HAs, more hours of HA use, and better language abilities generally had higher parent ratings of auditory skills and better speech-recognition abilities in quiet and in noise than peers with less audibility, more limited HA use, or poorer language abilities. In addition to the auditory and language factors that were predictive for speech recognition in quiet, phonological working memory was also a positive predictor for word recognition abilities in noise. Children who are hard of hearing continue to experience delays in auditory skill development and speech-recognition abilities compared with peers with normal hearing. However, significant improvements in these domains have occurred in comparison to similar data reported before the adoption of universal newborn hearing screening and early intervention programs for children who are hard of hearing. Increasing the audibility of speech has a direct positive effect on auditory skill development and speech-recognition abilities and also may enhance these skills by improving language abilities in children who are hard of hearing. Greater number of hours of HA use also had a significant positive impact on parent ratings of auditory skills and children's speech recognition.
Jiang, Min; Chen, Yukun; Liu, Mei; Rosenbloom, S Trent; Mani, Subramani; Denny, Joshua C; Xu, Hua
2011-01-01
The authors' goal was to develop and evaluate machine-learning-based approaches to extracting clinical entities (including medical problems, tests, and treatments, as well as their asserted status) from hospital discharge summaries written using natural language. This project was part of the 2010 Center of Informatics for Integrating Biology and the Bedside/Veterans Affairs (VA) natural-language-processing challenge. The authors implemented a machine-learning-based named entity recognition system for clinical text and systematically evaluated the contributions of different types of features and ML algorithms, using a training corpus of 349 annotated notes. Based on the results from training data, the authors developed a novel hybrid clinical entity extraction system, which integrated heuristic rule-based modules with the ML-based named entity recognition module. The authors applied the hybrid system to the concept extraction and assertion classification tasks in the challenge and evaluated its performance using a test data set with 477 annotated notes. Standard measures including precision, recall, and F-measure were calculated using the evaluation script provided by the Center of Informatics for Integrating Biology and the Bedside/VA challenge organizers. The overall performance for all three types of clinical entities and all six types of assertions across 477 annotated notes was considered the primary metric in the challenge. Systematic evaluation on the training set showed that Conditional Random Fields outperformed Support Vector Machines, and semantic information from existing natural-language-processing systems largely improved performance, although contributions from different types of features varied. 
The authors' hybrid entity extraction system achieved a maximum overall F-score of 0.8391 for concept extraction (ranked second) and 0.9313 for assertion classification (ranked fourth, but not statistically different than the first three systems) on the test data set in the challenge.
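The precision, recall, and F-measure named in the abstract above can be illustrated with a minimal sketch. This is not the challenge's evaluation script; the function name and the counts are hypothetical, and the balanced F-measure (F1) is assumed.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) computed from raw match counts."""
    predicted = true_positives + false_positives   # entities the system output
    actual = true_positives + false_negatives      # entities in the gold annotation
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for illustration only; not taken from the challenge results.
p, r, f = precision_recall_f1(true_positives=80, false_positives=20, false_negatives=10)
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

Because F1 is a harmonic mean, it stays close to the lower of the two values, so a system cannot score well by trading away recall for precision or the reverse.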
Working Memory and Language Learning: A Review
ERIC Educational Resources Information Center
Archibald, Lisa M. D.
2017-01-01
Children with speech, language, and communication needs (SLCN) form a highly heterogeneous group, including those with an unexplained delay in language development known as specific language impairment (SLI). There is growing recognition that multiple mechanisms underlie the range of profiles observed in these children. Broadly speaking, both the…
Visual Word Recognition Across the Adult Lifespan
Cohen-Shikora, Emily R.; Balota, David A.
2016-01-01
The current study examines visual word recognition in a large sample (N = 148) across the adult lifespan and across a large set of stimuli (N = 1187) in three different lexical processing tasks (pronunciation, lexical decision, and animacy judgments). Although the focus of the present study is on the influence of word frequency, a diverse set of other variables are examined as the system ages and acquires more experience with language. Computational models and conceptual theories of visual word recognition and aging make differing predictions for age-related changes in the system. However, these have been difficult to assess because prior studies have produced inconsistent results, possibly due to sample differences, analytic procedures, and/or task-specific processes. The current study confronts these potential differences by using three different tasks, treating age and word variables as continuous, and exploring the influence of individual differences such as vocabulary, vision, and working memory. The primary finding is remarkable stability in the influence of a diverse set of variables on visual word recognition across the adult age spectrum. This pattern is discussed in reference to previous inconsistent findings in the literature and implications for current models of visual word recognition. PMID:27336629
Development of coffee maker service robot using speech and face recognition systems using POMDP
NASA Astrophysics Data System (ADS)
Budiharto, Widodo; Meiliana; Santoso Gunawan, Alexander Agung
2016-07-01
Many intelligent service robots have been developed to interact naturally with users. This can be achieved by giving the robot speech and face recognition abilities for specific tasks. In this research, we propose an Intelligent Coffee Maker Robot whose speech recognition is based on the Indonesian language and powered by statistical dialogue systems. This kind of robot can be used in an office, supermarket, or restaurant. In our scenario, the robot recognizes the user's face and then accepts commands from the user to perform an action, specifically making coffee. Based on our previous work, accuracy in laboratory experiments is about 86% for speech recognition and about 93% for face recognition. The main problem here is determining how sweet the user wants the coffee. The robot must infer the user's intention through conversation, relying on unreliable automatic speech recognition in a noisy environment. In this paper, this spoken dialog problem is treated as a partially observable Markov decision process (POMDP). We describe how this formulation establishes a promising framework, supported by empirical results. Dialog simulations are presented that demonstrate significant quantitative outcomes.
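The core of a POMDP dialog formulation such as the one described above is a belief over the user's hidden goal that is updated after each noisy observation. The following is a minimal sketch, not the authors' system: the goal names, the observation model, and the reuse of the reported 86% speech accuracy as a per-word confusion probability are all assumptions made for illustration. The user's goal is assumed fixed during the dialog, so the transition step of the full POMDP update reduces to the identity.

```python
def update_belief(belief, observation, obs_model):
    """Bayes update of a belief over hidden user goals.

    belief:    dict goal -> probability
    obs_model: dict goal -> dict observation -> P(observation | goal)
    """
    unnormalized = {
        goal: belief[goal] * obs_model[goal].get(observation, 0.0)
        for goal in belief
    }
    total = sum(unnormalized.values())
    if total == 0:
        # Observation is impossible under the model: keep the prior belief.
        return dict(belief)
    return {goal: value / total for goal, value in unnormalized.items()}


# Hypothetical sweetness goals; not taken from the paper.
GOALS = ["less_sweet", "normal", "very_sweet"]

# Assumed observation model: the recognizer hears the intended word with
# probability 0.86 and each of the two other words with probability 0.07.
OBS_MODEL = {
    goal: {heard: (0.86 if heard == goal else 0.07) for heard in GOALS}
    for goal in GOALS
}

belief = {goal: 1.0 / len(GOALS) for goal in GOALS}   # uniform prior
for heard in ["normal", "very_sweet", "normal"]:       # noisy recognizer output
    belief = update_belief(belief, heard, OBS_MODEL)

best_goal = max(belief, key=belief.get)
print(best_goal, {goal: round(prob, 3) for goal, prob in belief.items()})
```

A dialog policy built on top of this belief could, for example, ask a confirmation question while the highest probability stays below a chosen threshold and act only once it is exceeded.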
Brouwer, Susanne; Van Engen, Kristin J; Calandruccio, Lauren; Bradlow, Ann R
2012-02-01
This study examined whether speech-on-speech masking is sensitive to variation in the degree of similarity between the target and the masker speech. Three experiments investigated whether speech-in-speech recognition varies across different background speech languages (English vs Dutch) for both English and Dutch targets, as well as across variation in the semantic content of the background speech (meaningful vs semantically anomalous sentences), and across variation in listener status vis-à-vis the target and masker languages (native, non-native, or unfamiliar). The results showed that the more similar the target speech is to the masker speech (e.g., same vs different language, same vs different levels of semantic content), the greater the interference on speech recognition accuracy. Moreover, the listener's knowledge of the target and the background language modulate the size of the release from masking. These factors had an especially strong effect on masking effectiveness in highly unfavorable listening conditions. Overall this research provided evidence that the degree of target-masker similarity plays a significant role in speech-in-speech recognition. The results also give insight into how listeners assign their resources differently depending on whether they are listening to their first or second language. © 2012 Acoustical Society of America
Brouwer, Susanne; Van Engen, Kristin J.; Calandruccio, Lauren; Bradlow, Ann R.
2012-01-01
This study examined whether speech-on-speech masking is sensitive to variation in the degree of similarity between the target and the masker speech. Three experiments investigated whether speech-in-speech recognition varies across different background speech languages (English vs Dutch) for both English and Dutch targets, as well as across variation in the semantic content of the background speech (meaningful vs semantically anomalous sentences), and across variation in listener status vis-à-vis the target and masker languages (native, non-native, or unfamiliar). The results showed that the more similar the target speech is to the masker speech (e.g., same vs different language, same vs different levels of semantic content), the greater the interference on speech recognition accuracy. Moreover, the listener’s knowledge of the target and the background language modulate the size of the release from masking. These factors had an especially strong effect on masking effectiveness in highly unfavorable listening conditions. Overall this research provided evidence that the degree of target-masker similarity plays a significant role in speech-in-speech recognition. The results also give insight into how listeners assign their resources differently depending on whether they are listening to their first or second language. PMID:22352516
Language comprehenders retain implied shape and orientation of objects.
Pecher, Diane; van Dantzig, Saskia; Zwaan, Rolf A; Zeelenberg, René
2009-06-01
According to theories of embodied cognition, language comprehenders simulate sensorimotor experiences to represent the meaning of what they read. Previous studies have shown that picture recognition is better if the object in the picture matches the orientation or shape implied by a preceding sentence. In order to test whether strategic imagery may explain previous findings, language comprehenders first read a list of sentences in which objects were mentioned. Only once the complete list had been read was recognition memory tested with pictures. Recognition performance was better if the orientation or shape of the object matched that implied by the sentence, both immediately after reading the complete list of sentences and after a 45-min delay. These results suggest that previously found match effects were not due to strategic imagery and show that details of sensorimotor simulations are retained over longer periods.
Optimizing estimation of hemispheric dominance for language using magnetic source imaging.
Passaro, Antony D; Rezaie, Roozbeh; Moser, Dana C; Li, Zhimin; Dias, Nadeeka; Papanicolaou, Andrew C
2011-10-06
The efficacy of magnetoencephalography (MEG) as an alternative to invasive methods for investigating the cortical representation of language has been explored in several studies. Recently, studies comparing MEG to the gold standard Wada procedure have found inconsistent and often less-than accurate estimates of laterality across various MEG studies. Here we attempted to address this issue among normal right-handed adults (N=12) by supplementing a well-established MEG protocol involving word recognition and the single dipole method with a sentence comprehension task and a beamformer approach localizing neural oscillations. Beamformer analysis of word recognition and sentence comprehension tasks revealed a desynchronization in the 10-18Hz range, localized to the temporo-parietal cortices. Inspection of individual profiles of localized desynchronization (10-18Hz) revealed left hemispheric dominance in 91.7% and 83.3% of individuals during the word recognition and sentence comprehension tasks, respectively. In contrast, single dipole analysis yielded lower estimates, such that activity in temporal language regions was left-lateralized in 66.7% and 58.3% of individuals during word recognition and sentence comprehension, respectively. The results obtained from the word recognition task and localization of oscillatory activity using a beamformer appear to be in line with general estimates of left hemispheric dominance for language in normal right-handed individuals. Furthermore, the current findings support the growing notion that changes in neural oscillations underlie critical components of linguistic processing. Published by Elsevier B.V.
Zhang, Linjun; Li, Yu; Wu, Han; Li, Xin; Shu, Hua; Zhang, Yang; Li, Ping
2016-01-01
Speech recognition by second language (L2) learners in optimal and suboptimal conditions has been examined extensively with English as the target language in most previous studies. This study extended existing experimental protocols (Wang et al., 2013) to investigate Mandarin speech recognition by Japanese learners of Mandarin at two different levels (elementary vs. intermediate) of proficiency. The overall results showed that in addition to L2 proficiency, semantic context, F0 contours, and listening condition all affected the recognition performance on the Mandarin sentences. However, the effects of semantic context and F0 contours on L2 speech recognition diverged to some extent. Specifically, there was significant modulation effect of listening condition on semantic context, indicating that L2 learners made use of semantic context less efficiently in the interfering background than in quiet. In contrast, no significant modulation effect of listening condition on F0 contours was found. Furthermore, there was significant interaction between semantic context and F0 contours, indicating that semantic context becomes more important for L2 speech recognition when F0 information is degraded. None of these effects were found to be modulated by L2 proficiency. The discrepancy in the effects of semantic context and F0 contours on L2 speech recognition in the interfering background might be related to differences in processing capacities required by the two types of information in adverse listening conditions.
How should a speech recognizer work?
Scharenborg, Odette; Norris, Dennis; Bosch, Louis; McQueen, James M
2005-11-12
Although researchers studying human speech recognition (HSR) and automatic speech recognition (ASR) share a common interest in how information processing systems (human or machine) recognize spoken language, there is little communication between the two disciplines. We suggest that this lack of communication follows largely from the fact that research in these related fields has focused on the mechanics of how speech can be recognized. In Marr's (1982) terms, emphasis has been on the algorithmic and implementational levels rather than on the computational level. In this article, we provide a computational-level analysis of the task of speech recognition, which reveals the close parallels between research concerned with HSR and ASR. We illustrate this relation by presenting a new computational model of human spoken-word recognition, built using techniques from the field of ASR that, in contrast to current existing models of HSR, recognizes words from real speech input. 2005 Lawrence Erlbaum Associates, Inc.
ERIC Educational Resources Information Center
ten Holt, G. A.; van Doorn, A. J.; de Ridder, H.; Reinders, M. J. T.; Hendriks, E. A.
2009-01-01
We present the results of an experiment on lexical recognition of human sign language signs in which the available perceptual information about handshape and hand orientation was manipulated. Stimuli were videos of signs from Sign Language of the Netherlands (SLN). The videos were processed to create four conditions: (1) one in which neither…
U.S. Army Research Laboratory (ARL) Corporate Dari Document Transcription and Translation Guidelines
2012-10-01
text file format. 15. SUBJECT TERMS Transcription, Translation, guidelines, ground truth, Optical character recognition , OCR, Machine Translation, MT...foreign language into a target language in order to train, test, and evaluate optical character recognition (OCR) and machine translation (MT) embedded...graphic element and should not be transcribed. Elements that are not part of the primary text such as handwritten annotations or stamps should not be
Visual recognition of permuted words
NASA Astrophysics Data System (ADS)
Rashid, Sheikh Faisal; Shafait, Faisal; Breuel, Thomas M.
2010-02-01
In the current study we examine how letter permutation affects visual recognition of words in two orthographically dissimilar languages, Urdu and German. We present the hypothesis that recognition or reading of permuted and non-permuted words are two distinct mental-level processes, and that people use different strategies in handling permuted words as compared to normal words. A comparison of reading behavior in these two languages is also presented. We present our study in the context of dual-route theories of reading, and we observe that the dual-route theory is consistent with our hypothesis of a distinction in the underlying cognitive behavior for reading permuted and non-permuted words. We conducted three lexical decision experiments to analyze how reading is degraded or affected by letter permutation. We performed analysis of variance (ANOVA), a distribution-free rank test, and a t-test to determine the significance of differences in response-time latencies for the two classes of data. Results showed that recognition accuracy for permuted words decreased by 31% for Urdu and by 11% for German. We also found a considerable difference in reading behavior for cursive and alphabetic languages, and observed that reading Urdu is comparatively slower than reading German due to characteristics of the cursive script.
Parsing and Tagging of Bilingual Dictionary
2003-09-01
LAMP-TR-106 CAR-TR-991 CS-TR-4529 UMIACS-TR-2003-97 PARSING AND TAGGING OF BILINGUAL DICTIONARY Huanfeng Ma1,2, Burcu Karagol-Ayan1,2, David... dictionaries hold great potential as a source of lexical resources for training and testing automated systems for optical character recognition, machine...translation, and cross-language information retrieval. In this paper, we describe a system for extracting term lexicons from printed bilingual dictionaries
Wołk, Agnieszka; Glinkowski, Wojciech
2017-01-01
People with speech, hearing, or mental impairment require special communication assistance, especially for medical purposes. Automatic solutions for speech recognition and voice synthesis from text are poor fits for communication in the medical domain because they are dependent on error-prone statistical models. Systems dependent on manual text input are insufficient. Recently introduced systems for automatic sign language recognition are dependent on statistical models as well as on image and gesture quality. Such systems remain in early development and are based mostly on minimal hand gestures unsuitable for medical purposes. Furthermore, solutions that rely on the Internet cannot be used after disasters that require humanitarian aid. We propose a high-speed, intuitive, Internet-free, voice-free, and text-free tool suited for emergency medical communication. Our solution is a pictogram-based application that provides easy communication for individuals who have speech or hearing impairment or mental health issues that impair communication, as well as foreigners who do not speak the local language. It provides support and clarification in communication by using intuitive icons and interactive symbols that are easy to use on a mobile device. Such pictogram-based communication can be quite effective and ultimately make people's lives happier, easier, and safer. PMID:29230254
Wołk, Krzysztof; Wołk, Agnieszka; Glinkowski, Wojciech
2017-01-01
People with speech, hearing, or mental impairment require special communication assistance, especially for medical purposes. Automatic solutions for speech recognition and voice synthesis from text are poor fits for communication in the medical domain because they are dependent on error-prone statistical models. Systems dependent on manual text input are insufficient. Recently introduced systems for automatic sign language recognition are dependent on statistical models as well as on image and gesture quality. Such systems remain in early development and are based mostly on minimal hand gestures unsuitable for medical purposes. Furthermore, solutions that rely on the Internet cannot be used after disasters that require humanitarian aid. We propose a high-speed, intuitive, Internet-free, voice-free, and text-free tool suited for emergency medical communication. Our solution is a pictogram-based application that provides easy communication for individuals who have speech or hearing impairment or mental health issues that impair communication, as well as foreigners who do not speak the local language. It provides support and clarification in communication by using intuitive icons and interactive symbols that are easy to use on a mobile device. Such pictogram-based communication can be quite effective and ultimately make people's lives happier, easier, and safer.
Critically Engaging with Cultural Representations in Foreign Language Textbooks
ERIC Educational Resources Information Center
McConachy, Troy
2018-01-01
There is currently strong recognition within the field of intercultural language teaching of the need for language learners to develop the ability to actively interpret and critically reflect on cultural meanings and representations from a variety of perspectives. This article argues that cultural representations contained in language textbooks,…
Law, Language, and the Multiethnic State.
ERIC Educational Resources Information Center
De Varennes, Fernand
1996-01-01
Examines why language policies should be considered in a multiethnic state and suggests that there are human rights issues that mandate some recognition of language demands and usage beyond what some states may provide. The article emphasizes that questions of language, ethnicity, and nationalism must be addressed in a rational and coherent…
ERIC Educational Resources Information Center
Xiang, Huadong; Dediu, Dan; Roberts, Leah; van Oort, Erik; Norris, David G.; Hagoort, Peter
2012-01-01
In this article, we report the results of a study on the relationship between individual differences in language learning aptitude and the structural connectivity of language pathways in the adult brain, the first of its kind. We measured four components of language aptitude ("vocabulary learning"; "sound recognition"; "sound-symbol…
ERIC Educational Resources Information Center
US Department of Education, 2010
2010-01-01
The American Speech-Language-Hearing Association, Council on Academic Accreditation in Audiology and Speech-Language Pathology (CAA) is a national accrediting agency of graduate education programs in audiology or speech-language pathology. The CAA currently accredits or preaccredits 319 programs (247 in speech-language pathology and 72 in…
Bilingual Education for Deaf Children in Sweden
ERIC Educational Resources Information Center
Svartholm, Kristina
2010-01-01
In 1981, Swedish Sign Language gained recognition by the Swedish Parliament as the language of deaf people, a decision that made Sweden the first country in the world to give a sign language the status of a language. Swedish was designated as a second language for deaf people, and the need for bilingualism among them was officially asserted. This…
NASA Astrophysics Data System (ADS)
Kostopoulos, S.; Sidiropoulos, K.; Glotsos, D.; Dimitropoulos, N.; Kalatzis, I.; Asvestas, P.; Cavouras, D.
2014-03-01
The aim of this study was to design a pattern recognition system for assisting the diagnosis of breast lesions, using image information from Ultrasound (US) and Digital Mammography (DM) imaging modalities. State-of-the-art computer technology was employed based on commercial Graphics Processing Unit (GPU) cards and parallel programming. An experienced radiologist outlined breast lesions on both US and DM images from 59 patients employing a custom-designed computer software application. Textural features were extracted from each lesion and were used to design the pattern recognition system. Several classifiers were tested for highest performance in discriminating benign from malignant lesions. Classifiers were also combined into ensemble schemes for further improvement of the system's classification accuracy. Following the pattern recognition system optimization, the final system was designed employing the Probabilistic Neural Network (PNN) classifier on the GPU card (GeForce 580GTX) using the CUDA programming framework and the C++ programming language. The use of such state-of-the-art technology renders the system capable of redesigning itself on site once additional verified US and DM data are collected. A mixture of US and DM features optimized performance, with over 90% accuracy in correctly classifying the lesions.
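A Probabilistic Neural Network of the kind named above scores each class by averaging Gaussian kernels centered on that class's training patterns and picks the highest score. The following CPU-only sketch is an illustration under stated assumptions, not the authors' GPU/CUDA implementation: the function name, the two-dimensional feature vectors, and the smoothing parameter `sigma` are hypothetical, and equal class priors are assumed.

```python
import math


def pnn_classify(x, training, sigma=1.0):
    """Classify feature vector x with a Parzen-window (PNN) decision rule.

    training: dict label -> list of feature vectors for that class
    sigma:    Gaussian kernel width (smoothing parameter)
    Returns (predicted_label, per_class_scores).
    """
    scores = {}
    for label, patterns in training.items():
        kernel_sum = 0.0
        for pattern in patterns:
            squared_distance = sum((a - b) ** 2 for a, b in zip(x, pattern))
            kernel_sum += math.exp(-squared_distance / (2.0 * sigma ** 2))
        # Average over the class so larger classes are not favored.
        scores[label] = kernel_sum / len(patterns)
    return max(scores, key=scores.get), scores


# Hypothetical two-feature "textural" vectors; not data from the study.
TRAINING = {
    "benign": [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3]],
    "malignant": [[0.9, 0.8], [1.0, 1.0], [0.8, 0.9]],
}

label, scores = pnn_classify([0.15, 0.2], TRAINING, sigma=0.3)
print(label, scores)
```

Since the classifier only stores training patterns, adding newly verified cases amounts to appending vectors to the relevant class list, which fits the abstract's remark about redesigning the system on site as data accumulate.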
Automated speech understanding: the next generation
NASA Astrophysics Data System (ADS)
Picone, J.; Ebel, W. J.; Deshmukh, N.
1995-04-01
Modern speech understanding systems merge interdisciplinary technologies from Signal Processing, Pattern Recognition, Natural Language, and Linguistics into a unified statistical framework. These systems, which have applications in a wide range of signal processing problems, represent a revolution in Digital Signal Processing (DSP). Once a field dominated by vector-oriented processors and linear algebra-based mathematics, the current generation of DSP-based systems rely on sophisticated statistical models implemented using a complex software paradigm. Such systems are now capable of understanding continuous speech input for vocabularies of several thousand words in operational environments. The current generation of deployed systems, based on small vocabularies of isolated words, will soon be replaced by a new technology offering natural language access to vast information resources such as the Internet, and provide completely automated voice interfaces for mundane tasks such as travel planning and directory assistance.
L2 Word Recognition Research: A Critical Review.
ERIC Educational Resources Information Center
Koda, Keiko
1996-01-01
Explores conceptual syntheses advancing second language (L2) word recognition research and uncovers agendas relating to cross-linguistic examinations of L2 processing. Describes connections between word recognition and reading, overviews the connectionist construct, and illustrates cross-linguistic…
Orthographic Facilitation in Chinese Spoken Word Recognition: An ERP Study
ERIC Educational Resources Information Center
Zou, Lijuan; Desroches, Amy S.; Liu, Youyi; Xia, Zhichao; Shu, Hua
2012-01-01
Orthographic influences in spoken word recognition have been previously examined in alphabetic languages. However, it is unknown whether orthographic information affects spoken word recognition in Chinese, which has a clean dissociation between orthography (O) and phonology (P). The present study investigated orthographic effects using event…
Velan, Hadas; Frost, Ram
2010-01-01
Recent studies suggest that basic effects which are markers of visual word recognition in Indo-European languages cannot be obtained in Hebrew or in Arabic. Although Hebrew has an alphabetic writing system, just like English, French, or Spanish, a series of studies consistently suggested that simple form-orthographic priming, or letter-transposition priming are not found in Hebrew. In four experiments, we tested the hypothesis that this is due to the fact that Semitic words have an underlying structure that constrains the possible alignment of phonemes and their respective letters. The experiments contrasted typical Semitic words which are root-derived, with Hebrew words of non-Semitic origin, which are morphologically simple and resemble base words in European languages. Using RSVP, TL priming, and form-priming manipulations, we show that Hebrew readers process Hebrew words which are morphologically simple similar to the way they process English words. These words indeed reveal the typical form-priming and TL priming effects reported in European languages. In contrast, words with internal structure are processed differently, and require a different code for lexical access. We discuss the implications of these findings for current models of visual word recognition. PMID:21163472
ERIC Educational Resources Information Center
Galloway, Edward A.; Michalek, Gabrielle V.
1995-01-01
Discusses the conversion project of the congressional papers of Senator John Heinz into digital format and the provision of electronic access to these papers by Carnegie Mellon University. Topics include collection background, project team structure, document processing, scanning, use of optical character recognition software, verification…
ERIC Educational Resources Information Center
Sáez, Leilani; Irvin, P. Shawn; Alonzo, Julie; Tindal, Gerald
2012-01-01
In 2006, the easyCBM reading assessment system was developed to support the progress monitoring of phoneme segmenting, letter names and sounds recognition, word reading, passage reading fluency, and comprehension skill development in elementary schools. More recently, the Common Core Standards in English Language Arts have been introduced as a…
ERIC Educational Resources Information Center
Osguthorpe, Russell T.; Li Chang, Linda
1988-01-01
A computerized symbol processor system using an Apple IIe computer and a Power Pad graphics tablet was tested with 22 nonspeaking, multiply disabled students. The students were taught to express themselves independently in writing, and they did significantly better than control students on measures of language comprehension and symbol recognition.…
ERIC Educational Resources Information Center
Quer, Josep
2012-01-01
Despite being minority languages like many others, sign languages have traditionally remained absent from the agendas of policy makers and language planning and policies. In the past two decades, though, this situation has started to change at different paces and to different degrees in several countries. In this article, the author describes the…
ERIC Educational Resources Information Center
Casey, Laura Baylot; Bicard, David F.
2009-01-01
Language development in typically developing children has a very predictable pattern beginning with crying, cooing, babbling, and gestures along with the recognition of spoken words, comprehension of spoken words, and then one word utterances. This predictable pattern breaks down for children with language disorders. This article will discuss…
The Promise of NLP and Speech Processing Technologies in Language Assessment
ERIC Educational Resources Information Center
Chapelle, Carol A.; Chung, Yoo-Ree
2010-01-01
Advances in natural language processing (NLP) and automatic speech recognition and processing technologies offer new opportunities for language testing. Despite their potential uses on a range of language test item types, relatively little work has been done in this area, and it is therefore not well understood by test developers, researchers or…
McCreery, Ryan W.; Walker, Elizabeth A.; Spratford, Meredith; Oleson, Jacob; Bentler, Ruth; Holte, Lenore; Roush, Patricia
2015-01-01
Objectives: Progress has been made in recent years in the provision of amplification and early intervention for children who are hard of hearing. However, children who use hearing aids (HAs) may have inconsistent access to their auditory environment due to limitations in speech audibility through their HAs or limited HA use. The effects of variability in children’s auditory experience on parent-report auditory skills questionnaires and on speech recognition in quiet and in noise were examined for a large group of children who were followed as part of the Outcomes of Children with Hearing Loss study. Design: Parent ratings on auditory development questionnaires and children’s speech recognition were assessed for 306 children who are hard of hearing. Children ranged in age from 12 months to 9 years of age. Three questionnaires involving parent ratings of auditory skill development and behavior were used, including the LittlEARS Auditory Questionnaire, Parents Evaluation of Oral/Aural Performance in Children Rating Scale, and an adaptation of the Speech, Spatial and Qualities of Hearing scale. Speech recognition in quiet was assessed using the Open and Closed set task, Early Speech Perception Test, Lexical Neighborhood Test, and Phonetically-balanced Kindergarten word lists. Speech recognition in noise was assessed using the Computer-Assisted Speech Perception Assessment. Children who are hard of hearing were compared to peers with normal hearing matched for age, maternal educational level and nonverbal intelligence. The effects of aided audibility, HA use and language ability on parent responses to auditory development questionnaires and on children’s speech recognition were also examined. Results: Children who are hard of hearing had poorer performance than peers with normal hearing on parent ratings of auditory skills and had poorer speech recognition. Significant individual variability among children who are hard of hearing was observed. 
Children with greater aided audibility through their HAs, more hours of HA use and better language abilities generally had higher parent ratings of auditory skills and better speech recognition abilities in quiet and in noise than peers with less audibility, more limited HA use or poorer language abilities. In addition to the auditory and language factors that were predictive for speech recognition in quiet, phonological working memory was also a positive predictor for word recognition abilities in noise. Conclusions Children who are hard of hearing continue to experience delays in auditory skill development and speech recognition abilities compared to peers with normal hearing. However, significant improvements in these domains have occurred in comparison to similar data reported prior to the adoption of universal newborn hearing screening and early intervention programs for children who are hard of hearing. Increasing the audibility of speech has a direct positive effect on auditory skill development and speech recognition abilities, and may also enhance these skills by improving language abilities in children who are hard of hearing. Greater number of hours of HA use also had a significant positive impact on parent ratings of auditory skills and children’s speech recognition. PMID:26731160
The Temporal Structure of Spoken Language Understanding.
ERIC Educational Resources Information Center
Marslen-Wilson, William; Tyler, Lorraine Komisarjevsky
1980-01-01
An investigation of word-by-word time-course of spoken language understanding focused on word recognition and structural and interpretative processes. Results supported an online interactive language processing theory, in which lexical, structural, and interpretative knowledge sources communicate and interact during processing efficiently and…
Whole Language in the Play Store.
ERIC Educational Resources Information Center
Fields, Marjorie V.; Hillstead, Deborah V.
1990-01-01
The concept of whole language instruction is explained by means of examples from a kindergarten unit on the grocery store. Activities include visiting the supermarket, making stone soup, and creating a grocery store. Activities teach reading, writing, oral language, phonics, and word recognition. (DG)
Advances to the development of a basic Mexican sign-to-speech and text language translator
NASA Astrophysics Data System (ADS)
Garcia-Bautista, G.; Trujillo-Romero, F.; Diaz-Gonzalez, G.
2016-09-01
Sign Language (SL) is the basic alternative communication method among deaf people. However, most hearing people have trouble understanding SL, which makes communication with deaf people almost impossible and excludes them from daily activities. In this work we present an automatic, basic, real-time sign language translator capable of recognizing a basic list of Mexican Sign Language (MSL) signs comprising 10 meaningful words, letters (A-Z) and numbers (1-10) and translating them into speech and text. The signs were collected from a group of 35 MSL signers performing in front of a Microsoft Kinect™ Sensor. The hand gesture recognition system uses the RGB-D camera to build and store data point clouds, color and skeleton tracking information. In this work we propose a method to obtain the representative hand trajectory pattern information. We use the Euclidean Segmentation method to obtain the hand shape and Hierarchical Centroid as the feature extraction method for images of numbers and letters. A pattern recognition method based on a Back Propagation Artificial Neural Network (ANN) is used to interpret the hand gestures. Finally, we use the K-Fold Cross Validation method for the training and testing stages. Our results achieve an accuracy of 95.71% on words, 98.57% on numbers and 79.71% on letters. In addition, an interactive user interface was designed to present the results in voice and text format.
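The K-Fold Cross Validation step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state the value of K or how samples were partitioned, so the function name `k_fold_indices`, the contiguous fold layout, and the sample counts below are assumptions made only for illustration.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds.

    Hypothetical helper: the abstract does not describe the actual split.
    """
    if not 1 < k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Example: 10 samples split into 5 folds; each sample is tested exactly once.
folds = list(k_fold_indices(10, 5))
```

Each sample appears in exactly one test fold, so every recorded sign contributes to both training and testing across the K rounds.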
Changes in Visual Object Recognition Precede the Shape Bias in Early Noun Learning
Yee, Meagan; Jones, Susan S.; Smith, Linda B.
2012-01-01
Two of the most formidable skills that characterize human beings are language and our prowess in visual object recognition. They may also be developmentally intertwined. Two experiments, a large sample cross-sectional study and a smaller sample 6-month longitudinal study of 18- to 24-month-olds, tested a hypothesized developmental link between changes in visual object representation and noun learning. Previous findings in visual object recognition indicate that children’s ability to recognize common basic level categories from sparse structural shape representations of object shape emerges between the ages of 18 and 24 months, is related to noun vocabulary size, and is lacking in children with language delay. Other research shows in artificial noun learning tasks that during this same developmental period, young children systematically generalize object names by shape, that this shape bias predicts future noun learning, and is lacking in children with language delay. The two experiments examine the developmental relation between visual object recognition and the shape bias for the first time. The results show that developmental changes in visual object recognition systematically precede the emergence of the shape bias. The results suggest a developmental pathway in which early changes in visual object recognition that are themselves linked to category learning enable the discovery of higher-order regularities in category structure and thus the shape bias in novel noun learning tasks. The proposed developmental pathway has implications for understanding the role of specific experience in the development of both visual object recognition and the shape bias in early noun learning. PMID:23227015
Huysmans, Elke; Bolk, Elske; Zekveld, Adriana A; Festen, Joost M; de Groot, Annette M B; Goverts, S Theo
2016-01-01
The authors first examined the influence of moderate to severe congenital hearing impairment (CHI) on the correctness of samples of elicited spoken language. Then, the authors used this measure as an indicator of linguistic proficiency and examined its effect on performance in language reception, independent of bottom-up auditory processing. In groups of adults with normal hearing (NH, n = 22), acquired hearing impairment (AHI, n = 22), and moderate to severe CHI (n = 21), the authors assessed linguistic proficiency by analyzing the morphosyntactic correctness of their spoken language production. Language reception skills were examined with a task for masked sentence recognition in the visual domain (text), at a readability level of 50%, using grammatically correct sentences and sentences with distorted morphosyntactic cues. The actual performance on the tasks was compared between groups. Adults with CHI made more morphosyntactic errors in spoken language production than adults with NH, while no differences were observed between the AHI and NH group. This outcome pattern sustained when comparisons were restricted to subgroups of AHI and CHI adults, matched for current auditory speech reception abilities. The data yielded no differences between groups in performance in masked text recognition of grammatically correct sentences in a test condition in which subjects could fully take advantage of their linguistic knowledge. Also, no difference between groups was found in the sensitivity to morphosyntactic distortions when processing short masked sentences, presented visually. These data showed that problems with the correct use of specific morphosyntactic knowledge in spoken language production are a long-term effect of moderate to severe CHI, independent of current auditory processing abilities. 
However, moderate to severe CHI generally does not impede performance in masked language reception in the visual modality, as measured in this study with short, degraded sentences. Aspects of linguistic proficiency that are affected by CHI thus do not seem to play a role in masked sentence recognition in the visual modality.
Language Model Combination and Adaptation Using Weighted Finite State Transducers
NASA Technical Reports Server (NTRS)
Liu, X.; Gales, M. J. F.; Hieronymus, J. L.; Woodland, P. C.
2010-01-01
In speech recognition systems, language models (LMs) are often constructed by training and combining multiple n-gram models. They can be used either to represent different genres or tasks found in diverse text sources, or to capture stochastic properties of different linguistic symbol sequences, for example, syllables and words. Unsupervised LM adaptation may also be used to further improve robustness to varying styles or tasks. When using these techniques, extensive software changes are often required. In this paper an alternative and more general approach based on weighted finite state transducers (WFSTs) is investigated for LM combination and adaptation. As it is entirely based on well-defined WFST operations, minimum change to decoding tools is needed. A wide range of LM combination configurations can be flexibly supported. An efficient on-the-fly WFST decoding algorithm is also proposed. Significant error rate gains of 7.3% relative were obtained on a state-of-the-art broadcast audio recognition task using a history dependently adapted multi-level LM modelling both syllable and word sequences.
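As background for the idea of combining multiple n-gram models, the sketch below shows plain linear interpolation of component LM probabilities. This is a common combination scheme and not the WFST-based method the paper proposes; the toy bigram tables, the weights, and the function name `interpolate` are all hypothetical.

```python
def interpolate(models, weights, history, word):
    """Return sum_i w_i * P_i(word | history) for a list of component LMs.

    Each model is a dict mapping (history, word) to a probability.
    Illustrative only: not the WFST formulation described in the abstract.
    """
    if len(models) != len(weights):
        raise ValueError("need one weight per model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("interpolation weights must sum to 1")
    return sum(w * m.get((history, word), 0.0) for m, w in zip(models, weights))


# Toy component models for two genres (made-up numbers).
news_lm = {(("the",), "market"): 0.20, (("the",), "game"): 0.05}
sports_lm = {(("the",), "market"): 0.02, (("the",), "game"): 0.30}

p_game = interpolate([news_lm, sports_lm], [0.7, 0.3], ("the",), "game")
```

Adapting the weights to the audio being decoded is one simple form of LM adaptation; the paper's contribution is expressing such combinations through WFST operations so that decoders need little modification.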
The role of voice input for human-machine communication.
Cohen, P R; Oviatt, S L
1995-01-01
Optimism is growing that the near future will witness rapid growth in human-computer interaction using voice. System prototypes have recently been built that demonstrate speaker-independent real-time speech recognition, and understanding of naturally spoken utterances with vocabularies of 1000 to 2000 words, and larger. Already, computer manufacturers are building speech recognition subsystems into their new product lines. However, before this technology can be broadly useful, a substantial knowledge base is needed about human spoken language and performance during computer-based spoken interaction. This paper reviews application areas in which spoken interaction can play a significant role, assesses potential benefits of spoken interaction with machines, and compares voice with other modalities of human-computer interaction. It also discusses information that will be needed to build a firm empirical foundation for the design of future spoken and multimodal interfaces. Finally, it argues for a more systematic and scientific approach to investigating spoken input and performance with future language technology. PMID:7479803
Translation and Foreign Language Teaching.
ERIC Educational Resources Information Center
Tinsley, Royal L., Jr.
Translators and teachers of foreign languages need each other: translators need formal academic training and recognition and teachers of foreign languages need students. Unfortunately, translators know only too well that most FL teachers are not competent translators, and FL Departments generally consider translation as an activity beneath the…
Word recognition and phonetic structure acquisition: Possible relations
NASA Astrophysics Data System (ADS)
Morgan, James
2002-05-01
Several accounts of possible relations between the emergence of the mental lexicon and acquisition of native language phonological structure have been propounded. In one view, acquisition of word meanings guides infants' attention toward those contrasts that are linguistically significant in their language. In the opposing view, native language phonological categories may be acquired from statistical patterns of input speech, prior to and independent of learning at the lexical level. Here, a more interactive account will be presented, in which phonological structure is modeled as emerging consequentially from the self-organization of perceptual space underlying word recognition. A key prediction of this model is that early native language phonological categories will be highly context specific. Data bearing on this prediction will be presented which provide clues to the nature of infants' statistical analysis of input.
Marchman, Virginia A.; Fernald, Anne; Hurtado, Nereyda
2010-01-01
Research using online comprehension measures with monolingual children shows that speed and accuracy of spoken word recognition are correlated with lexical development. Here we examined speech processing efficiency in relation to vocabulary development in bilingual children learning both Spanish and English (n=26; 2;6 yrs). Between-language associations were weak: vocabulary size in Spanish was uncorrelated with vocabulary in English, and children’s facility in online comprehension in Spanish was unrelated to their facility in English. Instead, efficiency of online processing in one language was significantly related to vocabulary size in that language, after controlling for processing speed and vocabulary size in the other language. These links between efficiency of lexical access and vocabulary knowledge in bilinguals parallel those previously reported for Spanish and English monolinguals, suggesting that children’s ability to abstract information from the input in building a working lexicon relates fundamentally to mechanisms underlying the construction of language. PMID:19726000
An L2 Reader's Word-Recognition Strategies: Transferred or Developed
ERIC Educational Resources Information Center
Alco, Bonnie
2010-01-01
Transfer of reading strategies from the first language (L1) to the second language (L2) has long puzzled educators, but what happens if the L1 is an alphabet language and the second is not, or if there is a mismatch in the languages' grapheme-phoneme connection? Although some students readily adjust to reading and writing in their second language,…
Analyzing handwriting biometrics in metadata context
NASA Astrophysics Data System (ADS)
Scheidat, Tobias; Wolf, Franziska; Vielhauer, Claus
2006-02-01
In this article, methods for user recognition by online handwriting are experimentally analyzed using a combination of demographic data of users in relation to their handwriting habits. Online handwriting as a biometric method is characterized by high variation in its characteristics, which influences the reliability and security of this method. These variations have not been researched in detail so far. Especially in cross-cultural applications, it is urgent to reveal the impact of personal background on security aspects in biometrics. Metadata represent the background of writers by introducing cultural, biological and conditional (changing) aspects such as first language, country of origin, gender, handedness, and experiences that influence handwriting and language skills. The goal is to reveal intercultural impacts on handwriting in order to achieve higher security in biometric systems. In our experiments, in order to achieve a relatively high coverage, 48 different handwriting tasks accomplished by 47 users from three countries (Germany, India and Italy) have been investigated with respect to the relations of metadata and biometric recognition performance. For this purpose, hypotheses have been formulated and evaluated using well-known recognition error rates from biometrics. The evaluation addressed both system reliability and security threats from skilled forgeries. For the latter purpose, a novel forgery type is introduced, which applies the personal metadata to security aspects and includes new methods of security tests. Finally, we formulate recommendations for specific user groups and handwriting samples.
Event Recognition Based on Deep Learning in Chinese Texts
Zhang, Yajun; Liu, Zongtian; Zhou, Wen
2016-01-01
Event recognition is the most fundamental and critical task in event-based natural language processing systems. Existing event recognition methods based on rules and shallow neural networks have certain limitations. For example, extracting features using methods based on rules is difficult; methods based on shallow neural networks converge too quickly to a local minimum, resulting in low recognition precision. To address these problems, we propose the Chinese emergency event recognition model based on deep learning (CEERM). Firstly, we use a word segmentation system to segment sentences. According to event elements labeled in the CEC 2.0 corpus, we classify words into five categories: trigger words, participants, objects, time and location. Each word is vectorized according to the following six feature layers: part of speech, dependency grammar, length, location, distance between trigger word and core word and trigger word frequency. We obtain deep semantic features of words by training a feature vector set using a deep belief network (DBN), then analyze those features in order to identify trigger words by means of a back propagation neural network. Extensive testing shows that the CEERM achieves excellent recognition performance, with a maximum F-measure value of 85.17%. Moreover, we propose the dynamic-supervised DBN, which adds supervised fine-tuning to a restricted Boltzmann machine layer by monitoring its training performance. Test analysis reveals that the new DBN improves recognition performance and effectively controls the training time. Although the F-measure increases to 88.11%, the training time increases by only 25.35%. PMID:27501231
Targeted Help for Spoken Dialogue Systems: Intelligent Feedback Improves Naive Users' Performance
NASA Technical Reports Server (NTRS)
Hockey, Beth Ann; Lemon, Oliver; Campana, Ellen; Hiatt, Laura; Aist, Gregory; Hieronymous, Jim; Gruenstein, Alexander; Dowding, John
2003-01-01
We present experimental evidence that providing naive users of a spoken dialogue system with immediate help messages related to their out-of-coverage utterances improves their success in using the system. A grammar-based recognizer and a Statistical Language Model (SLM) recognizer are run simultaneously. If the grammar-based recognizer succeeds, the less accurate SLM recognizer hypothesis is not used. When the grammar-based recognizer fails and the SLM recognizer produces a recognition hypothesis, this result is used by the Targeted Help agent to give the user feedback on what was recognized, a diagnosis of what was problematic about the utterance, and a related in-coverage example. The in-coverage example is intended to encourage alignment between user inputs and the language model of the system. We report on controlled experiments on a spoken dialogue system for command and control of a simulated robotic helicopter.
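The two-recognizer arrangement described in this abstract can be sketched as a simple fallback rule. Everything concrete below is hypothetical: the command set, the stand-in recognizers (which operate on text rather than audio), and the wording of the help message are placeholders, not the system reported by the authors.

```python
# Hypothetical in-coverage commands for a toy helicopter dialogue system.
IN_COVERAGE = {"fly to the tower", "land now", "take off"}


def grammar_recognizer(utterance):
    """Stand-in grammar-based recognizer: accepts only in-coverage commands."""
    text = utterance.lower().strip()
    return text if text in IN_COVERAGE else None


def slm_recognizer(utterance):
    """Stand-in SLM recognizer: returns a transcript for any non-empty input."""
    text = utterance.lower().strip()
    return text or None


def handle(utterance):
    """Prefer the grammar result; otherwise use the SLM result to build help."""
    hyp = grammar_recognizer(utterance)
    if hyp is not None:
        # Grammar succeeded, so the SLM hypothesis is not used.
        return {"source": "grammar", "hypothesis": hyp, "help": None}
    hyp = slm_recognizer(utterance)
    if hyp is not None:
        example = sorted(IN_COVERAGE)[0]  # placeholder for a *related* example
        help_msg = f"I heard '{hyp}', which I cannot act on. Try something like '{example}'."
        return {"source": "slm", "hypothesis": hyp, "help": help_msg}
    return {"source": None, "hypothesis": None, "help": "Sorry, I did not catch that."}
```

In the described system the help message also carries a diagnosis of what was problematic and an example chosen to be related to the utterance; the sketch only shows where that message would be produced.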
Synthesis of Common Arabic Handwritings to Aid Optical Character Recognition Research.
Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-Etriby, Sherif
2016-03-11
Document analysis tasks such as pattern recognition, word spotting or segmentation require comprehensive databases for training and validation. Not only variations in writing style but also the list of words used is of importance when training samples should reflect the input of a specific area of application. However, generation of training samples is expensive in terms of manpower and time, particularly if complete text pages including complex ground truth are required. This is why there is a lack of such databases, especially for Arabic, the second most popular language. However, Arabic handwriting recognition involves different preprocessing, segmentation and recognition methods. Each requires particular ground truth or samples to enable optimal training and validation, which are often not covered by the currently available databases. To overcome this issue, we propose a system that synthesizes Arabic handwritten words and text pages and generates corresponding detailed ground truth. We use these syntheses to validate a new, segmentation-based system that recognizes handwritten Arabic words. We found that a modification of an Active Shape Model based character classifier, which we proposed earlier, improves the word recognition accuracy. Further improvements are achieved by using a vocabulary of the 50,000 most common Arabic words for error correction. PMID:26978368
Neural Correlates of Human Action Observation in Hearing and Deaf Subjects
Corina, David; Chiu, Yi-Shiuan; Knapp, Heather; Greenwald, Ralf; Jose-Robertson, Lucia San; Braun, Allen
2007-01-01
Accumulating evidence has suggested the existence of a human action recognition system involving inferior frontal, parietal, and superior temporal regions that may participate in both the perception and execution of actions. However, little is known about the specificity of this system in response to different forms of human action. Here we present data from PET neuroimaging studies from passive viewing of three distinct action types, intransitive self-oriented actions (e.g., stretching, rubbing one’s eyes, etc.), transitive object-oriented actions (e.g., opening a door, lifting a cup to the lips to drink), and the abstract, symbolic actions–signs used in American Sign Language. Our results show that these different classes of human actions engage a frontal/parietal/STS human action recognition system in a highly similar fashion. However, the results indicate that this neural consistency across motion classes is true primarily for hearing subjects. Data from deaf signers show a non-uniform response to different classes of human actions. As expected, deaf signers engaged left-hemisphere perisylvian language areas during the perception of signed language signs. Surprisingly, these subjects did not engage the expected frontal/parietal/STS circuitry during passive viewing of non-linguistic actions, but rather reliably activated middle-occipital temporal-ventral regions which are known to participate in the detection of human bodies, faces, and movements. Comparisons with data from hearing subjects establish statistically significant contributions of middle-occipital temporal-ventral regions during the processing of non-linguistic actions in deaf signers. These results suggest that during human motion processing, deaf individuals may engage specialized neural systems that allow for rapid, online differentiation of meaningful linguistic actions from non-linguistic human movements. PMID:17459349
ERIC Educational Resources Information Center
Watson, Enid; Finkelstein, Norma; Gurewich, Deborah; Morse, Barbara
2011-01-01
Prenatal alcohol exposure can result in fetal alcohol spectrum disorders (FASD), which can include physical and neurobehavioral disorders, including cognitive, social, language, and motor impairments that can persist throughout life. In order for children with FASD to receive the full benefit of services, recognition of their disability needs to…
Empowering Nigerian Pidgin: A Challenge for Status Planning?
ERIC Educational Resources Information Center
Igboanusi, Herbert
2008-01-01
In spite of the fact that Nigerian Pidgin (NP) is probably the language with the highest population of users in Nigeria, it does not enjoy official recognition and is excluded from the education system. It lacks prestige because it is seen by many Nigerians as a "bad" form of English and associated with a socially deprived set of people.…
Implementation of the Intelligent Voice System for Kazakh
NASA Astrophysics Data System (ADS)
Yessenbayev, Zh; Saparkhojayev, N.; Tibeyev, T.
2014-04-01
Modern speech technologies are highly advanced and widely used in day-to-day applications. However, this mostly concerns the languages of well-developed countries, such as English, German, Japanese, Russian, etc. As for Kazakh, the situation is less advanced and research in this field is only starting to evolve. In this research and application-oriented project, we introduce an intelligent voice system for the fast deployment of call-centers and information desks supporting Kazakh speech. The demand for such a system is obvious if the country's large size and small population are considered. Landline and cell phones become the only means of communication for distant villages and suburbs. The system features Kazakh speech recognition and synthesis modules as well as a web-GUI for efficient dialog management. For speech recognition we use the CMU Sphinx engine and for speech synthesis, MaryTTS. The web-GUI is implemented in Java, enabling operators to quickly create and manage the dialogs in a user-friendly graphical environment. The call routines are handled by Asterisk PBX and JBoss Application Server. The system supports such technologies and protocols as VoIP, VoiceXML, FastAGI, Java SpeechAPI and J2EE. For the speech recognition experiments we compiled and used the first Kazakh speech corpus with utterances from 169 native speakers. The performance of the speech recognizer is 4.1% WER on isolated word recognition and 6.9% WER on clean continuous speech recognition tasks. The speech synthesis experiments include the training of male and female voices.
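The WER (word error rate) figures quoted above are conventionally computed as the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below shows that standard calculation; the function name and the example strings are illustrative, and the abstract does not describe the scoring tool the authors actually used.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed ref prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference word
                            curr[j - 1] + 1,      # insertion of a hypothesis word
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
wer = word_error_rate("a b c d", "a x c")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.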
An early illness recognition framework using a temporal Smith Waterman algorithm and NLP.
Hajihashemi, Zahra; Popescu, Mihail
2013-01-01
In this paper we propose a framework for detecting health patterns based on non-wearable sensor sequence similarity and natural language processing (NLP). In TigerPlace, an aging-in-place facility in Columbia, MO, we deployed 47 sensor networks together with a nursing electronic health record (EHR) system to provide early illness recognition. The proposed framework utilizes sensor sequence similarity and NLP on EHR nursing comments to automatically notify the physician when health problems are detected. The reported methodology is inspired by genomic sequence annotation using similarity algorithms such as Smith Waterman (SW). Similarly, for each sensor sequence, we associate health concepts extracted from the nursing notes using MetaMap, an NLP tool provided by the Unified Medical Language System (UMLS). Since sensor sequences, unlike genomic ones, have an associated time dimension, we propose a temporal variant of SW (TSW) to account for time. The main challenges presented by our framework are finding the most suitable time sequence similarity and aggregating the retrieved UMLS concepts. On a pilot dataset from three TigerPlace residents, with a total of 1685 sensor days and 626 nursing records, we obtained an average precision of 0.64 and a recall of 0.37.
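For readers unfamiliar with the Smith Waterman algorithm that inspired this framework, the sketch below computes the classic local alignment score between two symbol sequences. It is the standard algorithm with a linear gap penalty, not the temporal variant (TSW) proposed in the paper, whose details the abstract does not give; the scoring values and the example sequences are assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between sequences a and b (classic SW)."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] is the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere (local alignment).
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best


# Sequences can be strings or lists of discretized sensor readings.
score = smith_waterman_score("ABCD", "XBCY")
```

A higher score means the two sequences share a longer or cleaner common stretch, which is the notion of similarity used to match a new sensor sequence against previously annotated ones.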
The Role of Native-Language Knowledge in the Perception of Casual Speech in a Second Language
Mitterer, Holger; Tuinman, Annelie
2012-01-01
Casual speech processes, such as /t/-reduction, make word recognition harder. Additionally, word recognition is also harder in a second language (L2). Combining these challenges, we investigated whether L2 learners have recourse to knowledge from their native language (L1) when dealing with casual speech processes in their L2. In three experiments, production and perception of /t/-reduction was investigated. An initial production experiment showed that /t/-reduction occurred in both languages and patterned similarly in proper nouns but differed when /t/ was a verbal inflection. Two perception experiments compared the performance of German learners of Dutch with that of native speakers for nouns and verbs. Mirroring the production patterns, German learners’ performance strongly resembled that of native Dutch listeners when the reduced /t/ was part of a word stem, but deviated where /t/ was a verbal inflection. These results suggest that a casual speech process in a second language is problematic for learners when the process is not known from the learner’s native language, similar to what has been observed for phoneme contrasts. PMID:22811675
Cognitive aging and hearing acuity: modeling spoken language comprehension.
Wingfield, Arthur; Amichetti, Nicole M; Lash, Amanda
2015-01-01
The comprehension of spoken language has been characterized by a number of "local" theories that have focused on specific aspects of the task: models of word recognition, models of selective attention, accounts of thematic role assignment at the sentence level, and so forth. The ease of language understanding (ELU) model (Rönnberg et al., 2013) stands as one of the few attempts to offer a fully encompassing framework for language understanding. In this paper we discuss interactions between perceptual, linguistic, and cognitive factors in spoken language understanding. Central to our presentation is an examination of aspects of the ELU model that apply especially to spoken language comprehension in adult aging, where speed of processing, working memory capacity, and hearing acuity are often compromised. We discuss, in relation to the ELU model, conceptions of working memory and its capacity limitations, the use of linguistic context to aid in speech recognition and the importance of inhibitory control, and language comprehension at the sentence level. Throughout this paper we offer a constructive look at the ELU model; where it is strong and where there are gaps to be filled.
[Information technology in learning sign language].
Hernández, Cesar; Pulido, Jose L; Arias, Jorge E
2015-01-01
The objective was to develop a technological tool that improves the initial learning of sign language in hearing impaired children. The research was conducted in three phases: requirements gathering, design and development of the proposed device, and validation and evaluation of the device. Through the use of information technology and with the advice of special education professionals, we were able to develop an electronic device that facilitates the learning of sign language in deaf children. It consists mainly of a graphic touch screen, a voice synthesizer, and a voice recognition system. Validation was performed with the deaf children in the Filadelfia School of the city of Bogotá. A learning methodology was established that improves learning times through a small, portable, lightweight, and educational technological prototype. Tests showed the effectiveness of this prototype, achieving a 32% reduction in the initial learning time for sign language in deaf children.
Incorporating Speech Recognition into a Natural User Interface
NASA Technical Reports Server (NTRS)
Chapa, Nicholas
2017-01-01
The Augmented/Virtual Reality (AVR) Lab has been working to study the applicability of recent virtual and augmented reality hardware and software to KSC operations. This includes the Oculus Rift, HTC Vive, Microsoft HoloLens, and Unity game engine. My project in this lab is to integrate voice recognition and voice commands into an easy-to-modify system that can be added to an existing portion of a Natural User Interface (NUI). A NUI is an intuitive and simple-to-use interface incorporating visual, touch, and speech recognition. The inclusion of speech recognition capability will allow users to perform actions or make inquiries using only their voice. The simplicity of needing only to speak to control an on-screen object or enact some digital action means that any user can quickly become accustomed to using this system. Multiple programs were tested for use in a speech command and recognition system. Sphinx4 translates speech to text using a Hidden Markov Model (HMM) based Language Model, an Acoustic Model, and a word Dictionary running on Java. PocketSphinx had similar functionality to Sphinx4 but instead ran on C. However, neither of these programs was ideal, as building a Java or C wrapper slowed performance. The most suitable speech recognition system tested was the Unity Engine Grammar Recognizer. A Context Free Grammar (CFG) structure is written in an XML file to specify the structure of phrases and words that will be recognized by the Unity Grammar Recognizer. Using Speech Recognition Grammar Specification (SRGS) 1.0 makes modifying the recognized combinations of words and phrases very simple and quick to do. With SRGS 1.0, semantic information can also be added to the XML file, which allows for even more control over how spoken words and phrases are interpreted by Unity. Additionally, using a CFG with SRGS 1.0 produces Finite State Machine (FSM) functionality, limiting the potential for incorrectly heard words or phrases.
The purpose of my project was to investigate options for a Speech Recognition System. To that end, I attempted to integrate Sphinx4 into a user interface. Sphinx4 had great accuracy and is the only free program able to perform offline speech dictation. However, it had a limited dictionary of words that could be recognized, single-syllable words were almost impossible for it to hear, and since it ran on Java it could not be integrated into the Unity-based NUI. PocketSphinx ran much faster than Sphinx4, which would have made it ideal as a plugin to the Unity NUI; unfortunately, creating a C# wrapper for the C code made the program unusable with Unity because the wrapper slowed code execution and class files became unreachable. Unity Grammar Recognizer is the ideal speech recognition interface: it is flexible in recognizing multiple variations of the same command. It is also the most accurate program in recognizing speech because it uses an XML grammar to specify speech structure instead of relying solely on a Dictionary and Language model. The Unity Grammar Recognizer will be used with the NUI for these reasons and because it is written in C#, which further simplifies the incorporation.
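To make the grammar idea in the abstract above more concrete, the following is a minimal sketch, not the project's code and not Unity's API. It parses a small SRGS 1.0 XML grammar with Python's standard library and accepts only utterances the grammar can generate, which is the finite-state behaviour the abstract describes. The element names (`grammar`, `rule`, `one-of`, `item`, `ruleref`) come from the SRGS specification; the rule ids and vocabulary are hypothetical, and the sketch assumes a non-recursive grammar with no repeat counts, weights, or semantic tags.

```python
import itertools
import xml.etree.ElementTree as ET

NS = "{http://www.w3.org/2001/06/grammar}"  # SRGS 1.0 XML namespace

# Hypothetical command grammar; rule ids and words are illustrative only.
SRGS = """<grammar version="1.0" xml:lang="en-US" root="command"
         xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="command">
    <ruleref uri="#action"/>
    <item>the</item>
    <ruleref uri="#target"/>
  </rule>
  <rule id="action">
    <one-of><item>open</item><item>close</item></one-of>
  </rule>
  <rule id="target">
    <one-of><item>valve</item><item>hatch</item></one-of>
  </rule>
</grammar>"""


def expand(node, rules):
    """Return every word sequence (as tuples) that `node` can produce.

    Assumes the grammar is non-recursive; a recursive rule would loop forever.
    """
    tag = node.tag.replace(NS, "")
    if tag == "ruleref":
        return expand(rules[node.get("uri").lstrip("#")], rules)
    if tag == "one-of":
        # Alternatives: union of what each child can produce.
        return [seq for child in node for seq in expand(child, rules)]
    # <rule> and <item>: concatenate literal text and children in order.
    parts = []
    if node.text and node.text.strip():
        parts.append([tuple(node.text.split())])
    for child in node:
        parts.append(expand(child, rules))
        if child.tail and child.tail.strip():
            parts.append([tuple(child.tail.split())])
    return [sum(combo, ()) for combo in itertools.product(*parts)]


def load_phrases(srgs_xml):
    """Enumerate all phrases generated by the grammar's root rule."""
    root = ET.fromstring(srgs_xml)
    rules = {r.get("id"): r for r in root.iter(NS + "rule")}
    return {" ".join(seq) for seq in expand(rules[root.get("root")], rules)}


PHRASES = load_phrases(SRGS)


def recognize(utterance):
    """Return the normalized phrase if the grammar generates it, else None."""
    text = " ".join(utterance.lower().split())
    return text if text in PHRASES else None
```

Because every accepted phrase must be derivable from the root rule, an utterance such as "open a valve" is rejected outright rather than mapped to a near match, which illustrates why a grammar can reduce misheard commands compared with open dictation.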
2006-10-01
Hierarchy of Pre-Processing Techniques 3. NLP (Natural Language Processing) Utilities 3.1 Named-Entity Recognition 3.1.1 Example for Named-Entity... Recognition 3.2 Symbol RemovalN-Gram Identification: Bi-Grams 4. Stemming 4.1 Stemming Example 5. Delete List 5.1 Open a Delete List 5.1.1 Small...iterative and involves several key processes: • Named-Entity Recognition Named-Entity Recognition is an Automap feature that allows you to
Characterising receptive language processing in schizophrenia using word and sentence tasks.
Tan, Eric J; Yelland, Gregory W; Rossell, Susan L
2016-01-01
Language dysfunction is proposed to relate to the speech disturbances in schizophrenia, which are more commonly referred to as formal thought disorder (FTD). Presently, language production deficits in schizophrenia are better characterised than language comprehension difficulties. This study thus aimed to examine three aspects of language comprehension in schizophrenia: (1) the role of lexical processing, (2) meaning attribution for words and sentences, and (3) the relationship between comprehension and production. Fifty-seven schizophrenia/schizoaffective disorder patients and 48 healthy controls completed a clinical assessment and three language tasks assessing word recognition, synonym identification, and sentence comprehension. Poorer patient performance was expected on the latter two tasks. Recognition of word form was not impaired in schizophrenia, indicating intact lexical processing. Whereas single-word synonym identification was not significantly impaired, there was a tendency to attribute word meanings based on phonological similarity with increasing FTD severity. Importantly, there was a significant sentence comprehension deficit for processing deep structure, which correlated with FTD severity. These findings established a receptive language deficit in schizophrenia at the syntactic level. There was also evidence for a relationship between some aspects of language comprehension and speech production/FTD. Apart from indicating language as another mechanism in FTD aetiology, the data also suggest that remediating language comprehension problems may be an avenue to pursue in alleviating FTD symptomatology.
Linguistic Corpora and Language Teaching.
ERIC Educational Resources Information Center
Murison-Bowie, Simon
1996-01-01
Examines issues raised by corpus linguistics concerning the description of language. The article argues that it is necessary to start from correct descriptions of linguistic units and the contexts in which they occur. Corpus linguistics has joined with language teaching by sharing a recognition of the importance of a larger, schematic view of…
Our Perception of Woman as Determined by Language.
ERIC Educational Resources Information Center
Ayim, Maryann
Recognition of gender as a significant factor in the social parameters of language is a very recent phenomenon. The external aspects of language as they relate to sexism have social and political ramifications. Using Peirce's definition of sign, which encompasses the representation, the object, and its interpretation, sexually stereotypic language…
One Hundred Years of Esperanto: A Survey.
ERIC Educational Resources Information Center
Tonkin, Humphrey
1987-01-01
The history of Esperanto, a language created to promote international communication, is chronicled. Discussion focuses on the origins and development of the language, early attitudes toward its adoption, patterns of use and recognition, the development of a literature in and of Esperanto, the culture associated with the language. (Author/MSE)
Early Sign Language Exposure and Cochlear Implantation Benefits.
Geers, Ann E; Mitchell, Christine M; Warner-Czyz, Andrea; Wang, Nae-Yuh; Eisenberg, Laurie S
2017-07-01
Most children with hearing loss who receive cochlear implants (CI) learn spoken language, and parents must choose early on whether to use sign language to accompany speech at home. We address whether parents' use of sign language before and after CI positively influences auditory-only speech recognition, speech intelligibility, spoken language, and reading outcomes. Three groups of children with CIs from a nationwide database who differed in the duration of early sign language exposure provided in their homes were compared in their progress through elementary grades. The groups did not differ in demographic, auditory, or linguistic characteristics before implantation. Children without early sign language exposure achieved better speech recognition skills over the first 3 years postimplant and exhibited a statistically significant advantage in spoken language and reading near the end of elementary grades over children exposed to sign language. Over 70% of children without sign language exposure achieved age-appropriate spoken language compared with only 39% of those exposed for 3 or more years. Early speech perception predicted speech intelligibility in middle elementary grades. Children without sign language exposure produced speech that was more intelligible (mean = 70%) than those exposed to sign language (mean = 51%). This study provides the most compelling support yet available in CI literature for the benefits of spoken language input for promoting verbal development in children implanted by 3 years of age. Contrary to earlier published assertions, there was no advantage to parents' use of sign language either before or after CI. Copyright © 2017 by the American Academy of Pediatrics.
Clay, Zanna; Pople, Sally; Hood, Bruce; Kita, Sotaro
2014-08-01
Research on Nicaraguan Sign Language, created by deaf children, has suggested that young children use gestures to segment the semantic elements of events and linearize them in ways similar to those used in signed and spoken languages. However, it is unclear whether this is due to children's learning processes or to a more general effect of iterative learning. We investigated whether typically developing children, without iterative learning, segment and linearize information. Gestures produced in the absence of speech to express a motion event were examined in 4-year-olds, 12-year-olds, and adults (all native English speakers). We compared the proportions of gestural expressions that segmented semantic elements into linear sequences and that encoded them simultaneously. Compared with adolescents and adults, children reshaped the holistic stimuli by segmenting and recombining their semantic features into linearized sequences. A control task on recognition memory ruled out the possibility that this was due to different event perception or memory. Young children spontaneously bring fundamental properties of language into their communication system. © The Author(s) 2014.
ERIC Educational Resources Information Center
Fuentes, Mariana; Tolchinsky, Liliana
2004-01-01
Linguistic descriptions of sign languages are important to the recognition of their linguistic status. These languages are an essential part of the cultural heritage of the communities that create and use them and vital in the education of deaf children. They are also the reference point in language acquisition studies. Ours is exploratory…
Kinect-based sign language recognition of static and dynamic hand movements
NASA Astrophysics Data System (ADS)
Dalawis, Rando C.; Olayao, Kenneth Deniel R.; Ramos, Evan Geoffrey I.; Samonte, Mary Jane C.
2017-02-01
A different approach to sign language recognition of static and dynamic hand movements was developed in this study using a normalized correlation algorithm. The goal of this research was to translate fingerspelling sign language into text using MATLAB and Microsoft Kinect. Digital input images captured by Kinect devices are matched against template samples stored in a database. This Human Computer Interaction (HCI) prototype was developed to help people with communication disabilities express their thoughts with ease. Frame segmentation and feature extraction were used to give meaning to the captured images. Sequential and random testing was used to test both static and dynamic fingerspelling gestures. The researchers explained some factors they encountered that caused misclassification of signs.
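The template-matching step mentioned in the abstract above can be illustrated with a short sketch. This is an assumption-laden toy in Python rather than the authors' MATLAB implementation: it uses the common zero-mean normalized correlation coefficient, treats images as small nested lists of grayscale values, and the function names and template labels are hypothetical.

```python
from math import sqrt


def normalized_correlation(image, template):
    """Zero-mean normalized correlation of two equally sized grayscale images.

    Returns a value in [-1, 1]; 1 means identical up to brightness and contrast.
    """
    x = [p for row in image for p in row]
    y = [p for row in template for p in row]
    if len(x) != len(y):
        raise ValueError("image and template must have the same size")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0


def classify(image, templates):
    """Return the label of the best-matching template and all scores."""
    scores = {label: normalized_correlation(image, t)
              for label, t in templates.items()}
    return max(scores, key=scores.get), scores


# Hypothetical 2x2 "templates" standing in for stored fingerspelling samples.
TEMPLATES = {
    "A": [[0, 1], [1, 0]],
    "B": [[1, 1], [0, 0]],
}

# The same pattern as "A", but brighter and with higher contrast.
label, scores = classify([[10, 12], [12, 10]], TEMPLATES)
```

Subtracting the mean and dividing by the norms makes the score insensitive to uniform lighting changes, which is one reason normalized correlation is a common choice for matching camera frames against stored samples.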
Bernaerts, Sylvie; Berra, Emmely; Wenderoth, Nicole; Alaerts, Kaat
2016-10-01
The neuropeptide 'oxytocin' (OT) is known to play a pivotal role in a variety of complex social behaviors by promoting a prosocial attitude and interpersonal bonding. One mechanism by which OT is hypothesized to promote prosocial behavior is by enhancing the processing of socially relevant information from the environment. With the present study, we explored to what extent OT can alter the 'reading' of emotional body language as presented by impoverished biological motion point light displays (PLDs). To do so, a double-blind between-subjects randomized placebo-controlled trial was conducted, assessing performance on a bodily emotion recognition task in healthy adult males before and after a single-dose of intranasal OT (24 IU). Overall, a single-dose of OT administration had a significant effect of medium size on emotion recognition from body language. OT-induced improvements in emotion recognition were not differentially modulated by the emotional valence of the presented stimuli (positive versus negative) and also, the overall tendency to label an observed emotional state as 'happy' (positive) or 'angry' (negative) was not modified by the administration of OT. Albeit moderate, the present findings of OT-induced improvements in bodily emotion recognition from whole-body PLD provide further support for a link between OT and the processing of socio-communicative cues originating from the body of others. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
Warzybok, Anna; Brand, Thomas; Wagener, Kirsten C; Kollmeier, Birger
2015-01-01
The current study investigates the extent to which the linguistic complexity of three commonly employed speech recognition tests and second language proficiency influence speech recognition thresholds (SRTs) in noise in non-native listeners. SRTs were measured for non-natives and natives using three German speech recognition tests: the digit triplet test (DTT), the Oldenburg sentence test (OLSA), and the Göttingen sentence test (GÖSA). Sixty-four non-native and eight native listeners participated. Non-natives can show native-like SRTs in noise only for the linguistically easy speech material (DTT). Furthermore, the limitation of phonemic-acoustical cues in digit triplets affects speech recognition to the same extent in non-natives and natives. For more complex and less familiar speech materials, non-natives, ranging from basic to advanced proficiency in German, require on average 3-dB better signal-to-noise ratio for the OLSA and 6-dB for the GÖSA to obtain 50% speech recognition compared to native listeners. In clinical audiology, SRT measurements with a closed-set speech test (i.e. DTT for screening or OLSA test for clinical purposes) should be used with non-native listeners rather than open-set speech tests (such as the GÖSA or HINT), especially if a closed-set version in the patient's own native language is available.
Yoneyama, Kiyoko; Munson, Benjamin
2017-02-01
Whether or not the influence of listeners' language proficiency on L2 speech recognition was affected by the structure of the lexicon was examined. This specific experiment examined the effect of word frequency (WF) and phonological neighborhood density (PND) on word recognition in native speakers of English and second-language (L2) speakers of English whose first language was Japanese. The stimuli included English words produced by a native speaker of English and English words produced by a native speaker of Japanese (i.e., with Japanese-accented English). The experiment was inspired by the finding of Imai, Flege, and Walley [(2005). J. Acoust. Soc. Am. 117, 896-907] that the influence of talker accent on speech intelligibility for L2 learners of English whose L1 is Spanish varies as a function of words' PND. In the current study, significant interactions between stimulus accentedness and listener group on the accuracy and speed of spoken word recognition were found, as were significant effects of PND and WF on word-recognition accuracy. However, no significant three-way interaction among stimulus talker, listener group, and PND on either measure was found. Results are discussed in light of recent findings on cross-linguistic differences in the nature of the effects of PND on L2 phonological and lexical processing.
Oh, Jooyoung; Chun, Ji-Won; Kim, Eunseong; Park, Hae-Jeong; Lee, Boreom; Kim, Jae-Jin
2017-01-01
Patients with schizophrenia exhibit several cognitive deficits, including memory impairment. Problems with recognition memory can hinder socially adaptive behavior. Previous investigations have suggested that altered activation of the frontotemporal area plays an important role in recognition memory impairment. However, the cerebral networks related to these deficits are not known. The aim of this study was to elucidate the brain networks required for recognizing socially relevant information in patients with schizophrenia performing an old-new recognition task. Sixteen patients with schizophrenia and 16 controls participated in this study. First, the subjects performed the theme-identification task during functional magnetic resonance imaging. In this task, pictures depicting social situations were presented with three words, and the subjects were asked to select the best theme word for each picture. The subjects then performed an old-new recognition task in which they were asked to discriminate whether the presented words were old or new. Task performance and neural responses in the old-new recognition task were compared between the subject groups. An independent component analysis of the functional connectivity was performed. The patients with schizophrenia exhibited decreased discriminability and increased activation of the right superior temporal gyrus compared with the controls during correct responses. Furthermore, aberrant network activities were found in the frontopolar and language comprehension networks in the patients. The functional connectivity analysis showed aberrant connectivity in the frontopolar and language comprehension networks in the patients with schizophrenia, and these aberrations possibly contribute to their low recognition performance and social dysfunction. These results suggest that the frontopolar and language comprehension networks are potential therapeutic targets in patients with schizophrenia.
NASA Astrophysics Data System (ADS)
Hassibi, Khosrow M.
1994-02-01
This paper presents a brief overview of our research in the development of an OCR system for recognition of machine-printed texts in languages that use the Arabic alphabet. The cursive nature of machine-printed Arabic makes the segmentation of words into letters a challenging problem. In our approach, through a novel preliminary segmentation technique, a word is broken into pieces where each piece may not represent a valid letter in general. Neural networks trained on a training sample set of about 500 Arabic text images are used for recognition of these pieces. The rules governing the alphabet and character-level contextual information are used for recombining these pieces into valid letters. Higher-level contextual analysis schemes, including the use of an Arabic lexicon and n-grams, are also under development and are expected to improve the word recognition accuracy. The segmentation, recognition, and contextual analysis processes are closely integrated using a feedback scheme. The details of preparation of the training set and some recent results on training of the networks will be presented.
Combining point context and dynamic time warping for online gesture recognition
NASA Astrophysics Data System (ADS)
Mao, Xia; Li, Chen
2017-05-01
Previous gesture recognition methods usually focused on recognizing gestures after the entire gesture sequences were obtained. However, in many practical applications, a system has to identify gestures before they end to give instant feedback. We present an online gesture recognition approach that can realize early recognition of unfinished gestures with low latency. First, a curvature buffer-based point context (CBPC) descriptor is proposed to extract the shape feature of a gesture trajectory. The CBPC descriptor is a complete descriptor with a simple computation, and thus has its superiority in online scenarios. Then, we introduce an online windowed dynamic time warping algorithm to realize online matching between the ongoing gesture and the template gestures. In the algorithm, computational complexity is effectively decreased by adding a sliding window to the accumulative distance matrix. Lastly, the experiments are conducted on the Australian sign language data set and the Kinect hand gesture (KHG) data set. Results show that the proposed method outperforms other state-of-the-art methods especially when gesture information is incomplete.
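As a rough illustration of why restricting the accumulated-distance matrix to a window lowers the cost of matching, here is a minimal Python sketch of dynamic time warping with a band constraint. It is the classical band-limited (Sakoe-Chiba style) variant on one-dimensional sequences, not the online windowed algorithm or the CBPC descriptor proposed in the paper; the function names, the window width, and the toy templates are assumptions made for the example.

```python
from math import inf


def windowed_dtw(query, template, window):
    """DTW distance where cell (i, j) is evaluated only if |i - j| <= window.

    Restricting the band reduces work from O(n*m) to roughly O(n*window).
    """
    n, m = len(query), len(template)
    window = max(window, abs(n - m))  # the band must reach the final cell
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - window), min(m, i + window) + 1):
            d = abs(query[i - 1] - template[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in the query only
                                 cost[i][j - 1],      # step in the template only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def nearest_template(query, templates, window=1):
    """Return the label of the template with the smallest windowed DTW distance."""
    return min(templates, key=lambda k: windowed_dtw(query, templates[k], window))


# Hypothetical 1-D gesture trajectories.
TEMPLATES = {"rise": [0, 1, 2, 3], "fall": [3, 2, 1, 0]}
best = nearest_template([0, 1, 1, 2, 3], TEMPLATES)
```

A true online recognizer would additionally have to score partial gestures as samples arrive; this sketch only compares complete sequences.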
Tone classification of syllable-segmented Thai speech based on multilayer perceptron
NASA Astrophysics Data System (ADS)
Satravaha, Nuttavudh; Klinkhachorn, Powsiri; Lass, Norman
2002-05-01
Thai is a monosyllabic tonal language that uses tone to convey lexical information about the meaning of a syllable. Thus to completely recognize a spoken Thai syllable, a speech recognition system not only has to recognize a base syllable but also must correctly identify a tone. Hence, tone classification of Thai speech is an essential part of a Thai speech recognition system. Thai has five distinctive tones ("mid," "low," "falling," "high," and "rising") and each tone is represented by a single fundamental frequency (F0) pattern. However, several factors, including tonal coarticulation, stress, intonation, and speaker variability, affect the F0 pattern of a syllable in continuous Thai speech. In this study, an efficient method for tone classification of syllable-segmented Thai speech, which incorporates the effects of tonal coarticulation, stress, and intonation, as well as a method to perform automatic syllable segmentation, were developed. Acoustic parameters were used as the main discriminating parameters. The F0 contour of a segmented syllable was normalized by using a z-score transformation before being presented to a tone classifier. The proposed system was evaluated on 920 test utterances spoken by 8 speakers. A recognition rate of 91.36% was achieved by the proposed system.
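The z-score normalization step mentioned in the abstract above can be sketched as follows. This is only an illustration in Python: the abstract does not say whether the mean and standard deviation were computed per syllable, per utterance, or per speaker, so the per-contour statistics used here are an assumption, as are the function name and the sample F0 values.

```python
from statistics import mean, pstdev


def zscore_f0(f0_hz):
    """Normalize an F0 contour (in Hz) to zero mean and unit variance.

    Assumption: statistics are taken over the contour itself.
    """
    mu = mean(f0_hz)
    sigma = pstdev(f0_hz)
    if sigma == 0:
        return [0.0 for _ in f0_hz]  # flat contour: no shape information
    return [(f - mu) / sigma for f in f0_hz]


# Two hypothetical rising contours from speakers with different pitch ranges.
low_voice = zscore_f0([100.0, 110.0, 120.0])
high_voice = zscore_f0([200.0, 220.0, 240.0])
```

After normalization both contours have the same shape, which is the point of the transformation: the classifier sees the tone pattern rather than the speaker's absolute pitch range.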
Use of Authentic-Speech Technique for Teaching Sound Recognition to EFL Students
ERIC Educational Resources Information Center
Sersen, William J.
2011-01-01
The main objective of this research was to test an authentic-speech technique for improving the sound-recognition skills of EFL (English as a foreign language) students at Roi-Et Rajabhat University. The secondary objective was to determine the correlation, if any, between students' self-evaluation of sound-recognition progress and the actual…
ERIC Educational Resources Information Center
Ebert, Ashlee A.
2009-01-01
Ehri's developmental model of word recognition outlines early reading development that spans from the use of logos to advanced knowledge of oral and written language to read words. Henderson's developmental spelling theory presents stages of word knowledge that progress in a similar manner to Ehri's phases. The purpose of this research study was…
Featuring Old/New Recognition: The Two Faces of the Pseudoword Effect
ERIC Educational Resources Information Center
Joordens, Steve; Ozubko, Jason D.; Niewiadomski, Marty W.
2008-01-01
In his analysis of the pseudoword effect, [Greene, R.L. (2004). Recognition memory for pseudowords. "Journal of Memory and Language," 50, 259-267.] suggests nonwords can feel more familiar than words in a recognition context if the orthographic features of the nonword match well with the features of the items presented at study. One possible…
ERIC Educational Resources Information Center
Li, Ming
2013-01-01
The goal of this work is to enhance the robustness and efficiency of the multimodal human states recognition task. Human states recognition can be considered as a joint term for identifying/verifying various kinds of human related states, such as biometric identity, language spoken, age, gender, emotion, intoxication level, physical activity, vocal…
NASA Astrophysics Data System (ADS)
Iervolino, Onorio; Meo, Michele
2017-04-01
Sign language is a method of communication for deaf-mute people with articulated gestures and postures of hands and fingers to represent alphabet letters or complete words. Recognizing gestures is a difficult task, due to intrapersonal and interpersonal variations in performing them. This paper investigates the use of Spiral Passive Electromagnetic Sensor (SPES) as a motion recognition tool. An instrumented glove integrated with wearable multi-SPES sensors was developed to encode data and provide a unique response for each hand gesture. The device can be used for recognition of gestures, motion control, and well-defined gesture sets such as sign languages. Each specific gesture was associated with a unique sensor response. The gloves encode data regarding the gesture directly in the frequency spectrum response of the SPES. The absence of a chip or complex electronic circuit makes the gloves light and comfortable to wear. Results showed encouraging data for the use of SPES in wearable applications.
Emotion through locomotion: gender impact.
Krüger, Samuel; Sokolov, Alexander N; Enck, Paul; Krägeloh-Mann, Ingeborg; Pavlova, Marina A
2013-01-01
Body language reading is of significance for daily life social cognition and successful social interaction, and constitutes a core component of social competence. Yet it is unclear whether our ability for body language reading is gender specific. In the present work, female and male observers had to visually recognize emotions through point-light human locomotion performed by female and male actors with different emotional expressions. For subtle emotional expressions only, males surpass females in recognition accuracy and readiness to respond to happy walking portrayed by female actors, whereas females exhibit a tendency to be better in recognition of hostile angry locomotion expressed by male actors. In contrast to widespread beliefs about female superiority in social cognition, the findings suggest that gender effects in recognition of emotions from human locomotion are modulated by emotional content of actions and opposite actor gender. In a nutshell, the study makes a further step in elucidation of gender impact on body language reading and on neurodevelopmental and psychiatric deficits in visual social cognition.
Almabruk, Abubaker A. A.; Paterson, Kevin B.; McGowan, Victoria; Jordan, Timothy R.
2011-01-01
Background Previous studies have claimed that a precise split at the vertical midline of each fovea causes all words to the left and right of fixation to project to the opposite, contralateral hemisphere, and this division in hemispheric processing has considerable consequences for foveal word recognition. However, research in this area is dominated by the use of stimuli from Latinate languages, which may induce specific effects on performance. Consequently, we report two experiments using stimuli from a fundamentally different, non-Latinate language (Arabic) that offers an alternative way of revealing effects of split-foveal processing, if they exist. Methods and Findings Words (and pseudowords) were presented to the left or right of fixation, either close to fixation and entirely within foveal vision, or further from fixation and entirely within extrafoveal vision. Fixation location and stimulus presentations were carefully controlled using an eye-tracker linked to a fixation-contingent display. To assess word recognition, Experiment 1 used the Reicher-Wheeler task and Experiment 2 used the lexical decision task. Results Performance in both experiments indicated a functional division in hemispheric processing for words in extrafoveal locations (in recognition accuracy in Experiment 1 and in reaction times and error rates in Experiment 2) but no such division for words in foveal locations. Conclusions These findings from a non-Latinate language provide new evidence that although a functional division in hemispheric processing exists for word recognition outside the fovea, this division does not extend up to the point of fixation. Some implications for word recognition and reading are discussed. PMID:21559084
A language-familiarity effect for speaker discrimination without comprehension.
Fleming, David; Giordano, Bruno L; Caldara, Roberto; Belin, Pascal
2014-09-23
The influence of language familiarity upon speaker identification is well established, to such an extent that it has been argued that "Human voice recognition depends on language ability" [Perrachione TK, Del Tufo SN, Gabrieli JDE (2011) Science 333(6042):595]. However, 7-mo-old infants discriminate speakers of their mother tongue better than they do foreign speakers [Johnson EK, Westrek E, Nazzi T, Cutler A (2011) Dev Sci 14(5):1002-1011] despite their limited speech comprehension abilities, suggesting that speaker discrimination may rely on familiarity with the sound structure of one's native language rather than the ability to comprehend speech. To test this hypothesis, we asked Chinese and English adult participants to rate speaker dissimilarity in pairs of sentences in English or Mandarin that were first time-reversed to render them unintelligible. Even in these conditions a language-familiarity effect was observed: Both Chinese and English listeners rated pairs of native-language speakers as more dissimilar than foreign-language speakers, despite their inability to understand the material. Our data indicate that the language familiarity effect is not based on comprehension but rather on familiarity with the phonology of one's native language. This effect may stem from a mechanism analogous to the "other-race" effect in face recognition.
Discriminative exemplar coding for sign language recognition with Kinect.
Sun, Chao; Zhang, Tianzhu; Bao, Bing-Kun; Xu, Changsheng; Mei, Tao
2013-10-01
Sign language recognition is a growing research area in the field of computer vision. A challenge within it is to model various signs, varying with time resolution, visual manual appearance, and so on. In this paper, we propose a discriminative exemplar coding (DEC) approach, together with the use of a Kinect sensor, to model various signs. The proposed DEC method can be summarized as three steps. First, a quantity of class-specific candidate exemplars are learned from sign language videos in each sign category by considering their discrimination. Then, every video of all signs is described as a set of similarities between frames within it and the candidate exemplars. Instead of simply using a heuristic distance measure, the similarities are decided by a set of exemplar-based classifiers through multiple instance learning, in which a positive (or negative) video is treated as a positive (or negative) bag and those frames similar to the given exemplar in Euclidean space as instances. Finally, we formulate the selection of the most discriminative exemplars into a framework and simultaneously produce a sign video classifier to recognize signs. To evaluate our method, we collect an American Sign Language dataset, which includes approximately 2000 phrases, with each phrase captured by a Kinect sensor with color, depth, and skeleton information. Experimental results on our dataset demonstrate the feasibility and effectiveness of the proposed approach for sign language recognition.
Jiao, Dazhi; Wild, David J
2009-02-01
This paper proposes a system that automatically extracts CYP protein and chemical interactions from journal article abstracts, using natural language processing (NLP) and text mining methods. In our system, we employ a maximum entropy based learning method, using results from syntactic, semantic, and lexical analysis of texts. We first present our system architecture and then discuss the data set for training our machine learning based models and the methods in building components in our system, such as part of speech (POS) tagging, Named Entity Recognition (NER), dependency parsing, and relation extraction. An evaluation of the system is conducted at the end, yielding very promising results: The POS, dependency parsing, and NER components in our system have achieved a very high level of accuracy as measured by precision, ranging from 85.9% to 98.5%, and the precision and the recall of the interaction extraction component are 76.0% and 82.6%, and for the overall system are 68.4% and 72.2%, respectively.
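As a small worked example, the precision and recall figures quoted in the abstract above can be combined into a single F1 score (the harmonic mean of the two). The abstract itself does not report F1, so the values computed here are derived for illustration only, and the function name `f1_score` is an assumption.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units for both inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract, in percent.
interaction_f1 = f1_score(76.0, 82.6)  # interaction extraction component
overall_f1 = f1_score(68.4, 72.2)      # overall system

print(f"interaction extraction F1: {interaction_f1:.1f}%")
print(f"overall system F1: {overall_f1:.1f}%")
```

Under these inputs the derived F1 is roughly 79.2% for the interaction extraction component and 70.2% for the overall system.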
ERIC Educational Resources Information Center
Landon, Laura L.
2017-01-01
This study examines the application of the Simple View of Reading (SVR), a reading comprehension theory focusing on word recognition and linguistic comprehension, to English Language Learners' (ELLs') English reading development. This study examines the concurrent and predictive validity of two components of the SVR, oral language and word-level…
ERIC Educational Resources Information Center
Sharp, Kathryn M; Gathercole, Virginia C. Mueller
2013-01-01
In recent years, there has been growing recognition of a need for a general, non-language-specific assessment tool that could be used to evaluate general speech and language abilities in children, especially to assist in identifying atypical development in bilingual children who speak a language unfamiliar to the assessor. It has been suggested…
Interactive Processing of Words in Connected Speech in L1 and L2.
ERIC Educational Resources Information Center
Hayashi, Takuo
1991-01-01
A study exploring the differences between first- and second-language word recognition strategies revealed that second-language listeners used more higher level information than native language listeners, when access to higher level information was not hindered by a competence-ceiling effect, indicating that word processing strategy is a function…
Symbolic Play Connects to Language through Visual Object Recognition
ERIC Educational Resources Information Center
Smith, Linda B.; Jones, Susan S.
2011-01-01
Object substitutions in play (e.g. using a box as a car) are strongly linked to language learning and their absence is a diagnostic marker of language delay. Classic accounts posit a symbolic function that underlies both words and object substitutions. Here we show that object substitutions depend on developmental changes in visual object…
ERIC Educational Resources Information Center
McMurray, Bob; Munson, Cheyenne; Tomblin, J. Bruce
2014-01-01
Purpose: The authors examined speech perception deficits associated with individual differences in language ability, contrasting auditory, phonological, or lexical accounts by asking whether lexical competition is differentially sensitive to fine-grained acoustic variation. Method: Adolescents with a range of language abilities (N = 74, including…
ERIC Educational Resources Information Center
Constantino, John N.; Yang, Dan; Gray, Teddi L.; Gross, Maggie M.; Abbacchi, Anna M.; Smith, Sarah C.; Kohn, Catherine E.; Kuhl, Patricia K.
2007-01-01
Autism spectrum disorders (ASDs) are characterized by correlated deficiencies in social and language development. This study explored a fundamental aspect of auditory information processing (AIP) that is dependent on social experience and critical to early language development: the ability to compartmentalize close-sounding speech sounds into…
Codeswitching in the Multilingual English First Language Classroom
ERIC Educational Resources Information Center
Moodley, Visvaganthie
2007-01-01
This paper focuses on the role of codeswitching (CS) by isiZulu (Zulu) native language (NL) junior secondary learners in English first language (EL1) multilingual classrooms in South Africa. In spite of the educational transformation in South Africa, and the recognition of CS (by education policy documents) as a means of fulfilling pedagogical…
Hemispheric Differences in Bilingual Word and Language Recognition.
ERIC Educational Resources Information Center
Roberts, William T.; And Others
The linguistic role of the right hemisphere in bilingual language processing was examined. Ten right-handed Spanish-English bilinguals were tachistoscopically presented with mixed lists of Spanish and English words to either the right or left visual field and asked to identify the language and the word presented. Five of the subjects identified…
Modern Languages in the United Kingdom
ERIC Educational Resources Information Center
Coleman, James A.
2011-01-01
The article supplies an overview of UK modern languages education at school and university level. It attends particularly to trends over recent years, with regard both to numbers and to social elitism, and reflects on perceptions of language learning in the wider culture and the importance of gaining wider recognition of the value of languages…
Memory for Nonsemantic Attributes of American Sign Language Signs and English Words
ERIC Educational Resources Information Center
Siple, Patricia
1977-01-01
Two recognition memory experiments were used to study the retention of language and modality of input. A bilingual list of American Sign Language signs and English words was presented to two deaf and two hearing groups, one instructed to remember mode of input, and one hearing group. Findings are analyzed. (CHK)
The Bimodal Bilingual Brain: Effects of Sign Language Experience
ERIC Educational Resources Information Center
Emmorey, Karen; McCullough, Stephen
2009-01-01
Bimodal bilinguals are hearing individuals who know both a signed and a spoken language. Effects of bimodal bilingualism on behavior and brain organization are reviewed, and an fMRI investigation of the recognition of facial expressions by ASL-English bilinguals is reported. The fMRI results reveal separate effects of sign language and spoken…
Leadership Practices to Support Teaching and Learning for English Language Learners
ERIC Educational Resources Information Center
McGee, Alyson; Haworth, Penny; MacIntyre, Lesieli
2015-01-01
With a substantial increase in the numbers of English language learners in schools, particularly in countries where English is the primary use first language, it is vital that educators are able to meet the needs of ethnically and linguistically changing and challenging classrooms. However, despite the recognition of the importance of effective…
On-Line Orthographic Influences on Spoken Language in a Semantic Task
ERIC Educational Resources Information Center
Pattamadilok, Chotiga; Perre, Laetitia; Dufau, Stephane; Ziegler, Johannes C.
2009-01-01
Literacy changes the way the brain processes spoken language. Most psycholinguists believe that orthographic effects on spoken language are either strategic or restricted to meta-phonological tasks. We used event-related brain potentials (ERPs) to investigate the locus and the time course of orthographic effects on spoken word recognition in a…
Levi, Gabriel; Colonnello, Valentina; Giacchè, Roberta; Piredda, Maria Letizia; Sogos, Carla
2014-05-01
Recent studies have shown that language processing is grounded in actions. Multiple independent research findings indicate that children with specific language impairment (SLI) show subtle difficulties beyond the language domain. Uncertainties remain on possible association between body-mediated, non-linguistic expression of verbs and early manifestation of SLI during verb acquisition. The present study was conducted to determine whether verb production through non-linguistic modalities is impaired in children with SLI. Children with SLI (mean age 41 months) and typically developing children (mean age 40 months) were asked to recognize target verbs while viewing video clips showing the action associated with the verb (verb-recognition task) and to enact the action corresponding to the verb (verb-enacting task). Children with SLI performed more poorly than control children in both tasks. The present study demonstrates that early language impairment emerges at the bodily level. These findings are consistent with the embodied theories of cognition and underscore the role of action-based representations during language development. Copyright © 2014 Elsevier Ltd. All rights reserved.
1993-06-18
David Hislop (US Army Research Office), Eero Hyvonen (VTT, Finland), Marek Karpinski (Bonn, Germany), Yves Kodratoff (Paris VI, France), Jan...[21] M. P. Marcus, A Theory of Syntactic Recognition for Natural Language, The MIT Press, Cambridge, Mass., 1980 [22] K.R. McKeown, The TEXT System...H. Simon, The Sciences of the Artificial, The MIT Press, Cambridge, MA, (1980). 3. M.D. Mesarovich, D. Macko, Y. Takahara, Theory of Hierarchical
The role of artificial intelligence and expert systems in increasing STS operations productivity
NASA Technical Reports Server (NTRS)
Culbert, C.
1985-01-01
Artificial Intelligence (AI) is discussed. A number of the computer technologies pioneered in the AI world can make significant contributions to increasing STS operations productivity. Application of expert systems, natural language, speech recognition, and other key technologies can reduce manpower while raising productivity. Many aspects of STS support lend themselves to this type of automation. The artificial intelligence section of the mission planning and analysis division has developed a number of functioning prototype systems which demonstrate the potential gains of applying AI technology.
Bottoni, Paolo; Cinque, Luigi; De Marsico, Maria; Levialdi, Stefano; Panizzi, Emanuele
2006-06-01
This paper reports on the research activities performed by the Pictorial Computing Laboratory at the University of Rome, La Sapienza, during the last 5 years. This work, essentially based on the study of human-computer interaction, spans from metamodels of interaction down to prototypes of interactive systems for synchronous multimedia communication and groupwork and annotation systems for web pages. It also encompasses theoretical and practical issues of visual languages and environments, including pattern recognition algorithms. Some applications, such as e-learning and collaborative work, are also considered.
MITLL 2015 Language Recognition Evaluation System Description
2016-01-27
912 8.18 qsl-rus Russian 2021 37.80 ara-ary Maghrebi 919 46.91 spa-car Carib. Spa. 194 30.59 ara-arz Egyptian 440 97.27 spa-eur Eur. Spa. 366 8.55...qsl-pol Polish 695 32.14 ara-arb MSA 912 8.18 qsl-rus Russian 2021 37.80 ara-ary Maghrebi 919 46.91 spa-car Carib. Spa. 194 30.59 ara-arz Egyptian ...BOTTLENECK I-VECTOR SYSTEM (BNF1) The Deep Neural Network architecture that we used for this system was composed of seven hidden layers. The sixth
Preparing chief information officers for the clinical information systems environment.
Valenta, A L; Mendola, R A; Dieter, M; Panko, W B
1999-05-01
Over the past decade, the chief information officer (CIO) in the health care enterprise has gained recognition as a member of the senior management team based on an understanding of business processes and business language. The clinical information system (CIS) in the health care environment poses a new frontier for CIOs, who are generally unfamiliar with both clinical languages and clinical processes. The authors discuss the role formal informatics training can have in preparing learners for future careers as CIOs in CIS environments. The health information management (HIM) specialization within the MBA program at the University of Illinois at Chicago is one example of an educational program designed to train future CIOs who can manage the business, technical, and clinical aspects of the health care environment.
Speech Processing and Recognition (SPaRe)
2011-01-01
results in the areas of automatic speech recognition (ASR), speech processing, machine translation (MT), natural language processing (NLP), and...Processing (NLP), Information Retrieval (IR)...Figure 9, the IOC was only expected to provide document submission and search; automatic speech recognition (ASR) for English, Spanish, Arabic, and
Selected Topics from LVCSR Research for Asian Languages at Tokyo Tech
NASA Astrophysics Data System (ADS)
Furui, Sadaoki
This paper presents our recent work in regard to building Large Vocabulary Continuous Speech Recognition (LVCSR) systems for the Thai, Indonesian, and Chinese languages. For Thai, since there is no word boundary in the written form, we have proposed a new method for automatically creating word-like units from a text corpus, and applied topic and speaking style adaptation to the language model to recognize spoken-style utterances. For Indonesian, we have applied proper noun-specific adaptation to acoustic modeling, and rule-based English-to-Indonesian phoneme mapping to solve the problem of large variation in proper noun and English word pronunciation in a spoken-query information retrieval system. In spoken Chinese, long organization names are frequently abbreviated, and abbreviated utterances cannot be recognized if the abbreviations are not included in the dictionary. We have proposed a new method for automatically generating Chinese abbreviations, and by expanding the vocabulary using the generated abbreviations, we have significantly improved the performance of spoken query-based search.
Psycholinguistically Oriented Second Language Research.
ERIC Educational Resources Information Center
Juffs, Alan
2001-01-01
Reviews recent research that investigates second language performance from the perspective of sentence processing (on-line comprehension studies) and word recognition. Concentrates on describing methods that employ reaction time measures as correlates of processing difficulty or knowledge representation. (Author/VWL)
Plan recognition and generalization in command languages with application to telerobotics
NASA Technical Reports Server (NTRS)
Yared, Wael I.; Sheridan, Thomas B.
1991-01-01
A method for pragmatic inference as a necessary accompaniment to command languages is proposed. The approach taken focuses on the modeling and recognition of the human operator's intent, which relates sequences of domain actions ('plans') to changes in some model of the task environment. The salient feature of this module is that it captures some of the physical and linguistic contextual aspects of an instruction. This provides a basis for generalization and reinterpretation of the instruction in different task environments. The theoretical development is founded on previous work in computational linguistics and some recent models in the theory of action and intention. To illustrate these ideas, an experimental command language to a telerobot is implemented. The program consists of three different components: a robot graphic simulation, the command language itself, and the domain-independent pragmatic inference module. Examples of task instruction processes are provided to demonstrate the benefits of this approach.
Universal brain systems for recognizing word shapes and handwriting gestures during reading
Nakamura, Kimihiro; Kuo, Wen-Jui; Pegado, Felipe; Cohen, Laurent; Tzeng, Ovid J. L.; Dehaene, Stanislas
2012-01-01
Do the neural circuits for reading vary across culture? Reading of visually complex writing systems such as Chinese has been proposed to rely on areas outside the classical left-hemisphere network for alphabetic reading. Here, however, we show that, once potential confounds in cross-cultural comparisons are controlled for by presenting handwritten stimuli to both Chinese and French readers, the underlying network for visual word recognition may be more universal than previously suspected. Using functional magnetic resonance imaging in a semantic task with words written in cursive font, we demonstrate that two universal circuits, a shape recognition system (reading by eye) and a gesture recognition system (reading by hand), are similarly activated and show identical patterns of activation and repetition priming in the two language groups. These activations cover most of the brain regions previously associated with culture-specific tuning. Our results point to an extended reading network that invariably comprises the occipitotemporal visual word-form system, which is sensitive to well-formed static letter strings, and a distinct left premotor region, Exner’s area, which is sensitive to the forward or backward direction with which cursive letters are dynamically presented. These findings suggest that cultural effects in reading merely modulate a fixed set of invariant macroscopic brain circuits, depending on surface features of orthographies. PMID:23184998
An integrated information retrieval and document management system
NASA Technical Reports Server (NTRS)
Coles, L. Stephen; Alvarez, J. Fernando; Chen, James; Chen, William; Cheung, Lai-Mei; Clancy, Susan; Wong, Alexis
1993-01-01
This paper describes the requirements and prototype development for an intelligent document management and information retrieval system that will be capable of handling millions of pages of text or other data. Technologies for scanning, Optical Character Recognition (OCR), magneto-optical storage, and multiplatform retrieval using Structured Query Language (SQL) will be discussed. The semantic ambiguity inherent in the English language is somewhat compensated for through the use of coefficients or weighting factors for partial synonyms. Such coefficients are used both for defining structured query trees for routine queries and for establishing long-term interest profiles that can be used on a regular basis to alert individual users to the presence of relevant documents that may have just arrived from an external source, such as a news wire service. Although this attempt at evidential reasoning is limited in comparison with the latest developments in AI Expert Systems technology, it has the advantage of being commercially available.
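The idea of weighting partial synonyms with coefficients, as described in the abstract above, can be sketched as follows. This is a minimal illustration, not the system's actual design: the synonym table, the coefficient values, and the names `expand_query`, `score_document`, and `alert` are all assumptions, and the real system is described as using structured query trees rather than a flat word sum.

```python
from typing import Dict, List

# Hypothetical partial-synonym table: query term -> {related term: coefficient}.
# A coefficient of 1.0 means an exact match; lower values mean weaker synonyms.
SYNONYMS: Dict[str, Dict[str, float]] = {
    "aircraft": {"aircraft": 1.0, "airplane": 0.9, "vehicle": 0.4},
}


def expand_query(terms: List[str], synonyms: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Expand query terms into a weighted vocabulary, keeping the largest coefficient."""
    weights: Dict[str, float] = {}
    for term in terms:
        # Terms without a synonym entry only match themselves, with full weight.
        for related, coeff in synonyms.get(term, {term: 1.0}).items():
            weights[related] = max(weights.get(related, 0.0), coeff)
    return weights


def score_document(text: str, weights: Dict[str, float]) -> float:
    """Sum the coefficients of every weighted term occurring in the text."""
    return sum(weights.get(word, 0.0) for word in text.lower().split())


def alert(text: str, profile: Dict[str, float], threshold: float) -> bool:
    """Flag a newly arrived document whose score reaches the interest threshold."""
    return score_document(text, profile) >= threshold


profile = expand_query(["aircraft", "engine"], SYNONYMS)
score = score_document("The airplane engine failed", profile)
```

The same weighted vocabulary serves both uses mentioned in the abstract: scoring a one-off query and acting as a stored interest profile against which incoming documents are checked.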
Phonologic-graphemic transcodifier for Portuguese Language spoken in Brazil (PLB)
NASA Astrophysics Data System (ADS)
Fragadasilva, Francisco Jose; Saotome, Osamu; Deoliveira, Carlos Alberto
An automatic speech-to-text transformer system, suited to unlimited vocabulary, is presented. The basic acoustic units considered are the allophones of the phonemes of the Portuguese language spoken in Brazil (PLB). The input to the system is a phonetic sequence from a previous step of isolated word recognition of slowly spoken speech. In a first stage, the system eliminates phonetic elements that do not belong to PLB. Using knowledge sources such as phonetics, phonology, orthography, and a PLB-specific lexicon, the system outputs a sequence of written words, ordered by a probabilistic criterion, that constitutes the set of graphemic possibilities for that input sequence. Pronunciation differences of some regions of Brazil are considered, but only those that cause differences in phonological transcription, because those at the phonetic level are absorbed during the transformation to the phonological level. In the final stage, all possible written words are analyzed from an orthographic and grammatical point of view to eliminate the incorrect ones.
Semantic Radical Knowledge and Word Recognition in Chinese for Chinese as Foreign Language Learners
ERIC Educational Resources Information Center
Su, Xiaoxiang; Kim, Young-Suk
2014-01-01
In the present study, we examined the relation of knowledge of semantic radicals to students' language proficiency and word reading for adult Chinese-as-a-foreign language students. Ninety-seven college students rated their proficiency in speaking, listening, reading, and writing in Chinese, and were administered measures of receptive and…
2017-03-01
the Center for Technology Enhanced Language Learning (CTELL), a research cell in the Department of Foreign Languages, United States Military Academy...models for automatic speech recognition (ASR), and to, thereby, investigate the utility of ASR in pedagogical technology. The corpus is a sample of...lexical resources, language technology
American or British? L2 Speakers' Recognition and Evaluations of Accent Features in English
ERIC Educational Resources Information Center
Carrie, Erin; McKenzie, Robert M.
2018-01-01
Recent language attitude research has attended to the processes involved in identifying and evaluating spoken language varieties. This article investigates the ability of second-language learners of English in Spain (N = 71) to identify Received Pronunciation (RP) and General American (GenAm) speech and their perceptions of linguistic variation…
Bilingual Language Representation and Cognitive Processes in Translation
ERIC Educational Resources Information Center
Hatzidaki, Anna; Pothos, Emmanuel M.
2008-01-01
A "text"-translation task and a recognition task investigated the hypothesis that "semantic memory" principally mediates translation from a bilingual's native first language (L1) to her second language (L2), whereas "lexical memory" mediates translation from L2 to L1. This has been held for word translation by the revised hierarchical model (RHM)…
Language Nonselective Lexical Access in Bilingual Toddlers
ERIC Educational Resources Information Center
Von Holzen, Katie; Mani, Nivedita
2012-01-01
We examined how words from bilingual toddlers' second language (L2) primed recognition of related target words in their first language (L1). On critical trials, prime-target word pairs were either (a) phonologically related, with L2 primes overlapped phonologically with L1 target words [e.g., "slide" (L2 prime)-"Kleid" (L1 target, "dress")], or…
NASA Astrophysics Data System (ADS)
Mantecón, Tomás; del Blanco, Carlos Roberto; Jaureguizar, Fernando; García, Narciso
2014-06-01
New forms of natural interaction between human operators and UAVs (Unmanned Aerial Vehicles) are demanded by the military industry to achieve a better balance between UAV control and the burden on the human operator. In this work, a human machine interface (HMI) based on a novel gesture recognition system using depth imagery is proposed for the control of UAVs. Hand gesture recognition based on depth imagery is a promising approach for HMIs because it is more intuitive, natural, and non-intrusive than other alternatives using complex controllers. The proposed system is based on a Support Vector Machine (SVM) classifier that uses spatio-temporal depth descriptors as input features. The designed descriptor is based on a variation of the Local Binary Pattern (LBP) technique that works efficiently with depth video sequences. Another major consideration is the special hand sign language used for UAV control. A tradeoff between the use of natural hand signs and the minimization of inter-sign interference has been established. Promising results have been achieved on a depth-based database of hand gestures developed specifically for the validation of the proposed system.
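For readers unfamiliar with the Local Binary Pattern technique mentioned in the abstract above, the following is a minimal sketch of the basic 8-neighbour LBP applied to a single depth frame. It is not the authors' spatio-temporal variation, which the abstract does not specify; the function names and the choice of a 256-bin normalized histogram as the feature vector are assumptions made for illustration.

```python
from typing import List, Sequence

# The eight neighbour offsets (row, column), clockwise from the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(depth: Sequence[Sequence[float]], row: int, col: int) -> int:
    """Basic LBP code of one interior pixel: one bit per neighbour >= centre."""
    center = depth[row][col]
    code = 0
    for bit, (dy, dx) in enumerate(OFFSETS):
        if depth[row + dy][col + dx] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(depth: Sequence[Sequence[float]]) -> List[float]:
    """Normalized 256-bin histogram of LBP codes over all interior pixels."""
    counts = [0] * 256
    rows, cols = len(depth), len(depth[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            counts[lbp_code(depth, r, c)] += 1
    total = sum(counts)
    return [n / total for n in counts] if total else [0.0] * 256


# A flat depth map: every neighbour equals the centre, so every code is 255.
flat = [[1.0] * 5 for _ in range(5)]
flat_hist = lbp_histogram(flat)

# A single near peak surrounded by lower depth values gives code 0.
peak = [[0.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 0.0]]
peak_code = lbp_code(peak, 1, 1)
```

A histogram of this kind, computed per frame or per region, is the sort of fixed-length vector that could then be passed to an SVM classifier.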
Microcomputers and Preschoolers.
ERIC Educational Resources Information Center
Evans, Dina
Preschool children can benefit by working with microcomputers. Thinking skills are enhanced by software games that focus on logic, memory, problem solving, and pattern recognition. Counting, sequencing, and matching games develop mathematics skills, and word games focusing on basic letter symbol and word recognition develop language skills.…
A Spoken English Recognition Expert System.
1983-09-01
Davidson. "Representation of Knowledge," Handbook of Artificial Intelligence, edited by Avron Barr and Edward A. Feigenbaum. DTIC document number AD...Regents of the University of California, 1981. 9. Gardner, Anne. "Search," Handbook of Artificial Intelligence, edited by Avron Barr and Edward A...Feigenbaum, DTIC document number AD A074078, 1979. 10. Gardner, Anne, et al. "Natural Language Understanding," Handbook of Artificial Intelligence, edited
1983-10-28
Computing. By seizing an opportunity to leverage recent advances in artificial intelligence, computer science, and microelectronics, the Agency plans...occurred in many separated areas of artificial intelligence, computer science, and microelectronics. Advances in "expert system" technology now...and expert knowledge o Advances in Artificial Intelligence: Mechanization of speech recognition, vision, and natural language understanding. o
The Effects of Lexical Pitch Accent on Infant Word Recognition in Japanese
Ota, Mitsuhiko; Yamane, Naoto; Mazuka, Reiko
2018-01-01
Learners of lexical tone languages (e.g., Mandarin) develop sensitivity to tonal contrasts and recognize pitch-matched, but not pitch-mismatched, familiar words by 11 months. Learners of non-tone languages (e.g., English) also show a tendency to treat pitch patterns as lexically contrastive up to about 18 months. In this study, we examined if this early-developing capacity to lexically encode pitch variations enables infants to acquire a pitch accent system, in which pitch-based lexical contrasts are obscured by the interaction of lexical and non-lexical (i.e., intonational) features. Eighteen 17-month-olds learning Tokyo Japanese were tested on their recognition of familiar words with the expected pitch or the lexically opposite pitch pattern. In early trials, infants were faster in shifting their eyegaze from the distractor object to the target object than in shifting from the target to distractor in the pitch-matched condition. In later trials, however, infants showed faster distractor-to-target than target-to-distractor shifts in both the pitch-matched and pitch-mismatched conditions. We interpret these results to mean that, in a pitch-accent system, the ability to use pitch variations to recognize words is still in a nascent state at 17 months. PMID:29375452
Neural-Network Object-Recognition Program
NASA Technical Reports Server (NTRS)
Spirkovska, L.; Reid, M. B.
1993-01-01
HONTIOR computer program implements third-order neural network exhibiting invariance under translation, change of scale, and in-plane rotation. Invariance incorporated directly into architecture of network. Only one view of each object needed to train network for two-dimensional-translation-invariant recognition of object. Also used for three-dimensional-transformation-invariant recognition by training network on only set of out-of-plane rotated views. Written in C language.
Handwritten-word spotting using biologically inspired features.
van der Zant, Tijn; Schomaker, Lambert; Haak, Koen
2008-11-01
For quick access to new handwritten collections, current handwriting recognition methods are too cumbersome. They cannot deal with the lack of labeled data and would require extensive laboratory training for each individual script, style, language and collection. We propose a biologically inspired whole-word recognition method which is used to incrementally elicit word labels in a live, web-based annotation system, named Monk. Since human labor should be minimized given the massive amount of image data, it becomes important to rely on robust perceptual mechanisms in the machine. Recent computational models of the neuro-physiology of vision are applied to isolated word classification. A primate cortex-like mechanism makes it possible to classify text-images that have a low frequency of occurrence. Typically these images are the most difficult to retrieve; they often contain named entities and are regarded as the most important to people. Usually standard pattern-recognition technology cannot deal with these text-images if there are not enough labeled instances. The results of this retrieval system are compared to normalized word-image matching and appear to be very promising.
Tong, Xiuli; Tong, Xiuhong; McBride-Chang, Catherine
2015-01-01
This study investigated the rate of school-aged Chinese-English language learners at risk for reading difficulties in either Chinese or English only, or both, among second and fifth graders in Hong Kong. In addition, we examined the metalinguistic skills that distinguished those who were poor in reading Chinese from those who were poor in reading English. The prevalence of poor English readers among children identified to be poor in Chinese word recognition across the five participating schools was approximately 42% at Grade 2 and 57% at Grade 5. Across grades, children who were poor readers of both languages tended to have difficulties in phonological and morphological awareness. Poor readers of English only were found to manifest significantly poorer phonological awareness, compared to those who were poor readers of Chinese only; their average tone awareness score was also lower relative to normally developing controls. Apart from indicating possible dissociations between Chinese first language (L1) word reading and English second language (L2) word reading, these findings suggested that the degree to which different metalinguistic skills are important for reading in different writing systems may depend on the linguistic features of the particular writing system. © Hammill Institute on Disabilities 2013.
Oryadi-Zanjani, Mohammad Majid; Vahab, Maryam; Bazrafkan, Mozhdeh; Haghjoo, Asghar
2015-12-01
The aim of this study was to examine the role of audiovisual speech recognition as a clinical criterion of cochlear implant or hearing aid efficiency in Persian-language children with severe-to-profound hearing loss. This research was administered as a cross-sectional study. The sample comprised 60 Persian children aged 5-7 years. The assessment tool was one of the subtests of the Persian version of the Test of Language Development-Primary 3. The study included two experiments: auditory-only and audiovisual presentation conditions. The test was a closed-set test of 30 words which were orally presented by a speech-language pathologist. The scores of audiovisual word perception were significantly higher than in the auditory-only condition in the children with normal hearing (P<0.01) and cochlear implants (P<0.05); however, in the children with hearing aids, there was no significant difference between word perception scores in the auditory-only and audiovisual presentation conditions (P>0.05). Audiovisual spoken word recognition can be applied as a clinical criterion to assess children with severe-to-profound hearing loss in order to find whether a cochlear implant (CI) or hearing aid (HA) has been efficient for them; i.e., if a child with hearing impairment who uses a CI or HA can obtain higher scores in audiovisual spoken word recognition than in the auditory-only condition, his/her auditory skills have appropriately developed due to an effective CI or HA as one of the main factors of auditory habilitation. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Goswami, Usha; Cumming, Ruth; Chait, Maria; Huss, Martina; Mead, Natasha; Wilson, Angela M.; Barnes, Lisa; Fosker, Tim
2016-01-01
Here we use two filtered speech tasks to investigate children’s processing of slow (<4 Hz) versus faster (∼33 Hz) temporal modulations in speech. We compare groups of children with either developmental dyslexia (Experiment 1) or speech and language impairments (SLIs, Experiment 2) to groups of typically-developing (TD) children age-matched to each disorder group. Ten nursery rhymes were filtered so that their modulation frequencies were either low-pass filtered (<4 Hz) or band-pass filtered (22 – 40 Hz). Recognition of the filtered nursery rhymes was tested in a picture recognition multiple choice paradigm. Children with dyslexia aged 10 years showed equivalent recognition overall to TD controls for both the low-pass and band-pass filtered stimuli, but showed significantly impaired acoustic learning during the experiment from low-pass filtered targets. Children with oral SLIs aged 9 years showed significantly poorer recognition of band pass filtered targets compared to their TD controls, and showed comparable acoustic learning effects to TD children during the experiment. The SLI samples were also divided into children with and without phonological difficulties. The children with both SLI and phonological difficulties were impaired in recognizing both kinds of filtered speech. These data are suggestive of impaired temporal sampling of the speech signal at different modulation rates by children with different kinds of developmental language disorder. Both SLI and dyslexic samples showed impaired discrimination of amplitude rise times. Implications of these findings for a temporal sampling framework for understanding developmental language disorders are discussed. PMID:27303348
Kronenberger, William G.; Castellanos, Irina; Pisoni, David B.
2017-01-01
Purpose We sought to determine whether speech perception and language skills measured early after cochlear implantation in children who are deaf, and early postimplant growth in speech perception and language skills, predict long-term speech perception, language, and neurocognitive outcomes. Method Thirty-six long-term users of cochlear implants, implanted at an average age of 3.4 years, completed measures of speech perception, language, and executive functioning an average of 14.4 years postimplantation. Speech perception and language skills measured in the 1st and 2nd years postimplantation and open-set word recognition measured in the 3rd and 4th years postimplantation were obtained from a research database in order to assess predictive relations with long-term outcomes. Results Speech perception and language skills at 6 and 18 months postimplantation were correlated with long-term outcomes for language, verbal working memory, and parent-reported executive functioning. Open-set word recognition was correlated with early speech perception and language skills and long-term speech perception and language outcomes. Hierarchical regressions showed that early speech perception and language skills at 6 months postimplantation and growth in these skills from 6 to 18 months both accounted for substantial variance in long-term outcomes for language and verbal working memory that was not explained by conventional demographic and hearing factors. Conclusion Speech perception and language skills measured very early postimplantation, and early postimplant growth in speech perception and language, may be clinically relevant markers of long-term language and neurocognitive outcomes in users of cochlear implants. Supplemental materials https://doi.org/10.23641/asha.5216200 PMID:28724130
[Vocal recognition in dental and oral radiology].
La Fianza, A; Giorgetti, S; Marelli, P; Campani, R
1993-10-01
Speech reporting benefits from units that can recognize sentences in any natural language in real time. The use of this method in the everyday practice of radiology departments shows its possible fields of application. We used the speech recognition method to report orthopantomographic exams in order to evaluate the advantages the method offers for the management and quality of reporting exams that are difficult to fit into other closed computerized reporting systems. Both speech recognition and the conventional reporting method (tape recording and typewriting) were used to report 760 orthopantomographs. The average time needed to make the report, the legibility (or Flesch) index, as adapted for the Italian language, and finally a clinical index (the subjective opinion of 4 odontostomatologists) were evaluated for each exam, with both techniques. Moreover, errors in speech reporting (crude, human and overall errors) were also evaluated. The advantages of speech reporting consisted in the shorter time needed for the report to become available (2.24 vs 2.99 minutes) (p < 0.0005), in the improved Flesch index (30.62 vs 28.9) and in the clinical index. The data obtained from speech reporting in odontostomatologic radiology were useful not only to reduce the mean reporting time of orthopantomographic exams but also to improve report quality by reducing both grammar and transmission mistakes. However, the basic condition for obtaining such results is the speaker's skill in dictating a good report.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, H; Tan, J; Kavanaugh, J
Purpose: Radiotherapy (RT) contours delineated either manually or semiautomatically require verification before clinical usage. Manual evaluation is very time consuming. A new integrated software tool using supervised pattern contour recognition was thus developed to facilitate this process. Methods: The contouring tool was developed using the object-oriented programming language C# and application programming interfaces, e.g. the visualization toolkit (VTK). The C# language served as the tool design basis. The Accord.Net scientific computing libraries were utilized for the required statistical data processing and pattern recognition, while the VTK was used to build and render 3-D mesh models from critical RT structures in real-time and 360° visualization. Principal component analysis (PCA) was used for system self-updating geometry variations of normal structures based on physician-approved RT contours as a training dataset. The in-house design of a supervised PCA-based contour recognition method was used for automatically evaluating contour normality/abnormality. The function for reporting the contour evaluation results was implemented by using C# and Windows Form Designer. Results: The software input was RT simulation images and RT structures from commercial clinical treatment planning systems. Several abilities were demonstrated: automatic assessment of RT contours, file loading/saving of various modality medical images and RT contours, and generation/visualization of 3-D images and anatomical models. Moreover, it supported the 360° rendering of the RT structures in a multi-slice view, which allows physicians to visually check and edit abnormally contoured structures. Conclusion: This new software integrates the supervised learning framework with image processing and graphical visualization modules for RT contour verification.
This tool has great potential for facilitating treatment planning with the assistance of an automatic contour evaluation module in avoiding unnecessary manual verification for physicians/dosimetrists. In addition, its nature as a compact and stand-alone tool allows for future extensibility to include additional functions for physicians' clinical needs.
Improved Open-Microphone Speech Recognition
NASA Astrophysics Data System (ADS)
Abrash, Victor
2002-12-01
Many current and future NASA missions make extreme demands on mission personnel both in terms of work load and in performing under difficult environmental conditions. In situations where hands are impeded or needed for other tasks, eyes are busy attending to the environment, or tasks are sufficiently complex that ease of use of the interface becomes critical, spoken natural language dialog systems offer unique input and output modalities that can improve efficiency and safety. They also offer new capabilities that would not otherwise be available. For example, many NASA applications require astronauts to use computers in micro-gravity or while wearing space suits. Under these circumstances, command and control systems that allow users to issue commands or enter data in hands-and eyes-busy situations become critical. Speech recognition technology designed for current commercial applications limits the performance of the open-ended state-of-the-art dialog systems being developed at NASA. For example, today's recognition systems typically listen to user input only during short segments of the dialog, and user input outside of these short time windows is lost. Mistakes detecting the start and end times of user utterances can lead to mistakes in the recognition output, and the dialog system as a whole has no way to recover from this, or any other, recognition error. Systems also often require the user to signal when that user is going to speak, which is impractical in a hands-free environment, or only allow a system-initiated dialog requiring the user to speak immediately following a system prompt. In this project, SRI has developed software to enable speech recognition in a hands-free, open-microphone environment, eliminating the need for a push-to-talk button or other signaling mechanism. The software continuously captures a user's speech and makes it available to one or more recognizers. 
By constantly monitoring and storing the audio stream, it provides the spoken dialog manager extra flexibility to recognize the signal with no audio gaps between recognition requests, as well as to rerecognize portions of the signal, or to rerecognize speech with different grammars, acoustic models, recognizers, start times, and so on. SRI expects that this new open-mic functionality will enable NASA to develop better error-correction mechanisms for spoken dialog systems, and may also enable new interaction strategies.
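The continuous-capture idea described above can be illustrated with a minimal Python sketch. It is not SRI's software: the class `AudioHistory`, the function `rerecognize`, the frame timing, and the two stand-in recognizers are all hypothetical names chosen for illustration. The sketch only shows how a bounded history of audio frames lets a dialog manager hand the same stored segment to several recognizers, or request it again with different start and end times.

```python
from collections import deque


class AudioHistory:
    """Keep the most recent audio frames so any stored time span can be replayed."""

    def __init__(self, max_frames):
        # Each entry is (start_time_in_seconds, frame_payload).
        self._frames = deque(maxlen=max_frames)

    def capture(self, start_time, frame):
        """Store one frame; the oldest frame is dropped once the buffer is full."""
        self._frames.append((start_time, frame))

    def segment(self, begin, end):
        """Return the frames whose start time lies in [begin, end)."""
        return [frame for t, frame in self._frames if begin <= t < end]


def rerecognize(history, begin, end, recognizers):
    """Run every recognizer callable over the same stored segment."""
    frames = history.segment(begin, end)
    return {name: fn(frames) for name, fn in recognizers.items()}


# Hypothetical stream: one "frame" every 0.5 s, represented here by a word.
history = AudioHistory(max_frames=5)
for i, word in enumerate(["go", "to", "node", "two", "now", "please"]):
    history.capture(start_time=i * 0.5, frame=word)

# Two stand-in recognizers (assumptions, not real recognition engines).
results = rerecognize(
    history,
    begin=1.0,
    end=2.5,
    recognizers={
        "joined": lambda frames: " ".join(frames),
        "count": lambda frames: len(frames),
    },
)
```

Because the history is bounded, the earliest frame has already been discarded by the time the sixth frame arrives; a real system would size the buffer to the longest span it might need to revisit.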
Wu, Che-Ming; Chen, Yen-An; Chan, Kai-Chieh; Lee, Li-Ang; Hsu, Kuang-Hung; Lin, Bao-Guey; Liu, Tien-Chen
2011-01-01
The aim of this study was to document receptive and expressive language levels and reading skills achieved by Mandarin-speaking children who had received cochlear implants (CIs) and used them for 4.75-7.42 years. The effects of possible associated factors were also analyzed. Standardized Mandarin language and reading tests were administered to 39 prelingually deaf children with Nucleus 24 devices. The Mandarin Chinese version of the Peabody Picture Vocabulary Test was used to assess their receptive vocabulary knowledge and the Revised Primary School Language Assessment Test for their receptive and expressive language skills. The Graded Chinese Character Recognition Test was used to test their written word recognition ability and the Reading Comprehension Test for their reading comprehension ability. Raw scores from both language and reading measurements were compared to normative data of normal-hearing children to obtain standard scores. The results showed that the mean standard score for receptive vocabulary measurement and the mean T scores for the receptive language, expressive language and total language measurement were all in the low-average range in comparison to the normative sample. In contrast, the mean T scores for word and text reading comprehension were almost the same as for their age-matched hearing counterparts. Among all children with CIs, 75.7% scored within or above the normal range of their age-matched hearing peers on receptive vocabulary measurement. For total language, Chinese word recognition and reading scores, 71.8, 77 and 82% of children with CIs were age appropriate, respectively. A strong correlation was found between language and reading skills. Age at implantation and sentence perception scores account for 37% of variance for total language outcome. Sentence perception scores and preimplantation residual hearing were revealed to be associated with the outcome of reading comprehension.
We concluded that by using standard tests, the language development and reading skill of Mandarin-speaking children who use CIs from a young age appear to fall within the normal range of their hearing age mates, at least after 4.8-7.4 years of experience. However, to fully evaluate the fine linguistic skills of these subjects, a more detailed study and longer follow-up period are needed. Copyright © 2010 S. Karger AG, Basel.
Speech Recognition for A Digital Video Library.
ERIC Educational Resources Information Center
Witbrock, Michael J.; Hauptmann, Alexander G.
1998-01-01
Production of the meta-data supporting the Informedia Digital Video Library interface is automated using techniques derived from artificial intelligence research. Speech recognition and natural-language processing, information retrieval, and image analysis are applied to produce an interface that helps users locate information and navigate more…
Willits, Jon A.; Seidenberg, Mark S.; Saffran, Jenny R.
2014-01-01
What makes some words easy for infants to recognize, and other words difficult? We addressed this issue in the context of prior results suggesting that infants have difficulty recognizing verbs relative to nouns. In this work, we highlight the role played by the distributional contexts in which nouns and verbs occur. Distributional statistics predict that English nouns should generally be easier to recognize than verbs in fluent speech. However, there are situations in which distributional statistics provide similar support for verbs. The statistics for verbs that occur with the English morpheme –ing, for example, should facilitate verb recognition. In two experiments with 7.5- and 9.5-month-old infants, we tested the importance of distributional statistics for word recognition by varying the frequency of the contextual frames in which verbs occur. The results support the conclusion that distributional statistics are utilized by infant language learners and contribute to noun–verb differences in word recognition. PMID:24908342
Automated Assessment of Child Vocalization Development Using LENA.
Richards, Jeffrey A; Xu, Dongxin; Gilkerson, Jill; Yapanel, Umit; Gray, Sharmistha; Paul, Terrance
2017-07-12
To produce a novel, efficient measure of children's expressive vocal development on the basis of automatic vocalization assessment (AVA), child vocalizations were automatically identified and extracted from audio recordings using Language Environment Analysis (LENA) System technology. Assessment was based on full-day audio recordings collected in a child's unrestricted, natural language environment. AVA estimates were derived using automatic speech recognition modeling techniques to categorize and quantify the sounds in child vocalizations (e.g., protophones and phonemes). These were expressed as phone and biphone frequencies, reduced to principal components, and inputted to age-based multiple linear regression models to predict independently collected criterion-expressive language scores. From these models, we generated vocal development AVA estimates as age-standardized scores and development age estimates. AVA estimates demonstrated strong statistical reliability and validity when compared with standard criterion expressive language assessments. Automated analysis of child vocalizations extracted from full-day recordings in natural settings offers a novel and efficient means to assess children's expressive vocal development. More research remains to identify specific mechanisms of operation.
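As a rough illustration of the "phone and biphone frequencies" step mentioned above, the following Python sketch counts single phones and adjacent phone pairs in a labelled sequence and converts the counts to relative frequencies. It is not the LENA or AVA implementation: the function name `phone_biphone_frequencies` and the toy phone sequence are assumptions, and the later stages (principal components, age-based regression) are not shown.

```python
from collections import Counter


def phone_biphone_frequencies(phones):
    """Return relative frequencies of single phones and adjacent phone pairs."""
    unigrams = Counter(phones)
    bigrams = Counter(zip(phones, phones[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    uni_freq = {p: c / n_uni for p, c in unigrams.items()}
    # A sequence with fewer than two phones has no biphones.
    bi_freq = {pair: c / n_bi for pair, c in bigrams.items()} if n_bi else {}
    return uni_freq, bi_freq


# Hypothetical phone labels for one vocalization.
phones = ["b", "a", "b", "a", "d", "a"]
uni, bi = phone_biphone_frequencies(phones)
```

In this simplified version biphones are counted within a single sequence; a fuller treatment would presumably avoid pairing phones across separate vocalizations.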
Integration of Speech and Natural Language
1988-04-01
major activities: • Development of the syntax and semantics components for natural language processing. • Integration of the developed syntax and...evaluating the performance of speech recognition algorithms developed under the Strategic Computing Program. Our work on natural language processing...included the development of a grammar (syntax) that uses the Unification grammar formalism (an augmented context free formalism). The Unification
ERIC Educational Resources Information Center
Marchman, Virginia A.; Fernald, Anne
2008-01-01
The nature of predictive relations between early language and later cognitive function is a fundamental question in research on human cognition. In a longitudinal study assessing speed of language processing in infancy, Fernald, Perfors and Marchman (2006) found that reaction time at 25 months was strongly related to lexical and grammatical…
Auditory Word Recognition of Nouns and Verbs in Children with Specific Language Impairment (SLI)
ERIC Educational Resources Information Center
Andreu, Llorenc; Sanz-Torrent, Monica; Guardia-Olmos, Joan
2012-01-01
Nouns are fundamentally different from verbs semantically and syntactically, since verbs can specify one, two, or three nominal arguments. In this study, 25 children with Specific Language Impairment (age 5;3-8;2 years) and 50 typically developing children (3;3-8;2 years) participated in an eye-tracking experiment of spoken language comprehension…
ERIC Educational Resources Information Center
Hunter, Zoe R.; Brysbaert, Marc
2008-01-01
Traditional neuropsychology employs visual half-field (VHF) experiments to assess cerebral language dominance. This approach is based on the assumption that left cerebral dominance for language leads to faster and more accurate recognition of words in the right visual half-field (RVF) than in the left visual half-field (LVF) during tachistoscopic…
Masked Translation Priming Effects in Visual Word Recognition by Trilinguals
ERIC Educational Resources Information Center
Aparicio, Xavier; Lavaur, Jean-Marc
2016-01-01
The present study aims to investigate how trilinguals process their two non-dominant languages and how those languages influence one another, as well as the relative importance of the dominant language on their processing. With this in mind, 24 French (L1)- English (L2)- and Spanish (L3)-unbalanced trilinguals, deemed equivalent in their L2 and L3…
ERIC Educational Resources Information Center
Taha, Haitham
2017-01-01
The current research examined how Arabic diglossia affects verbal learning memory. Thirty native Arab college students were tested using auditory verbal memory test that was adapted according to the Rey Auditory Verbal Learning Test and developed in three versions: Pure spoken language version (SL), pure standard language version (SA), and…
Perceptual Decoding Processes for Language in a Visual Mode and for Language in an Auditory Mode.
ERIC Educational Resources Information Center
Myerson, Rosemarie Farkas
The purpose of this paper is to gain insight into the nature of the reading process through an understanding of the general nature of sensory processing mechanisms which reorganize and restructure input signals for central recognition, and an understanding of how the grammar of the language functions in defining the set of possible sentences in…
ANTLR Tree Grammar Generator and Extensions
NASA Technical Reports Server (NTRS)
Craymer, Loring
2005-01-01
A computer program implements two extensions of ANTLR (Another Tool for Language Recognition), which is a set of software tools for translating source code between different computing languages. ANTLR supports predicated-LL(k) lexer and parser grammars, a notation for annotating parser grammars to direct tree construction, and predicated tree grammars. [LL(k) signifies left-right, leftmost derivation with k tokens of look-ahead, referring to certain characteristics of a grammar.] One of the extensions is a syntax for tree transformations. The other extension is the generation of tree grammars from annotated parser or input tree grammars. These extensions can simplify the process of generating source-to-source language translators and they make possible an approach, called "polyphase parsing," to translation between computing languages. The typical approach to translator development is to identify high-level semantic constructs such as "expressions," "declarations," and "definitions" as fundamental building blocks in the grammar specification used for language recognition. The polyphase approach is to lump ambiguous syntactic constructs during parsing and then disambiguate the alternatives in subsequent tree transformation passes. Polyphase parsing is believed to be useful for generating efficient recognizers for C++ and other languages that, like C++, have significant ambiguities.
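The polyphase idea, lumping an ambiguous construct during parsing and resolving it in a later tree pass, can be sketched in a few lines of Python. This is not ANTLR grammar syntax and not the program described above; the node labels (`AMBIG_STAR`, `POINTER_DECL`, `MULTIPLY`) and both function names are hypothetical. The example uses the familiar C++-style ambiguity in which `T * x;` is a pointer declaration if `T` names a type and a multiplication otherwise.

```python
def parse_statement(tokens):
    """Parse `NAME * NAME ;` without deciding what it means yet."""
    if len(tokens) != 4 or tokens[1] != "*" or tokens[3] != ";":
        raise SyntaxError(f"unexpected tokens: {tokens}")
    left, _, right, _ = tokens
    # Phase 1: lump both readings under one node instead of guessing.
    return ("AMBIG_STAR", left, right)


def disambiguate(tree, type_names):
    """Phase 2 tree pass: pick a reading once the type names are known."""
    kind, left, right = tree
    if kind != "AMBIG_STAR":
        return tree
    if left in type_names:
        return ("POINTER_DECL", left, right)  # `T * x;` declares x
    return ("MULTIPLY", left, right)          # `a * b;` multiplies


# Hypothetical symbol information gathered elsewhere in the translator.
type_names = {"Widget"}
decl = disambiguate(parse_statement(["Widget", "*", "w", ";"]), type_names)
expr = disambiguate(parse_statement(["a", "*", "b", ";"]), type_names)
```

Keeping the parser ignorant of type names is the design point here: the grammar stays simple, and the semantic decision moves into a separate pass that can be rerun as more information becomes available.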
Research in speech communication.
Flanagan, J
1995-10-24
Advances in digital speech processing are now supporting application and deployment of a variety of speech technologies for human/machine communication. In fact, new businesses are rapidly forming about these technologies. But these capabilities are of little use unless society can afford them. Happily, explosive advances in microelectronics over the past two decades have assured affordable access to this sophistication as well as to the underlying computing technology. The research challenges in speech processing remain in the traditionally identified areas of recognition, synthesis, and coding. These three areas have typically been addressed individually, often with significant isolation among the efforts. But they are all facets of the same fundamental issue--how to represent and quantify the information in the speech signal. This implies deeper understanding of the physics of speech production, the constraints that the conventions of language impose, and the mechanism for information processing in the auditory system. In ongoing research, therefore, we seek more accurate models of speech generation, better computational formulations of language, and realistic perceptual guides for speech processing--along with ways to coalesce the fundamental issues of recognition, synthesis, and coding. Successful solution will yield the long-sought dictation machine, high-quality synthesis from text, and the ultimate in low bit-rate transmission of speech. It will also open the door to language-translating telephony, where the synthetic foreign translation can be in the voice of the originating talker.
Parametric Representation of the Speaker's Lips for Multimodal Sign Language and Speech Recognition
NASA Astrophysics Data System (ADS)
Ryumin, D.; Karpov, A. A.
2017-05-01
In this article, we propose a new method for parametric representation of the human lip region. The functional diagram of the method is described, and implementation details are given with an explanation of its key stages and features. The results of automatic detection of the regions of interest are illustrated. The processing speed of the method on several computers with different performance levels is reported. This universal method allows the parametric representation of the speaker's lips to be applied to tasks in biometrics, computer vision, machine learning, and automatic recognition of faces, elements of sign languages, and audio-visual speech, including lip-reading.
Nittrouer, Susan; Caldwell-Tarr, Amanda; Tarr, Eric; Lowenstein, Joanna H.; Rice, Caitlin; Moberly, Aaron C.
2014-01-01
Objective: This study examined speech recognition in noise for children with hearing loss, compared it to recognition for children with normal hearing, and examined mechanisms that might explain variance in children’s abilities to recognize speech in noise. Design: Word recognition was measured in two levels of noise, both when the speech and noise were co-located in front and when the noise came separately from one side. Four mechanisms were examined as factors possibly explaining variance: vocabulary knowledge, sensitivity to phonological structure, binaural summation, and head shadow. Study sample: Participants were 113 eight-year-old children. Forty-eight had normal hearing (NH) and 65 had hearing loss: 18 with hearing aids (HAs), 19 with one cochlear implant (CI), and 28 with two CIs. Results: Phonological sensitivity explained a significant amount of between-groups variance in speech-in-noise recognition. Little evidence of binaural summation was found. Head shadow was similar in magnitude for children with NH and with CIs, regardless of whether they wore one or two CIs. Children with HAs showed reduced head shadow effects. Conclusion: These outcomes suggest that in order to improve speech-in-noise recognition for children with hearing loss, intervention needs to be comprehensive, focusing on both language abilities and auditory mechanisms. PMID:23834373
Spoken word recognition by Latino children learning Spanish as their first language*
HURTADO, NEREYDA; MARCHMAN, VIRGINIA A.; FERNALD, ANNE
2010-01-01
Research on the development of efficiency in spoken language understanding has focused largely on middle-class children learning English. Here we extend this research to Spanish-learning children (n=49; M=2;0; range=1;3–3;1) living in the USA in Latino families from primarily low socioeconomic backgrounds. Children looked at pictures of familiar objects while listening to speech naming one of the objects. Analyses of eye movements revealed developmental increases in the efficiency of speech processing. Older children and children with larger vocabularies were more efficient at processing spoken language as it unfolds in real time, as previously documented with English learners. Children whose mothers had less education tended to be slower and less accurate than children of comparable age and vocabulary size whose mothers had more schooling, consistent with previous findings of slower rates of language learning in children from disadvantaged backgrounds. These results add to the cross-linguistic literature on the development of spoken word recognition and to the study of the impact of socioeconomic status (SES) factors on early language development. PMID:17542157
Multilingual education for European minority languages: The Basque Country and Friesland
NASA Astrophysics Data System (ADS)
Gorter, Durk; Cenoz, Jasone
2011-12-01
Over the last three decades, regional minority languages in Europe have regained increased recognition and support. Their revitalisation is partly due to their being taught in schools. Multilingualism has special characteristics for speakers of minority languages and it poses unique challenges for learning minority languages. This article looks at the cases of Basque and Frisian, comparing and contrasting their similarities and differences. The educational system in the Basque Autonomous Community underwent an important transformation, starting in 1979 from a situation where less than 5 per cent of all teachers were capable of teaching through Basque. Today this figure has changed to more than 80 per cent. An innovative approach was chosen for teaching the minority language, Basque, alongside the dominant language, Spanish, and the international language, English. The outcome is a substantial increase in the proficiency in the minority language among the younger age groups. The decline of the minority language has thus been successfully reversed and one of the major challenges now is to uphold a sustainable educational system. By contrast, the Frisian language has fared less well in the Netherlands, where developments over the last 30 years have been much slower and the results more modest. Here policy-making for education and for language is caught in a continuous debate between a weak provincial level and a powerful central state level. Overall, multilingualism as a resource for individuals is valued for "bigger" languages such as English, French and German, but not for a "small" language such as Frisian. Nevertheless, a few trilingual experiments have been carried out in some schools in Friesland in teaching Frisian, Dutch and English. These experiments may also be instructive for other cases of minority languages of a "moderate strength". In the cases of both Basque and Frisian multilingualism is generally perceived as an important resource.
Phonological Activation in Multi-Syllabic Word Recognition
ERIC Educational Resources Information Center
Lee, Chang H.
2007-01-01
Three experiments were conducted to test the phonological recoding hypothesis in visual word recognition. Most studies on this issue have been conducted using mono-syllabic words, eventually constructing various models of phonological processing. Yet in many languages including English, the majority of words are multi-syllabic words. English…
Lieberman, Amy M.; Borovsky, Arielle; Hatrak, Marla; Mayberry, Rachel I.
2014-01-01
Sign language comprehension requires visual attention to the linguistic signal and visual attention to referents in the surrounding world, whereas these processes are divided between the auditory and visual modalities for spoken language comprehension. Additionally, the age-onset of first language acquisition and the quality and quantity of linguistic input for deaf individuals are highly heterogeneous, which is rarely the case for hearing learners of spoken languages. Little is known about how these modality and developmental factors affect real-time lexical processing. In this study, we ask how these factors impact real-time recognition of American Sign Language (ASL) signs using a novel adaptation of the visual world paradigm in deaf adults who learned sign from birth (Experiment 1), and in deaf individuals who were late-learners of ASL (Experiment 2). Results revealed that although both groups of signers demonstrated rapid, incremental processing of ASL signs, only native-signers demonstrated early and robust activation of sub-lexical features of signs during real-time recognition. Our findings suggest that the organization of the mental lexicon into units of both form and meaning is a product of infant language learning and not the sensory and motor modality through which the linguistic signal is sent and received. PMID:25528091
Self-organized Evaluation of Dynamic Hand Gestures for Sign Language Recognition
NASA Astrophysics Data System (ADS)
Buciu, Ioan; Pitas, Ioannis
Two main theories exist with respect to face encoding and representation in the human visual system (HVS). The first one refers to the dense (holistic) representation of the face, where faces have a "holon"-like appearance. The second one claims that a more appropriate face representation is given by a sparse code, where only a small fraction of the neural cells corresponding to face encoding is activated. Theoretical and experimental evidence suggests that the HVS performs face analysis (encoding, storing, face recognition, facial expression recognition) in a structured and hierarchical way, where both representations have their own contribution and goal. According to neuropsychological experiments, it seems that encoding for face recognition relies on holistic image representation, while a sparse image representation is used for facial expression analysis and classification. From the computer vision perspective, the techniques developed for automatic face and facial expression recognition fall into the same two representation types. As in neuroscience, the techniques which perform better for face recognition yield a holistic image representation, while those techniques suitable for facial expression recognition use a sparse or local image representation. The proposed mathematical models of image formation and encoding try to simulate the efficient storing, organization and coding of data in the human cortex. This is equivalent to embedding constraints in the model design regarding dimensionality reduction, redundant information minimization, mutual information minimization, non-negativity constraints, class information, etc. The presented techniques are applied as a feature extraction step followed by a classification method, which also heavily influences the recognition results.
Muecas: A Multi-Sensor Robotic Head for Affective Human Robot Interaction and Imitation
Cid, Felipe; Moreno, Jose; Bustos, Pablo; Núñez, Pedro
2014-01-01
This paper presents a multi-sensor humanoid robotic head for human robot interaction. The design of the robotic head, Muecas, is based on ongoing research on the mechanisms of perception and imitation of human expressions and emotions. These mechanisms allow direct interaction between the robot and its human companion through the different natural language modalities: speech, body language and facial expressions. The robotic head has 12 degrees of freedom, in a human-like configuration, including eyes, eyebrows, mouth and neck, and has been designed and built entirely by IADeX (Engineering, Automation and Design of Extremadura) and RoboLab. A detailed description of its kinematics is provided along with the design of the most complex controllers. Muecas can be directly controlled by FACS (Facial Action Coding System), the de facto standard for facial expression recognition and synthesis. This feature facilitates its use by third party platforms and encourages the development of imitation and of goal-based systems. Imitation systems learn from the user, while goal-based ones use planning techniques to drive the user towards a final desired state. To show the flexibility and reliability of the robotic head, the paper presents a software architecture that is able to detect, recognize, classify and generate facial expressions in real time using FACS. This system has been implemented using the robotics framework, RoboComp, which provides hardware-independent access to the sensors in the head. Finally, the paper presents experimental results showing the real-time functioning of the whole system, including recognition and imitation of human facial expressions. PMID:24787636
Processing voiceless vowels in Japanese: Effects of language-specific phonological knowledge
NASA Astrophysics Data System (ADS)
Ogasawara, Naomi
2005-04-01
There has been little research on processing allophonic variation in the field of psycholinguistics. This study focuses on processing the voiced/voiceless allophonic alternation of high vowels in Japanese. Three perception experiments were conducted to explore how listeners parse out vowels with the voicing alternation from other segments in the speech stream and how the different voicing statuses of the vowel affect listeners' word recognition process. The results from the three experiments show that listeners use phonological knowledge of their native language for phoneme processing and for word recognition. However, the interactions of the phonological and acoustic effects differ between the two processes. The facilitatory phonological effect and the inhibitory acoustic effect cancel each other out in phoneme processing, whereas in word recognition the facilitatory phonological effect overrides the inhibitory acoustic effect.
NASA Technical Reports Server (NTRS)
2004-01-01
I/NET, Inc., is making the dream of natural human-computer conversation a practical reality. Through a combination of advanced artificial intelligence research and practical software design, I/NET has taken the complexity out of developing advanced, natural language interfaces. Conversational capabilities like pronoun resolution, anaphora and ellipsis processing, and dialog management that were once available only in the laboratory can now be brought to any application with any speech recognition system using I/NET's conversational engine middleware.
ERIC Educational Resources Information Center
Treville, Marie-Claude
This study investigated the effects of systematic use of similarities between the native and second languages on the lexical competence of second language learners. Subjects were 209 first- and second-year English-speaking university students in French language classes. The students were pre- and post-tested for their visual recognition of…
Quantify spatial relations to discover handwritten graphical symbols
NASA Astrophysics Data System (ADS)
Li, Jinpeng; Mouchère, Harold; Viard-Gaudin, Christian
2012-01-01
To model a handwritten graphical language, spatial relations describe how the strokes are positioned in 2-dimensional space. Most existing handwriting recognition systems make use of some predefined spatial relations. However, for a complex graphical language, it is hard to express all the spatial relations manually. Another possibility is to use a clustering technique to discover the spatial relations. In this paper, we discuss how to create a relational graph between strokes (nodes) labeled with graphemes in a graphical language. Then we vectorize spatial relations (edges) for clustering and quantization. As the targeted application, we extract the repetitive sub-graphs (graphical symbols) composed of graphemes and learned spatial relations. On two handwriting databases, a simple mathematical expression database and a complex flowchart database, the unsupervised spatial relations outperform the predefined spatial relations. In addition, we visualize the frequent patterns on two text-lines containing Chinese characters.
Weiland, Christina; Yoshikawa, Hirokazu
2013-01-01
Publicly funded prekindergarten programs have achieved small-to-large impacts on children's cognitive outcomes. The current study examined the impact of a prekindergarten program that implemented a coaching system and consistent literacy, language, and mathematics curricula on these and other nontargeted, essential components of school readiness, such as executive functioning. Participants included 2,018 four- and five-year-old children. Findings indicated that the program had moderate-to-large impacts on children's language, literacy, numeracy and mathematics skills, and small impacts on children's executive functioning and a measure of emotion recognition. Some impacts were considerably larger for some subgroups. For urban public school districts, results inform important programmatic decisions. For policy makers, results confirm that prekindergarten programs can improve educationally vital outcomes for children in meaningful, important ways. © 2013 The Authors. Child Development © 2013 Society for Research in Child Development, Inc.
NASA Astrophysics Data System (ADS)
Lahamy, H.; Lichti, D.
2012-07-01
The automatic interpretation of human gestures can be used for natural interaction with computers without the use of mechanical devices such as keyboards and mice. The recognition of hand postures has been studied for many years. However, most of the literature in this area has considered 2D images, which cannot provide a full description of hand gestures. In addition, rotation-invariant identification remains an unsolved problem even with the use of 2D images. The objective of the current study is to design a rotation-invariant recognition process while using a 3D signature for classifying hand postures. A heuristic and voxel-based signature has been designed and implemented. The tracking of the hand motion is achieved with the Kalman filter. A unique training image per posture is used in the supervised classification. The designed recognition process and the tracking procedure have been successfully evaluated. This study has demonstrated the efficiency of the proposed rotation-invariant 3D hand posture signature, which leads to a 98.24% recognition rate after testing 12723 samples of 12 gestures taken from the alphabet of the American Sign Language.
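The abstract above says only that hand motion is tracked with a Kalman filter and gives no state model. As a minimal sketch, assuming a random-walk model applied independently to each coordinate of a measured hand centroid, the predict/update cycle might look like the following. The class name `ScalarKalman`, the function `track`, and the noise variances `q` and `r` are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalman:
    """One-dimensional Kalman filter with a random-walk state model (assumed)."""
    x: float          # current state estimate (one coordinate)
    p: float          # variance of the estimate
    q: float = 1e-3   # process noise variance (hypothetical value)
    r: float = 1e-1   # measurement noise variance (hypothetical value)

    def step(self, z: float) -> float:
        # Predict: the state is carried over unchanged, uncertainty grows by q.
        self.p += self.q
        # Update: blend prediction and measurement using the Kalman gain.
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= 1.0 - k
        return self.x


def track(points, q=1e-3, r=1e-1):
    """Smooth a sequence of (x, y, z) centroid measurements, one filter per axis."""
    points = list(points)
    if not points:
        return []
    # Initialise each filter at the first measurement with a broad variance.
    filters = [ScalarKalman(x=c, p=1.0, q=q, r=r) for c in points[0]]
    smoothed = [tuple(points[0])]
    for pt in points[1:]:
        smoothed.append(tuple(f.step(c) for f, c in zip(filters, pt)))
    return smoothed
```

A tracker for real hand motion would more plausibly include velocity in the state; the random-walk model is used here only to keep the predict and update steps visible.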
Speech Recognition Thresholds for Multilingual Populations.
ERIC Educational Resources Information Center
Ramkissoon, Ishara
2001-01-01
This article traces the development of speech audiometry in the United States and reports on the current status, focusing on the needs of a multilingual population in terms of measuring speech recognition threshold (SRT). It also discusses sociolinguistic considerations, alternative SRT stimuli for second language learners, and research on using…
Music Education Intervention Improves Vocal Emotion Recognition
ERIC Educational Resources Information Center
Mualem, Orit; Lavidor, Michal
2015-01-01
The current study is an interdisciplinary examination of the interplay among music, language, and emotions. It consisted of two experiments designed to investigate the relationship between musical abilities and vocal emotional recognition. In experiment 1 (N = 24), we compared the influence of two short-term intervention programs--music and…
Collaborative Efforts to Promote Emergent Literacy and Efficient Word Recognition Skills
ERIC Educational Resources Information Center
Roth, Froma P.; Troia, Gary A.
2006-01-01
In this article, 3 models of collaboration between speech-language pathologists and classroom teachers are discussed to promote emergent literacy and accurate and fluent word recognition. These models are demonstration lessons, team teaching, and consultation. A number of instructional principles are presented for emergent literacy and decoding…
How Cross-Language Similarity and Task Demands Affect Cognate Recognition
ERIC Educational Resources Information Center
Dijkstra, Ton; Miwa, Koji; Brummelhuis, Bianca; Sappelli, Maya; Baayen, Harald
2010-01-01
This study examines how the cross-linguistic similarity of translation equivalents affects bilingual word recognition. Performing one of three tasks, Dutch-English bilinguals processed cognates with varying degrees of form overlap between their English and Dutch counterparts (e.g., "lamp-lamp" vs. "flood-vloed" vs. "song-lied"). In lexical…
Language Education and Multilingualism in Colombia: Crossing the Divide
ERIC Educational Resources Information Center
de Mejía, Anne-Marie
2017-01-01
Despite Colombia's official recognition of its ethnic and cultural diversity, it has yet to develop in practice an inclusive educational vision involving the recognition of diversity, as well as promoting the country's insertion within the global market. Garcia et al. acknowledge the importance of "cultivating" students' diverse…
ERIC Educational Resources Information Center
Lehtonen, Minna; Hulten, Annika; Rodriguez-Fornells, Antoni; Cunillera, Toni; Tuomainen, Jyrki; Laine, Matti
2012-01-01
We investigated the behavioral and brain responses (ERPs) of bilingual word recognition to three fundamental psycholinguistic factors, frequency, morphology, and lexicality, in early bilinguals vs. monolinguals. Earlier behavioral studies have reported larger frequency effects in bilinguals' nondominant vs. dominant language and in some studies…
Cognitive Development and Reading Processes. Developmental Program Report Number 76.
ERIC Educational Resources Information Center
West, Richard F.
In discussing the relationship between cognitive development (perception, pattern recognition, and memory) and reading processes, this paper especially emphasizes developmental factors. After an overview of some issues that bear on how written language is processed, the paper presents a discussion of pattern recognition, including general pattern…
21 CFR 330.1 - General conditions for general recognition as safe, effective and not misbranded.
Code of Federal Regulations, 2012 CFR
2012-04-01
... exact language where exact language has been established and identified by quotation marks in an... “minimize”. (64) “Referred to as” or “of”. (65) “Sensation” or “feeling”. (66) “Solution” or “liquid”. (67...
21 CFR 330.1 - General conditions for general recognition as safe, effective and not misbranded.
Code of Federal Regulations, 2013 CFR
2013-04-01
... exact language where exact language has been established and identified by quotation marks in an... “minimize”. (64) “Referred to as” or “of”. (65) “Sensation” or “feeling”. (66) “Solution” or “liquid”. (67...
21 CFR 330.1 - General conditions for general recognition as safe, effective and not misbranded.
Code of Federal Regulations, 2014 CFR
2014-04-01
... exact language where exact language has been established and identified by quotation marks in an... “minimize”. (64) “Referred to as” or “of”. (65) “Sensation” or “feeling”. (66) “Solution” or “liquid”. (67...
Speech perception and spoken word recognition: past and present.
Jusczyk, Peter W; Luce, Paul A
2002-02-01
The scientific study of the perception of spoken language has been an exciting, prolific, and productive area of research for more than 50 years. We have learned much about infants' and adults' remarkable capacities for perceiving and understanding the sounds of their language, as evidenced by our increasingly sophisticated theories of acquisition, process, and representation. We present a selective but, we hope, representative review of the past half century of research on speech perception, paying particular attention to the historical and theoretical contexts within which this research was conducted. Our foci in this review fall on three principal topics: early work on the discrimination and categorization of speech sounds, more recent efforts to understand the processes and representations that subserve spoken word recognition, and research on how infants acquire the capacity to perceive their native language. Our intent is to provide the reader a sense of the progress our field has experienced over the last half century in understanding the human's extraordinary capacity for the perception of spoken language.
Second Language Ability and Emotional Prosody Perception
Bhatara, Anjali; Laukka, Petri; Boll-Avetisyan, Natalie; Granjon, Lionel; Anger Elfenbein, Hillary; Bänziger, Tanja
2016-01-01
The present study examines the effect of language experience on vocal emotion perception in a second language. Native speakers of French with varying levels of self-reported English ability were asked to identify emotions from vocal expressions produced by American actors in a forced-choice task, and to rate their pleasantness, power, alertness and intensity on continuous scales. Stimuli included emotionally expressive English speech (emotional prosody) and non-linguistic vocalizations (affect bursts), and a baseline condition with Swiss-French pseudo-speech. Results revealed effects of English ability on the recognition of emotions in English speech but not in non-linguistic vocalizations. Specifically, higher English ability was associated with less accurate identification of positive emotions, but not with the interpretation of negative emotions. Moreover, higher English ability was associated with lower ratings of pleasantness and power, again only for emotional prosody. This suggests that second language skills may sometimes interfere with emotion recognition from speech prosody, particularly for positive emotions. PMID:27253326
Hidden Markov models in automatic speech recognition
NASA Astrophysics Data System (ADS)
Wrzoskowicz, Adam
1993-11-01
This article describes a method for constructing an automatic speech recognition system based on hidden Markov models (HMMs). The author discusses the basic concepts of HMM theory and the application of these models to the analysis and recognition of speech signals. The author provides algorithms which make it possible to train the ASR system and recognize signals on the basis of distinct stochastic models of selected speech sound classes. The author describes the specific components of the system and the procedures used to model and recognize speech. The author discusses problems associated with the choice of optimal signal detection and parameterization characteristics and their effect on the performance of the system. The author presents different options for the choice of speech signal segments and their consequences for the ASR process. The author gives special attention to the use of lexical, syntactic, and semantic information for the purpose of improving the quality and efficiency of the system. The author also describes an ASR system developed by the Speech Acoustics Laboratory of the IBPT PAS. The author discusses the results of experiments on the effect of noise on the performance of the ASR system and describes methods of constructing HMMs designed to operate in a noisy environment. The author also describes a language for human-robot communications which was defined as a complex multilevel network from an HMM model of speech sounds geared towards Polish inflections. The author also added mandatory lexical and syntactic rules to the system for its communications vocabulary.
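The abstract above does not reproduce the article's algorithms. As a minimal sketch of how an HMM can be used to recognize a sequence, the standard Viterbi algorithm finds the most likely hidden-state path for a series of observations. The two sound classes, the three observation symbols, and all probabilities below are invented for illustration and are not taken from the article.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (most likely state path, its probability) for the observations."""
    # best[t][s]: probability of the best path that ends in state s at time t.
    best = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    # back[t][s]: predecessor of s on that best path.
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] * trans_p[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score * emit_p[s][obs[t]]
            back[t][s] = prev
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical toy model: two sound classes emitting a quantized energy level.
STATES = ("vowel", "consonant")
START = {"vowel": 0.6, "consonant": 0.4}
TRANS = {
    "vowel": {"vowel": 0.7, "consonant": 0.3},
    "consonant": {"vowel": 0.4, "consonant": 0.6},
}
EMIT = {
    "vowel": {"high": 0.5, "mid": 0.4, "low": 0.1},
    "consonant": {"high": 0.1, "mid": 0.3, "low": 0.6},
}

path, prob = viterbi(["high", "mid", "low"], STATES, START, TRANS, EMIT)
```

For long utterances the products of probabilities underflow, so a practical implementation would work with log probabilities instead.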
ERIC Educational Resources Information Center
Green, Patricia J.; Sha, Mandy; Liu, Lu
2011-01-01
In 2001, the U.S. Department of Education and the Ministry of Education in China entered into a bilateral partnership to develop a technology-driven approach to foreign language learning that integrated gaming, immersion, voice recognition, problem-based learning tasks, and other features that made it a significant research and development pilot…
ERIC Educational Resources Information Center
Campos, Ana Duarte; Mendes Oliveira, Helena; Soares, Ana Paula
2018-01-01
The role of syllables as a sublexical unit in visual word recognition and reading is well established in deep and shallow syllable-timed languages such as French and Spanish, respectively. However, its role in intermediate stress-timed languages remains unclear. This paper aims to overcome this gap by studying for the first time the role of…
Soysal, Ergin; Wang, Jingqi; Jiang, Min; Wu, Yonghui; Pakhomov, Serguei; Liu, Hongfang; Xu, Hua
2017-11-24
Existing general clinical natural language processing (NLP) systems such as MetaMap and Clinical Text Analysis and Knowledge Extraction System have been successfully applied to information extraction from clinical text. However, end users often have to customize existing systems for their individual tasks, which can require substantial NLP skills. Here we present CLAMP (Clinical Language Annotation, Modeling, and Processing), a newly developed clinical NLP toolkit that provides not only state-of-the-art NLP components, but also a user-friendly graphic user interface that can help users quickly build customized NLP pipelines for their individual applications. Our evaluation shows that the CLAMP default pipeline achieved good performance on named entity recognition and concept encoding. We also demonstrate the efficiency of the CLAMP graphic user interface in building customized, high-performance NLP pipelines with 2 use cases, extracting smoking status and lab test values. CLAMP is publicly available for research use, and we believe it is a unique asset for the clinical NLP community. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Implicit phonological priming during visual word recognition.
Wilson, Lisa B; Tregellas, Jason R; Slason, Erin; Pasko, Bryce E; Rojas, Donald C
2011-03-15
Phonology is a lower-level structural aspect of language involving the sounds of a language and their organization in that language. Numerous behavioral studies utilizing priming, which refers to an increased sensitivity to a stimulus following prior experience with that or a related stimulus, have provided evidence for the role of phonology in visual word recognition. However, most language studies utilizing priming in conjunction with functional magnetic resonance imaging (fMRI) have focused on lexical-semantic aspects of language processing. The aim of the present study was to investigate the neurobiological substrates of the automatic, implicit stages of phonological processing. While undergoing fMRI, eighteen individuals performed a lexical decision task (LDT) on prime-target pairs including word-word homophone and pseudoword-word pseudohomophone pairs with a prime presentation below perceptual threshold. Whole-brain analyses revealed several cortical regions exhibiting hemodynamic response suppression due to phonological priming including bilateral superior temporal gyri (STG), middle temporal gyri (MTG), and angular gyri (AG) with additional region of interest (ROI) analyses revealing response suppression in the left lateralized supramarginal gyrus (SMG). Homophone and pseudohomophone priming also resulted in different patterns of hemodynamic responses relative to one another. These results suggest that phonological processing plays a key role in visual word recognition. Furthermore, enhanced hemodynamic responses for unrelated stimuli relative to primed stimuli were observed in midline cortical regions corresponding to the default-mode network (DMN) suggesting that DMN activity can be modulated by task requirements within the context of an implicit task. Copyright © 2010 Elsevier Inc. All rights reserved.
Relationships among Constructs of L2 Chinese Reading and Language Background
ERIC Educational Resources Information Center
Hsu, Wei-Li
2016-01-01
Extensive research has been conducted on the relationships of Chinese-character recognition to reading development; strategic competence to reading comprehension; and home linguistic exposure to heritage language acquisition. However, studies of these relationships have been marked by widely divergent theoretical underpinnings, and their results…
Implementing ICAO Language Proficiency Requirements in the Versant Aviation English Test
ERIC Educational Resources Information Center
Van Moere, Alistair; Suzuki, Masanori; Downey, Ryan; Cheng, Jian
2009-01-01
This paper discusses the development of an assessment to satisfy the International Civil Aviation Organization (ICAO) Language Proficiency Requirements. The Versant Aviation English Test utilizes speech recognition technology and a computerized testing platform, such that test administration and scoring are fully automated. Developed in…
The Role of Phonetics in the Teaching of Foreign Languages in India
ERIC Educational Resources Information Center
Bansal, R. K.
1974-01-01
Oral work is considered the most effective way of laying the foundations for language proficiency. Recognition and production of vowels and consonants, use of a pronouncing dictionary, and practice in accent rhythm and intonation should all be included in a pronunciation course. (SC)
Evans, Julia L; Gillam, Ronald B; Montgomery, James W
2018-05-10
This study examined the influence of cognitive factors on spoken word recognition in children with developmental language disorder (DLD) and typically developing (TD) children. Participants included 234 children (aged 7;0-11;11 years;months), 117 with DLD and 117 TD children, propensity matched for age, gender, socioeconomic status, and maternal education. Children completed a series of standardized assessment measures, a forward gating task, a rapid automatic naming task, and a series of tasks designed to examine cognitive factors hypothesized to influence spoken word recognition including phonological working memory, updating, attention shifting, and interference inhibition. Spoken word recognition for both initial and final accept gate points did not differ for children with DLD and TD controls after controlling target word knowledge in both groups. The 2 groups also did not differ on measures of updating, attention switching, and interference inhibition. Despite the lack of difference on these measures, for children with DLD, attention shifting and interference inhibition were significant predictors of spoken word recognition, whereas updating and receptive vocabulary were significant predictors of speed of spoken word recognition for the children in the TD group. Contrary to expectations, after controlling for target word knowledge, spoken word recognition did not differ for children with DLD and TD controls; however, the cognitive processing factors that influenced children's ability to recognize the target word in a stream of speech differed qualitatively for children with and without DLD.
Heimann, Mikael; Strid, Karin; Smith, Lars; Tjus, Tomas; Ulvund, Stein Erik; Meltzoff, Andrew N.
2006-01-01
The relationship between recall memory, visual recognition memory, social communication, and the emergence of language skills was measured in a longitudinal study. Thirty typically developing Swedish children were tested at 6, 9 and 14 months. The results showed that, in combination, visual recognition memory at 6 months, deferred imitation at 9 months and turn-taking skills at 14 months could explain 41% of the variance in the infants’ production of communicative gestures as measured by a Swedish variant of the MacArthur Communicative Development Inventories (CDI). In this statistical model, deferred imitation stood out as the strongest predictor. PMID:16886041
Recognition of a person named entity from the text written in a natural language
NASA Astrophysics Data System (ADS)
Dolbin, A. V.; Rozaliev, V. L.; Orlova, Y. A.
2017-01-01
This work is devoted to the semantic analysis of texts written in a natural language. The main goal of the research was to compare latent Dirichlet allocation and latent semantic analysis for identifying elements of human appearance in the text. The completeness of information retrieval was chosen as the efficiency criterion for comparing the methods. However, a single method was insufficient to achieve high recognition rates. Thus, additional methods were used for finding references to the person in the text. All these methods are based on the created information model, which represents a person’s appearance.
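The abstract above does not define its "completeness of information retrieval" criterion. Assuming it corresponds to the usual recall measure (the share of relevant items that a method actually retrieves), a minimal sketch of the computation could be the following. The function name `recall` and the sample appearance descriptors are hypothetical.

```python
def recall(relevant, retrieved):
    """Fraction of the relevant items that appear among the retrieved items."""
    relevant, retrieved = set(relevant), set(retrieved)
    if not relevant:
        raise ValueError("recall is undefined when there are no relevant items")
    return len(relevant & retrieved) / len(relevant)


# Hypothetical example: appearance descriptors annotated by hand vs. found by a method.
annotated = {"tall", "blue eyes", "beard"}
found = {"tall", "beard", "red coat"}
score = recall(annotated, found)  # two of the three annotated descriptors were found
```

Under this assumption, two methods would be compared by computing the score for each on the same annotated texts.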
A predictive study of reading comprehension in third-grade Spanish students.
López-Escribano, Carmen; Elosúa de Juan, María Rosa; Gómez-Veiga, Isabel; García-Madruga, Juan Antonio
2013-01-01
The study of the contribution of language and cognitive skills to reading comprehension is an important goal of current reading research. However, reading comprehension is not easily assessed by a single instrument, as different comprehension tests vary in the type of tasks used and in the cognitive demands required. This study examines the contribution of basic language and cognitive skills (decoding, word recognition, reading speed, verbal and nonverbal intelligence and working memory) to reading comprehension, assessed by two tests utilizing various tasks that require different skill sets in third-grade Spanish-speaking students. Linguistic and cognitive abilities predicted reading comprehension. A measure of reading speed (the reading time of pseudo-words) was the best predictor of reading comprehension when assessed by the PROLEC-R test. However, measures of word recognition (the orthographic choice task) and verbal working memory were the best predictors of reading comprehension when assessed by means of the DARC test. These results show, on the one hand, that reading speed and word recognition are better predictors of Spanish language comprehension than reading accuracy. On the other hand, the reading comprehension test applied here serves as a critical variable when analyzing and interpreting results regarding this topic.
Recognition of oral spelling is diagnostic of the central reading processes.
Schubert, Teresa; McCloskey, Michael
2015-01-01
The task of recognition of oral spelling (stimulus: "C-A-T", response: "cat") is often administered to individuals with acquired written language disorders, yet there is no consensus about the underlying cognitive processes. We adjudicate between two existing hypotheses: Recognition of oral spelling uses central reading processes, or recognition of oral spelling uses central spelling processes in reverse. We tested the recognition of oral spelling and spelling to dictation abilities of a single individual with acquired dyslexia and dysgraphia. She was impaired relative to matched controls in spelling to dictation but unimpaired in recognition of oral spelling. Recognition of oral spelling for exception words (e.g., colonel) and pronounceable nonwords (e.g., larth) was intact. Our results were predicted by the hypothesis that recognition of oral spelling involves the central reading processes. We conclude that recognition of oral spelling is a useful tool for probing the integrity of the central reading processes.
Schädler, Marc René; Warzybok, Anna; Meyer, Bernd T.; Brand, Thomas
2016-01-01
To characterize the individual patient’s hearing impairment as obtained with the matrix sentence recognition test, a simulation Framework for Auditory Discrimination Experiments (FADE) is extended here using the Attenuation and Distortion (A+D) approach by Plomp as a blueprint for setting the individual processing parameters. FADE has been shown to predict the outcome of both speech recognition tests and psychoacoustic experiments based on simulations using an automatic speech recognition system requiring only a few assumptions. It builds on the closed-set matrix sentence recognition test, which is advantageous for testing individual speech recognition in a way comparable across languages. Individual predictions of speech recognition thresholds in stationary and in fluctuating noise were derived using the audiogram and an estimate of the internal level uncertainty for modeling the individual Plomp curves fitted to the data with the Attenuation (A-) and Distortion (D-) parameters of the Plomp approach. The “typical” audiogram shapes from Bisgaard et al. with or without a “typical” level uncertainty and the individual data were used for individual predictions. As a result, the individualization of the level uncertainty was found to be more important than the exact shape of the individual audiogram to accurately model the outcome of the German Matrix test in stationary or fluctuating noise for listeners with hearing impairment. The prediction accuracy of the individualized approach also outperforms the (modified) Speech Intelligibility Index approach, which is based on the individual threshold data only. PMID:27604782
Automatic Speech Recognition: Reliability and Pedagogical Implications for Teaching Pronunciation
ERIC Educational Resources Information Center
Kim, In-Seok
2006-01-01
This study examines the reliability of automatic speech recognition (ASR) software used to teach English pronunciation, focusing on one particular piece of software, "FluSpeak," as a typical example. Thirty-six Korean English as a Foreign Language (EFL) college students participated in an experiment in which they listened to 15 sentences…
A Normed Study of Face Recognition in Autism and Related Disorders.
ERIC Educational Resources Information Center
Klin, Ami; Sparrow, Sara S.; de Bildt, Annelies; Cicchetti, Domenic V.; Cohen, Donald J.; Volkmar, Fred R.
1999-01-01
This study used a well-normed task of face recognition with 102 young children with autism, pervasive developmental disorder (PDD) not otherwise specified, and non-PDD disorders (mental retardation and language disorders) matched for chronological age and either verbal or nonverbal mental age. Autistic subjects exhibited pronounced deficits in…
ERIC Educational Resources Information Center
Spironelli, Chiara; Penolazzi, Barbara; Vio, Claudio; Angrilli, Alessandro
2010-01-01
Brain plasticity was investigated in 14 Italian children affected by developmental dyslexia after 6 months of phonological training. The means used to measure language reorganization was the recognition potential, an early wave, also called N150, elicited by automatic word recognition. This component peaks over the left temporo-occipital cortex…
Literacy in the Workplace: A Whole Language Approach.
ERIC Educational Resources Information Center
Carr, Kathryn S.
The personnel director of a local industry requested reading help from Central Missouri State University for several employees. After several meetings, a workplace literacy program that used the whole language approach supplemented by direct instruction in word recognition skills was developed. Two types of tests were written. One, a vocabulary…
Developmental Differences in Speech Act Recognition: A Pragmatic Awareness Study
ERIC Educational Resources Information Center
Garcia, Paula
2004-01-01
With the growing acknowledgement of the importance of pragmatic competence in second language (L2) learning, language researchers have identified the comprehension of speech acts as they occur in natural conversation as essential to communicative competence (e.g. Bardovi-Harlig, 2001; Thomas, 1983). Nonconventional indirect speech acts are formed…
Expert Knowledge, Distinctiveness, and Levels of Processing in Language Learning
ERIC Educational Resources Information Center
Bird, Steve
2012-01-01
The foreign language vocabulary learning research literature often attributes strong mnemonic potency to the cognitive processing of meaning when learning words. Routinely cited as support for this idea are experiments by Craik and Tulving (C&T) demonstrating superior recognition and recall of studied words following semantic tasks ("deep"…
Inferring Speaker Affect in Spoken Natural Language Communication
ERIC Educational Resources Information Center
Pon-Barry, Heather Roberta
2013-01-01
The field of spoken language processing is concerned with creating computer programs that can understand human speech and produce human-like speech. Regarding the problem of understanding human speech, there is currently growing interest in moving beyond speech recognition (the task of transcribing the words in an audio stream) and towards…
Methodological Note: Analyzing Signs for Recognition & Feature Salience.
ERIC Educational Resources Information Center
Shyan, Melissa R.
1985-01-01
Presents a method to determine how signs in American Sign Language are recognized by signers. The method uses natural settings and avoids common artificialities found in prior work. A pilot study is described involving language research with Atlantic Bottlenose Dolphins in which the method was successfully used. (SED)
Cabral Soares, Fernanda; de Oliveira, Thaís Cristina Galdino; de Macedo, Liliane Dias e Dias; Tomás, Alessandra Mendonça; Picanço-Diniz, Domingos Luiz Wanderley; Bento-Torres, João; Bento-Torres, Natáli Valim Oliver; Picanço-Diniz, Cristovam Wanderley
2015-01-01
Objective: The recognition of the limits between normal and pathological aging is essential to start preventive actions. The aim of this paper is to compare the Cambridge Neuropsychological Test Automated Battery (CANTAB) and language tests to distinguish subtle differences in cognitive performances in two different age groups, namely young adults and elderly cognitively normal subjects. Method: We selected 29 young adults (29.9±1.06 years) and 31 older adults (74.1±1.15 years) matched by educational level (years of schooling). All subjects underwent a general assessment and a battery of neuropsychological tests, including the Mini Mental State Examination, visuospatial learning, and memory tasks from CANTAB and language tests. Cluster and discriminant analysis were applied to all neuropsychological test results to distinguish possible subgroups inside each age group. Results: Significant differences in the performance of aged and young adults were detected in both language and visuospatial memory tests. Intragroup cluster and discriminant analysis revealed that CANTAB, as compared to language tests, was able to detect subtle but significant differences between the subjects. Conclusion: Based on these findings, we concluded that, as compared to language tests, large-scale application of automated visuospatial tests to assess learning and memory might increase our ability to discern the limits between normal and pathological aging. PMID:25565785
Le Bel, Ronald M; Pineda, Jaime A; Sharma, Anu
2009-01-01
The mirror neuron system (MNS) is a trimodal system composed of neuronal populations that respond to motor, visual, and auditory stimulation, such as when an action is performed, observed, heard or read about. In humans, the MNS has been identified using neuroimaging techniques (such as fMRI and mu suppression in the EEG). It reflects an integration of motor-auditory-visual information processing related to aspects of language learning including action understanding and recognition. Such integration may also form the basis for language-related constructs such as theory of mind. In this article, we review the MNS as it relates to the cognitive development of language in typically developing children and in children at risk for communication disorders, such as children with autism spectrum disorder (ASD) or hearing impairment. Studying MNS development in these children may help illuminate an important role of the MNS in children with communication disorders. Studies with deaf children are especially important because they offer potential insights into how the MNS is reorganized when one modality, such as audition, is deprived during early cognitive development, and this may have long-term consequences on language maturation and theory of mind abilities. Readers will be able to (1) understand the concept of mirror neurons, (2) identify cortical areas associated with the MNS in animal and human studies, (3) discuss the use of mu suppression in the EEG for measuring the MNS in humans, and (4) discuss MNS dysfunction in children with ASD.
Spoken Language Processing in the Clarissa Procedure Browser
NASA Technical Reports Server (NTRS)
Rayner, M.; Hockey, B. A.; Renders, J.-M.; Chatzichrisafis, N.; Farrell, K.
2005-01-01
Clarissa, an experimental voice-enabled procedure browser that has recently been deployed on the International Space Station, is as far as we know the first spoken dialog system in space. We describe the objectives of the Clarissa project and the system's architecture. In particular, we focus on three key problems: grammar-based speech recognition using the Regulus toolkit; methods for open-mic speech recognition; and robust side-effect free dialogue management for handling undos, corrections and confirmations. We first describe the grammar-based recogniser we have built using Regulus, and report experiments where we compare it against a class N-gram recogniser trained on the same 3297-utterance dataset. We obtained a 15% relative improvement in WER and a 37% improvement in semantic error rate. The grammar-based recogniser moreover outperforms the class N-gram version for utterances of all lengths from 1 to 9 words inclusive. The central problem in building an open-mic speech recognition system is being able to distinguish between commands directed at the system, and other material (cross-talk), which should be rejected. Most spoken dialogue systems make the accept/reject decision by applying a threshold to the recognition confidence score. We show how a simple and general method, based on standard approaches to document classification using Support Vector Machines, can give substantially better performance, and report experiments showing a relative reduction in the task-level error rate by about 25% compared to the baseline confidence threshold method. Finally, we describe a general side-effect free dialogue management architecture that we have implemented in Clarissa, which extends the "update semantics" framework by including task as well as dialogue information in the information state. We show that this enables elegant treatments of several dialogue management problems, including corrections, confirmations, querying of the environment, and regression testing.
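The word error rate (WER) comparison reported in this abstract can be illustrated with a minimal sketch. The abstract does not say which scoring tool was used, so the function names and the sample utterances below are hypothetical; WER is computed in the usual way, as the word-level edit distance divided by the number of reference words, and a relative improvement is the fractional drop in error rate against a baseline.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix handled so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline: float, new: float) -> float:
    """Fractional reduction of an error rate relative to a baseline."""
    return (baseline - new) / baseline
```

With these definitions, a hypothetical baseline WER of 0.20 and a new WER of 0.17 would correspond to a 15% relative improvement; these two rates are illustrative and are not figures from the study.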
Clustering of Farsi sub-word images for whole-book recognition
NASA Astrophysics Data System (ADS)
Soheili, Mohammad Reza; Kabir, Ehsanollah; Stricker, Didier
2015-01-01
Redundancy of word and sub-word occurrences in large documents can be effectively utilized in an OCR system to improve recognition results. Most OCR systems employ language modeling techniques as a post-processing step; however, these techniques do not use important pictorial information that exists in the text image. In the case of large-scale recognition of degraded documents, this information is even more valuable. In our previous work, we proposed a sub-word image clustering method for applications dealing with large printed documents. In our clustering method, the ideal case is when all equivalent sub-word images lie in one cluster. To overcome the issues of low print quality, the clustering method uses an image matching algorithm for measuring the distance between two sub-word images. The measured distance, together with a set of simple shape features, was used to cluster all sub-word images. In this paper, we analyze the effects of adding more shape features on processing time, purity of clustering, and the final recognition rate. Previously published experiments have shown the efficiency of our method on a book. Here we present extended experimental results and evaluate our method on another book with a totally different font face. We also show that the number of newly created clusters in a page can be used as a criterion for assessing the quality of print and evaluating preprocessing phases.
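The "purity of clustering" mentioned in this abstract can be sketched as follows. The abstract does not define the measure, so this assumes the common definition: each cluster is credited with its most frequent true label, and purity is the credited fraction of all items. The function name and the toy labels are hypothetical.

```python
from collections import Counter


def cluster_purity(cluster_ids, true_labels):
    """Fraction of items that carry the majority label of their cluster."""
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have equal length")
    if not cluster_ids:
        raise ValueError("at least one item is required")
    # Group the true labels by the cluster each item was assigned to.
    members = {}
    for cluster_id, label in zip(cluster_ids, true_labels):
        members.setdefault(cluster_id, []).append(label)
    # Count, per cluster, how many items share the most common label.
    majority_total = sum(Counter(labels).most_common(1)[0][1]
                         for labels in members.values())
    return majority_total / len(cluster_ids)
```

A purity of 1.0 means no cluster mixes different sub-words. It does not by itself mean that all equivalent sub-word images ended up in a single cluster, which is the ideal case the abstract describes.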
Language Measures of the NIH Toolbox Cognition Battery
Gershon, Richard C.; Cook, Karon F.; Mungas, Dan; Manly, Jennifer J.; Slotkin, Jerry; Beaumont, Jennifer L.; Weintraub, Sandra
2015-01-01
Language facilitates communication and efficient encoding of thought and experience. Because of its essential role in early childhood development, in educational achievement and in subsequent life adaptation, language was included as one of the subdomains in the NIH Toolbox for the Assessment of Neurological and Behavioral Function Cognition Battery (NIHTB-CB). There are many different components of language functioning, including syntactic processing (i.e., morphology and grammar) and lexical semantics. For purposes of the NIHTB-CB, two tests of language—a picture vocabulary test and a reading recognition test—were selected by consensus based on literature reviews, iterative expert input, and a desire to assess in English and Spanish. NIHTB-CB’s picture vocabulary and reading recognition tests are administered using computer adaptive testing and scored using item response theory. Data are presented from the validation of the English versions in a sample of adults ages 20–85 years (Spanish results will be presented in a future publication). Both tests demonstrated high test–retest reliability and good construct validity compared to corresponding gold-standard measures. Scores on the NIH Toolbox measures were consistent with age-related expectations, namely, growth in language during early development, with relative stabilization into late adulthood. PMID:24960128
Speech and language development in cognitively delayed children with cochlear implants.
Holt, Rachael Frush; Kirk, Karen Iler
2005-04-01
The primary goals of this investigation were to examine the speech and language development of deaf children with cochlear implants and mild cognitive delay and to compare their gains with those of children with cochlear implants who do not have this additional impairment. We retrospectively examined the speech and language development of 69 children with pre-lingual deafness. The experimental group consisted of 19 children with cognitive delays and no other disabilities (mean age at implantation = 38 months). The control group consisted of 50 children who did not have cognitive delays or any other identified disability. The control group was stratified by primary communication mode: half used total communication (mean age at implantation = 32 months) and the other half used oral communication (mean age at implantation = 26 months). Children were tested on a variety of standard speech and language measures and one test of auditory skill development at 6-month intervals. The results from each test were collapsed from blocks of two consecutive 6-month intervals to calculate group mean scores before implantation and at 1-year intervals after implantation. The children with cognitive delays and those without such delays demonstrated significant improvement in their speech and language skills over time on every test administered. Children with cognitive delays had significantly lower scores than typically developing children on two of the three measures of receptive and expressive language and had significantly slower rates of auditory-only sentence recognition development. Finally, there were no significant group differences in auditory skill development based on parental reports or in auditory-only or multimodal word recognition. The results suggest that deaf children with mild cognitive impairments benefit from cochlear implantation. Specifically, improvements are evident in their ability to perceive speech and in their reception and use of language. 
However, this benefit may be reduced relative to their typically developing peers with cochlear implants, particularly in domains that require higher level skills, such as sentence recognition and receptive and expressive language. These findings suggest that children with mild cognitive deficits be considered for cochlear implantation with less trepidation than has been the case in the past. Although their speech and language gains may be tempered by their cognitive abilities, these limitations do not appear to preclude benefit from cochlear implant stimulation, as assessed by traditional measures of speech and language development.
Discovering Peripheral Arterial Disease Cases from Radiology Notes Using Natural Language Processing
Savova, Guergana K.; Fan, Jin; Ye, Zi; Murphy, Sean P.; Zheng, Jiaping; Chute, Christopher G.; Kullo, Iftikhar J.
2010-01-01
As part of the Electronic Medical Records and Genomics Network, we applied, extended and evaluated an open source clinical Natural Language Processing system, Mayo’s Clinical Text Analysis and Knowledge Extraction System, for the discovery of peripheral arterial disease cases from radiology reports. The manually created gold standard consisted of 223 positive, 19 negative, 63 probable and 150 unknown cases. Overall accuracy agreement between the system and the gold standard was 0.93 as compared to a named entity recognition baseline of 0.46. Sensitivity for the positive, probable and unknown cases was 0.93–0.96, and for the negative cases was 0.72. Specificity and negative predictive value for all categories were in the 90’s. The positive predictive value for the positive and unknown categories was in the high 90’s, for the negative category was 0.84, and for the probable category was 0.63. We outline the main sources of errors and suggest improvements. PMID:21347073
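The per-category measures quoted in this abstract (sensitivity, specificity, positive and negative predictive value) follow from a one-versus-rest confusion table. The sketch below uses the standard definitions; the function name is hypothetical, and any counts passed to it are illustrative rather than data from the study.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard diagnostic measures for one category treated as 'positive'.

    tp/fn: gold-standard members of the category that the system did / did not label as such.
    fp/tn: non-members that the system did / did not label as the category.
    """
    return {
        "sensitivity": tp / (tp + fn),  # recall on true members
        "specificity": tn / (tn + fp),  # recall on non-members
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }
```

For a four-way scheme such as positive, negative, probable and unknown, the function would be called once per category, with all other categories pooled as "non-members".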
Taha, Haitham
2017-06-01
The current research examined how Arabic diglossia affects verbal learning memory. Thirty native Arab college students were tested using an auditory verbal memory test that was adapted from the Rey Auditory Verbal Learning Test and developed in three versions: a pure spoken language version (SL), a pure standard language version (SA), and a phonologically similar version (PS). The results showed that for immediate free recall, performance was better in the SL and PS conditions than in the SA condition. However, for delayed recall and recognition, the results did not reveal any significant consistent effect of diglossia. Accordingly, it was suggested that diglossia has a significant effect on storage and short-term memory functions but not on long-term memory functions. The results were discussed in light of different approaches in the field of bilingual memory.
Internship Abstract and Final Reflection
NASA Technical Reports Server (NTRS)
Sandor, Edward
2016-01-01
The primary objective for this internship is the evaluation of an embedded natural language processor (NLP) as a way to introduce voice control into future space suits. An embedded natural language processor would provide an astronaut hands-free control for making adjustments to the environment of the space suit and checking the status of consumables, procedures, and navigation. Additionally, the use of an embedded NLP could potentially reduce crew fatigue, increase the crewmember's situational awareness during extravehicular activity (EVA), and improve the ability to focus on mission-critical details. The use of an embedded NLP may be valuable for other human spaceflight applications desiring hands-free control as well. An embedded NLP is unique because it is a small device that performs language tasks, including speech recognition, which normally require powerful processors. The dedicated device could perform speech recognition locally with a smaller form factor and lower power consumption than traditional methods.
Aspect-Oriented Programming is Quantification and Implicit Invocation
NASA Technical Reports Server (NTRS)
Filman, Robert E.; Friedman, Daniel P.; Koga, Dennis (Technical Monitor)
2001-01-01
We propose that the distinguishing characteristic of Aspect-Oriented Programming (AOP) languages is that they allow programming by making quantified programmatic assertions over programs that lack local notation indicating the invocation of these assertions. This suggests that AOP systems can be analyzed with respect to three critical dimensions: the kinds of quantifications allowed, the nature of the interactions that can be asserted, and the mechanism for combining base-level actions with asserted actions. Consequences of this perspective are the recognition that certain systems are not AOP and that some mechanisms are sufficiently expressive to allow straightforwardly programming an AOP system within them.
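The two ideas in this abstract, quantification and the absence of local notation at the point of invocation, can be made concrete with a small sketch. It is an illustration in plain Python and not the authors' formalism: `weave`, `Account` and `audit_log` are hypothetical names, the "quantification" is a predicate over method names, and the "asserted action" is a function that runs before every matching method.

```python
import functools


def _wrap(method, advice):
    """Return a method that runs the advice first, then the original."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        advice(method.__name__, args)
        return method(self, *args, **kwargs)
    return wrapper


def weave(cls, predicate, advice):
    """Quantification: apply advice to every public method the predicate selects."""
    for name, attr in list(vars(cls).items()):
        if callable(attr) and not name.startswith("_") and predicate(name):
            setattr(cls, name, _wrap(attr, advice))
    return cls


class Account:
    """Base program: contains no mention of auditing anywhere."""

    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        self.balance -= amount

    def report(self):
        return self.balance


audit_log = []

# "Before every call to deposit or withdraw, record the call."
weave(Account,
      lambda name: name in {"deposit", "withdraw"},
      lambda name, args: audit_log.append((name, args)))
```

The `Account` source never refers to the audit behaviour, yet calls to `deposit` and `withdraw` are recorded while `report` is left alone. That a general-purpose language can express this directly is one way to read the abstract's closing remark about mechanisms expressive enough to program an AOP system within them.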
Research in speech communication.
Flanagan, J
1995-01-01
Advances in digital speech processing are now supporting application and deployment of a variety of speech technologies for human/machine communication. In fact, new businesses are rapidly forming about these technologies. But these capabilities are of little use unless society can afford them. Happily, explosive advances in microelectronics over the past two decades have assured affordable access to this sophistication as well as to the underlying computing technology. The research challenges in speech processing remain in the traditionally identified areas of recognition, synthesis, and coding. These three areas have typically been addressed individually, often with significant isolation among the efforts. But they are all facets of the same fundamental issue--how to represent and quantify the information in the speech signal. This implies deeper understanding of the physics of speech production, the constraints that the conventions of language impose, and the mechanism for information processing in the auditory system. In ongoing research, therefore, we seek more accurate models of speech generation, better computational formulations of language, and realistic perceptual guides for speech processing--along with ways to coalesce the fundamental issues of recognition, synthesis, and coding. Successful solution will yield the long-sought dictation machine, high-quality synthesis from text, and the ultimate in low bit-rate transmission of speech. It will also open the door to language-translating telephony, where the synthetic foreign translation can be in the voice of the originating talker. PMID:7479806
ERIC Educational Resources Information Center
Steinel, Margarita P.; Hulstijn, Jan H.; Steinel, Wolfgang
2007-01-01
In a paired-associate learning (PAL) task, Dutch university students (n = 129) learned 20 English second language (L2) idioms either receptively or productively (i.e., L2-first language [L1] or L1-L2) and were tested in two directions (i.e., recognition or production) immediately after learning and 3 weeks later. Receptive and productive…
Recognition of Langue des Signes Quebecoise in Eastern Canada
ERIC Educational Resources Information Center
Parisot, Anne-Marie; Rinfret, Julie
2012-01-01
This article presents a portrait of two community-level and legal efforts in Canada to obtain official recognition of ASL and LSQ (Langue des signes quebecoise), both of which are recognized as official languages by the Canadian Association of the Deaf (CAD). In order to situate this issue in the Canadian linguistic context, the authors first…
Orthographic Neighborhood Effects in Recognition and Recall Tasks in a Transparent Orthography
ERIC Educational Resources Information Center
Justi, Francis R. R.; Jaeger, Antonio
2017-01-01
The number of orthographic neighbors of a word influences its probability of being retrieved in recognition and free recall memory tests. Even though this phenomenon is well demonstrated for English words, it has yet to be demonstrated for languages with more predictable grapheme-phoneme mappings than English. To address this issue, 4 experiments…
ERIC Educational Resources Information Center
Laufer, Batia; Aviad-Levitzky, Tami
2017-01-01
This study examined how well second language (L2) recall and recognition vocabulary tests correlated with a reading test, how well each vocabulary test discriminated between reading proficiency levels, and how accurate each test was in predicting reading proficiency when compared with corpus studies. A total of 116 college-level learners of…
ERIC Educational Resources Information Center
Gross, Thomas F.
2008-01-01
The recognition of facial immaturity and emotional expression by children with autism, language disorders, mental retardation, and non-disabled controls was studied in two experiments. Children identified immaturity and expression in upright and inverted faces. The autism group identified fewer immature faces and expressions than control (Exp. 1 &…
The Effects of Textual Enhancement Type on L2 Form Recognition and Reading Comprehension in Spanish
ERIC Educational Resources Information Center
LaBrozzi, Ryan M.
2016-01-01
Previous research investigating the effectiveness of textual enhancement as a tool to draw adult second language (L2) learners' attention to the targeted linguistic form has consistently produced mixed results. This article examines how L2 form recognition and reading comprehension are affected by different types of textual enhancement.…
Investigating an Innovative Computer Application to Improve L2 Word Recognition from Speech
ERIC Educational Resources Information Center
Matthews, Joshua; O'Toole, John Mitchell
2015-01-01
The ability to recognise words from the aural modality is a critical aspect of successful second language (L2) listening comprehension. However, little research has been reported on computer-mediated development of L2 word recognition from speech in L2 learning contexts. This report describes the development of an innovative computer application…
Letter Names: Effect on Letter Saying, Spelling, and Word Recognition in Hebrew.
ERIC Educational Resources Information Center
Levin, Iris; Patel, Sigal; Margalit, Tamar; Barad, Noa
2002-01-01
Examined whether letter names, which bridge the gap between oral and written language among English speaking children, have a similar function in Hebrew. In findings from studies of Israeli kindergartners and first graders, children were found to rely on letter names in performing a number of letter saying, spelling, and word recognition tasks.…
From Numbers to Letters: Feedback Regularization in Visual Word Recognition
ERIC Educational Resources Information Center
Molinaro, Nicola; Dunabeitia, Jon Andoni; Marin-Gutierrez, Alejandro; Carreiras, Manuel
2010-01-01
Word reading in alphabetic languages involves letter identification, independently of the format in which these letters are written. This process of letter "regularization" is sensitive to word context, leading to the recognition of a word even when numbers that resemble letters are inserted among other real letters (e.g., M4TERI4L). The present…
ASL Handshape Stories, Word Recognition and Signing Deaf Readers: An Exploratory Study
ERIC Educational Resources Information Center
Gietz, Merrilee R.
2013-01-01
The effectiveness of using American Sign Language (ASL) handshape stories to teach word recognition in whole stories using a descriptive case study approach was explored. Four profoundly deaf children ages 7 to 8, enrolled in a self-contained deaf education classroom in a public school in the south participated in the story time five-week…
Arbib, Michael A
2005-04-01
The article analyzes the neural and functional grounding of language skills as well as their emergence in hominid evolution, hypothesizing stages leading from abilities known to exist in monkeys and apes and presumed to exist in our hominid ancestors right through to modern spoken and signed languages. The starting point is the observation that both premotor area F5 in monkeys and Broca's area in humans contain a "mirror system" active for both execution and observation of manual actions, and that F5 and Broca's area are homologous brain regions. This grounded the mirror system hypothesis of Rizzolatti and Arbib (1998) which offers the mirror system for grasping as a key neural "missing link" between the abilities of our nonhuman ancestors of 20 million years ago and modern human language, with manual gestures rather than a system for vocal communication providing the initial seed for this evolutionary process. The present article, however, goes "beyond the mirror" to offer hypotheses on evolutionary changes within and outside the mirror systems which may have occurred to equip Homo sapiens with a language-ready brain. Crucial to the early stages of this progression is the mirror system for grasping and its extension to permit imitation. Imitation is seen as evolving via a so-called simple system such as that found in chimpanzees (which allows imitation of complex "object-oriented" sequences but only as the result of extensive practice) to a so-called complex system found in humans (which allows rapid imitation even of complex sequences, under appropriate conditions) which supports pantomime. This is hypothesized to have provided the substrate for the development of protosign, a combinatorially open repertoire of manual gestures, which then provides the scaffolding for the emergence of protospeech (which thus owes little to nonhuman vocalizations), with protosign and protospeech then developing in an expanding spiral. 
It is argued that these stages involve biological evolution of both brain and body. By contrast, it is argued that the progression from protosign and protospeech to languages with full-blown syntax and compositional semantics was a historical phenomenon in the development of Homo sapiens, involving few if any further biological changes.
A Suggested Automated Branch Program for Foreign Languages.
ERIC Educational Resources Information Center
Barrutia, Richard
1964-01-01
Completely automated and operated by student feedback, this program teaches and tests foreign language recognition and retention, gives repeated audiolingual practice on model structures, and allows the student to tailor the program to his individual needs. The program is recorded on four tape tracks (track 1 for the most correct answer, etc.).…
Finding Words in a Language that Allows Words without Vowels
ERIC Educational Resources Information Center
El Aissati, Abder; McQueen, James M.; Cutler, Anne
2012-01-01
Across many languages from unrelated families, spoken-word recognition is subject to a constraint whereby potential word candidates must contain a vowel. This constraint minimizes competition from embedded words (e.g., in English, disfavoring "win" in "twin" because "t" cannot be a word). However, the constraint would be counter-productive in…
ERIC Educational Resources Information Center
Wong, Simpson W. L.; Chow, Bonnie Wing-Yin; Ho, Connie Suk-Han; Waye, Mary M. Y.; Bishop, Dorothy V. M.
2014-01-01
This twin study examined the relative contributions of genes and environment on 2nd language reading acquisition of Chinese-speaking children learning English. We examined whether specific skills-visual word recognition, receptive vocabulary, phonological awareness, phonological memory, and speech discrimination-in the 1st and 2nd languages have…
Actes des Journees de linguistique (Proceedings of the Linguistics Conference) (9th, 1995).
ERIC Educational Resources Information Center
Audette, Julie, Ed.; And Others
Papers (entirely in French) presented at the conference on linguistics include these topics: language used in the legislature of New Brunswick; cohesion in the text of Arabic-speaking language learners; automatic adverb recognition; logic of machine translation in teaching revision; expansion in physics texts; discourse analysis and the syntax of…
The English Language of the Nigeria Police
ERIC Educational Resources Information Center
Chinwe, Udo Victoria
2015-01-01
In present-day Nigeria, the quality of the English language spoken by Nigerians is perceived to have been deteriorating and to need urgent attention. The proliferation of books and articles in recent years can be seen as the native outcrop of its received attention and recognition as a matter of discourse. Evidently, every profession,…
Language Names and Norms in Bosnia and Herzegovina
ERIC Educational Resources Information Center
Swagman, Kirstin J.
2011-01-01
The institutionalization of separate standard varieties for Bosnian, Croatian, and Serbian in the 1990s was hailed by many Bosnians as the long-denied recognition of the Bosnian idiom as distinct from the Serbian and Croatian varieties it had so often been subordinated under. Yet the accompanying codification of Bosnian standard language forms has…
Complementary Schools in Action: Networking for Language Development in East London
ERIC Educational Resources Information Center
Sneddon, Raymonde
2014-01-01
In a challenging economic and political context, complementary schools in East London are mentoring each other and forming networks across communities to gain recognition and status for community languages in education and the wider community. As issues of power and status impact in different ways on differently situated communities, complementary…
Reading in EFL: Facts and Fictions.
ERIC Educational Resources Information Center
Paran, Amos
1996-01-01
Examines the representation of the reading process in English as a Foreign Language (EFL) texts. The article argues that many of these representations are dated and based on a theory that was never a mainstream theory of first-language reading. Suggestions for exercises to strengthen automatic word recognition in EFL readers are provided. (33…
Perceiving and Remembering Events Cross-Linguistically: Evidence from Dual-Task Paradigms
ERIC Educational Resources Information Center
Trueswell, John C.; Papafragou, Anna
2010-01-01
What role does language play during attention allocation in perceiving and remembering events? We recorded adults' eye movements as they studied animated motion events for a later recognition task. We compared native speakers of two languages that use different means of expressing motion (Greek and English). In Experiment 1, eye movements revealed…
Relationships between Lexical Processing Speed, Language Skills, and Autistic Traits in Children
ERIC Educational Resources Information Center
Abrigo, Erin
2012-01-01
According to current models of spoken word recognition, listeners understand speech as it unfolds over time. Eye tracking provides a non-invasive, on-line method to monitor attention, providing insight into the processing of spoken language. In the current project a spoken lexical processing assessment (LPA) confirmed current theories of spoken…
Games in Language Learning: Opportunities and Challenges
ERIC Educational Resources Information Center
Godwin-Jones, Robert
2014-01-01
There has been a substantial increase in recent years in the interest in using digital games for language learning. This coincides with the explosive growth in multiplayer online gaming and with the proliferation of mobile games for smart phones. It also reflects the growing recognition among educators of the importance of extramural, informal…
Preserved Visual Language Identification Despite Severe Alexia
ERIC Educational Resources Information Center
Di Pietro, Marie; Ptak, Radek; Schnider, Armin
2012-01-01
Patients with letter-by-letter alexia may have residual access to lexical or semantic representations of words despite severely impaired overt word recognition (reading). Here, we report a multilingual patient with severe letter-by-letter alexia who rapidly identified the language of written words and sentences in French and English while he had…
Cognate and Word Class Ambiguity Effects in Noun and Verb Processing
ERIC Educational Resources Information Center
Bultena, Sybrine; Dijkstra, Ton; van Hell, Janet G.
2013-01-01
This study examined how noun and verb processing in bilingual visual word recognition are affected by within and between-language overlap. We investigated how word class ambiguous noun and verb cognates are processed by bilinguals, to see if co-activation of overlapping word forms between languages benefits from additional overlap within a…
Loukusa, Soile; Mäkinen, Leena; Kuusikko-Gauffin, Sanna; Ebeling, Hanna; Moilanen, Irma
2014-01-01
Social perception skills, such as understanding the mind and emotions of others, affect children's communication abilities in real-life situations. In addition to autism spectrum disorder (ASD), there is increasing knowledge that children with specific language impairment (SLI) also demonstrate difficulties in their social perception abilities. This study aimed to compare the performance of children with SLI, ASD and typical development (TD) in social perception tasks measuring Theory of Mind (ToM) and emotion recognition, and to evaluate the association between social perception tasks and language tests measuring word-finding abilities, knowledge of grammatical morphology and verbal working memory. Children with SLI (n = 18), ASD (n = 14) and TD (n = 25) completed two NEPSY-II subtests measuring social perception abilities: (1) Affect Recognition and (2) ToM (includes Verbal and non-verbal Contextual tasks). In addition, children's word-finding abilities were measured with the TWF-2, grammatical morphology by using the Grammatical Closure subtest of ITPA, and verbal working memory by using subtests of Sentence Repetition or Word List Interference (chosen according to the child's age) of the NEPSY-II. Children with ASD scored significantly lower than children with SLI or TD on the NEPSY-II Affect Recognition subtest. Both SLI and ASD groups scored significantly lower than TD children on Verbal tasks of the ToM subtest of NEPSY-II. However, there were no significant group differences on non-verbal Contextual tasks of the ToM subtest of the NEPSY-II. Verbal tasks of the ToM subtest were correlated with the Grammatical Closure subtest and TWF-2 in children with SLI. In children with ASD, the correlation between TWF-2 and the Verbal ToM tasks was moderate, almost achieving statistical significance, but no other correlations were found. Both SLI and ASD groups showed difficulties in tasks measuring verbal ToM, but differences were not found in tasks measuring non-verbal Contextual ToM.
The association between Verbal ToM tasks and language tests was stronger in children with SLI than in children with ASD. There is a need for further studies in order to understand interaction between different areas of language and cognitive development. © 2014 Royal College of Speech and Language Therapists.
Savundranayagam, Marie Y; Moore-Nielsen, Kelsey
2015-10-01
There are many recommended language-based strategies for effective communication with persons with dementia. What is unknown is whether effective language-based strategies are also person centered. Accordingly, the objective of this study was to examine whether language-based strategies for effective communication with persons with dementia overlapped with the following indicators of person-centered communication: recognition, negotiation, facilitation, and validation. Conversations (N = 46) between staff-resident dyads were audio-recorded during routine care tasks over 12 weeks. Staff utterances were coded twice, using language-based and person-centered categories. There were 21 language-based categories and 4 person-centered categories. There were 5,800 utterances transcribed: 2,409 without indicators, 1,699 coded as language or person centered, and 1,692 overlapping utterances. For recognition, 26% of utterances were greetings, 21% were affirmations, 13% were questions (yes/no and open-ended), and 15% involved rephrasing. Questions (yes/no, choice, and open-ended) comprised 74% of utterances that were coded as negotiation. A similar pattern was observed for utterances coded as facilitation where 51% of utterances coded as facilitation were yes/no questions, open-ended questions, and choice questions. However, 21% of facilitative utterances were affirmations and 13% involved rephrasing. Finally, 89% of utterances coded as validation were affirmations. The findings identify specific language-based strategies that support person-centered communication. However, between 1 and 4, out of a possible 21 language-based strategies, overlapped with at least 10% of utterances coded as each person-centered indicator. This finding suggests that staff need training to use more diverse language strategies that support personhood of residents with dementia.
HWDA: A coherence recognition and resolution algorithm for hybrid web data aggregation
NASA Astrophysics Data System (ADS)
Guo, Shuhang; Wang, Jian; Wang, Tong
2017-09-01
Aiming at the object conflict recognition and resolution problem in hybrid distributed data stream aggregation, a distributed data stream object coherence solution is proposed. First, a framework for object coherence conflict recognition and resolution, named HWDA, is defined. Second, an object coherence recognition technique is proposed based on formal-language description logic and the hierarchical dependency relationships between logic rules. Third, a conflict traversal recognition algorithm is proposed based on the defined dependency graph. Next, a conflict resolution technique is put forward based on resolution pattern matching, including the definition of three types of conflict, the conflict resolution matching pattern, and the arbitration resolution method. Finally, experiments use two kinds of web test data sets to validate the effect of applying the conflict recognition and resolution technology of HWDA.
Phoneme Error Pattern by Heritage Speakers of Spanish on an English Word Recognition Test.
Shi, Lu-Feng
2017-04-01
Heritage speakers acquire their native language from home use in their early childhood. As the native language is typically a minority language in the society, these individuals receive their formal education in the majority language and eventually develop greater competency with the majority than their native language. To date, there have not been specific research attempts to understand word recognition by heritage speakers. It is not clear if and to what degree we may infer from evidence based on bilingual listeners in general. This preliminary study investigated how heritage speakers of Spanish perform on an English word recognition test and analyzed their phoneme errors. A prospective, cross-sectional, observational design was employed. Twelve normal-hearing adult Spanish heritage speakers (four men, eight women, 20-38 yr old) participated in the study. Their language background was obtained through the Language Experience and Proficiency Questionnaire. Nine English monolingual listeners (three men, six women, 20-41 yr old) were also included for comparison purposes. Listeners were presented with 200 Northwestern University Auditory Test No. 6 words in quiet. They repeated each word orally and in writing. Their responses were scored by word, word-initial consonant, vowel, and word-final consonant. Performance was compared between groups with Student's t test or analysis of variance. Group-specific error patterns were primarily descriptive, but intergroup comparisons were made using 95% or 99% confidence intervals for proportional data. The two groups of listeners yielded comparable scores when their responses were examined by word, vowel, and final consonant. However, heritage speakers of Spanish misidentified significantly more word-initial consonants and had significantly more difficulty with initial /p, b, h/ than their monolingual peers. 
The two groups yielded similar patterns for vowel and word-final consonants, but heritage speakers made significantly fewer errors with /e/ and more errors with word-final /p, k/. Data reported in the present study lead to a twofold conclusion. On the one hand, normal-hearing heritage speakers of Spanish may misidentify English phonemes in patterns different from those of English monolingual listeners. Not all phoneme errors can be readily understood by comparing Spanish and English phonology, suggesting that Spanish heritage speakers differ in performance from other Spanish-English bilingual listeners. On the other hand, the absolute number of errors and the error pattern of most phonemes were comparable between English monolingual listeners and Spanish heritage speakers, suggesting that audiologists may assess word recognition in quiet in the same way for these two groups of listeners, if diagnosis is based on words, not phonemes. American Academy of Audiology
Shi, Lu-Feng; Koenig, Laura L
2016-01-01
Non-native listeners do not recognize English sentences as effectively as native listeners, especially in noise. It is not entirely clear to what extent such group differences arise from differences in relative weight of semantic versus syntactic cues. This study quantified the use and weighting of these contextual cues via Boothroyd and Nittrouer's j and k factors. The j represents the probability of recognizing sentences with or without context, whereas the k represents the degree to which context improves recognition performance. Four groups of 13 normal-hearing young adult listeners participated. One group consisted of native English monolingual (EMN) listeners, whereas the other three consisted of non-native listeners contrasting in their language dominance and first language: English-dominant Russian-English, Russian-dominant Russian-English, and Spanish-dominant Spanish-English bilinguals. All listeners were presented three sets of four-word sentences: high-predictability sentences included both semantic and syntactic cues, low-predictability sentences included syntactic cues only, and zero-predictability sentences included neither semantic nor syntactic cues. Sentences were presented at 65 dB SPL binaurally in the presence of speech-spectrum noise at +3 dB SNR. Listeners orally repeated each sentence and recognition was calculated for individual words as well as the sentence as a whole. Comparable j values across groups for high-predictability, low-predictability, and zero-predictability sentences suggested that all listeners, native and non-native, utilized contextual cues to recognize English sentences. Analysis of the k factor indicated that non-native listeners took advantage of syntax as effectively as EMN listeners. However, only English-dominant bilinguals utilized semantics to the same extent as EMN listeners; semantics did not provide a significant benefit for the two non-English-dominant groups. 
When combined, semantics and syntax benefitted EMN listeners significantly more than all three non-native groups of listeners. Language background influenced the use and weighting of semantic and syntactic cues in a complex manner. A native language advantage existed in the effective use of both cues combined. A language-dominance effect was seen in the use of semantics. No first-language effect was present for the use of either or both cues. For all non-native listeners, syntax contributed significantly more to sentence recognition than semantics, possibly due to the fact that semantics develops more gradually than syntax in second-language acquisition. The present study provides evidence that Boothroyd and Nittrouer's j and k factors can be successfully used to quantify the effectiveness of contextual cue use in clinically relevant, linguistically diverse populations.
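As a rough illustration of how the j and k factors mentioned above are usually computed, the sketch below assumes the standard Boothroyd and Nittrouer relations: p_whole = p_part^j (j links recognition of parts to recognition of the whole) and 1 - p_context = (1 - p_no_context)^k (k expresses how much context reduces errors). The function names and the example probabilities are hypothetical and are not taken from the study.

```python
import math


def j_factor(p_whole: float, p_part: float) -> float:
    """Solve p_whole = p_part ** j for j.

    j approximates the number of independent parts a listener must
    recognize to get the whole item (e.g. words within a sentence).
    """
    return math.log(p_whole) / math.log(p_part)


def k_factor(p_context: float, p_no_context: float) -> float:
    """Solve (1 - p_context) = (1 - p_no_context) ** k for k.

    k > 1 means context reduces the error probability; k = 1 means
    context gives no benefit.
    """
    return math.log(1.0 - p_context) / math.log(1.0 - p_no_context)


# Hypothetical scores, for illustration only.
example_j = j_factor(p_whole=0.0625, p_part=0.5)        # four independent words
example_k = k_factor(p_context=0.75, p_no_context=0.5)  # context halves log-error
```

With these made-up inputs, a four-word sentence recognized with no contextual help gives j close to 4, and j values below the number of words indicate that listeners are using context.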
Action and Emotion Recognition from Point Light Displays: An Investigation of Gender Differences
Alaerts, Kaat; Nackaerts, Evelien; Meyns, Pieter; Swinnen, Stephan P.; Wenderoth, Nicole
2011-01-01
Folk psychology advocates the existence of gender differences in socio-cognitive functions such as ‘reading’ the mental states of others or discerning subtle differences in body-language. A female advantage has been demonstrated for emotion recognition from facial expressions, but virtually nothing is known about gender differences in recognizing bodily stimuli or body language. The aim of the present study was to investigate potential gender differences in a series of tasks, involving the recognition of distinct features from point light displays (PLDs) depicting bodily movements of a male and female actor. Although recognition scores were considerably high at the overall group level, female participants were more accurate than males in recognizing the depicted actions from PLDs. Response times were significantly higher for males compared to females on PLD recognition tasks involving (i) the general recognition of ‘biological motion’ versus ‘non-biological’ (or ‘scrambled’ motion); or (ii) the recognition of the ‘emotional state’ of the PLD-figures. No gender differences were revealed for a control test (involving the identification of a color change in one of the dots) and for recognizing the gender of the PLD-figure. In addition, previous findings of a female advantage on a facial emotion recognition test (the ‘Reading the Mind in the Eyes Test’ (Baron-Cohen, 2001)) were replicated in this study. Interestingly, a strong correlation was revealed between emotion recognition from bodily PLDs versus facial cues. This relationship indicates that inter-individual or gender-dependent differences in recognizing emotions are relatively generalized across facial and bodily emotion perception. 
Moreover, the tight correlation between a subject's ability to discern subtle emotional cues from PLDs and the subject's ability to basically discriminate biological from non-biological motion provides indications that differences in emotion recognition may - at least to some degree – be related to more basic differences in processing biological motion per se. PMID:21695266
Han, Xu; Kim, Jung-jae; Kwoh, Chee Keong
2016-01-01
Biomedical text mining may target various kinds of valuable information embedded in the literature, but a critical obstacle to the extension of the mining targets is the cost of manual construction of labeled data, which are required for state-of-the-art supervised learning systems. Active learning is to choose the most informative documents for the supervised learning in order to reduce the amount of required manual annotations. Previous works of active learning, however, focused on the tasks of entity recognition and protein-protein interactions, but not on event extraction tasks for multiple event types. They also did not consider the evidence of event participants, which might be a clue for the presence of events in unlabeled documents. Moreover, the confidence scores of events produced by event extraction systems are not reliable for ranking documents in terms of informativity for supervised learning. We here propose a novel committee-based active learning method that supports multi-event extraction tasks and employs a new statistical method for informativity estimation instead of using the confidence scores from event extraction systems. Our method is based on a committee of two systems as follows: We first employ an event extraction system to filter potential false negatives among unlabeled documents, from which the system does not extract any event. We then develop a statistical method to rank the potential false negatives of unlabeled documents 1) by using a language model that measures the probabilities of the expression of multiple events in documents and 2) by using a named entity recognition system that locates the named entities that can be event arguments (e.g. proteins). The proposed method further deals with unknown words in test data by using word similarity measures. We also apply our active learning method for the task of named entity recognition. 
We evaluate the proposed method against the BioNLP Shared Tasks datasets, and show that our method can achieve better performance than such previous methods as entropy and Gibbs error based methods and a conventional committee-based method. We also show that the incorporation of named entity recognition into the active learning for event extraction and the unknown word handling further improve the active learning method. In addition, the adaptation of the active learning method into named entity recognition tasks also improves the document selection for manual annotation of named entities.
Zhang, Juan; Meng, Yaxuan; Wu, Chenggang; Zhou, Danny Q.
2017-01-01
Music and language share many attributes and a large body of evidence shows that sensitivity to acoustic cues in music is positively related to language development and even subsequent reading acquisition. However, such association was mainly found in alphabetic languages. What remains unclear is whether sensitivity to acoustic cues in music is associated with reading in Chinese, a morphosyllabic language. The present study aimed to answer this question by measuring music (i.e., musical metric perception and pitch discrimination), language (i.e., phonological awareness, lexical tone sensitivity), and reading abilities (i.e., word recognition) among 54 third-grade Chinese–English bilingual children. After controlling for age and non-verbal intelligence, we found that both musical metric perception and pitch discrimination accounted for unique variance of Chinese phonological awareness while pitch discrimination rather than musical metric perception predicted Chinese lexical tone sensitivity. More importantly, neither musical metric perception nor pitch discrimination was associated with Chinese reading. As for English, musical metric perception and pitch discrimination were correlated with both English phonological awareness and English reading. Furthermore, sensitivity to acoustic cues in music was associated with English reading through the mediation of English phonological awareness. The current findings indicate that the association between sensitivity to acoustic cues in music and reading may be modulated by writing systems. In Chinese, the mapping between orthography and phonology is not as transparent as in alphabetic languages such as English. Thus, this opaque mapping may alter the auditory perceptual sensitivity in music to Chinese reading. PMID:29170647
Cultural Specific Effects on the Recognition of Basic Emotions: A Study on Italian Subjects
NASA Astrophysics Data System (ADS)
Esposito, Anna; Riviello, Maria Teresa; Bourbakis, Nikolaos
The present work reports the results of perceptual experiments that investigated whether some of the basic emotions are perceptually privileged and whether the cultural environment and the perceptual mode play a role in this preference. To this end, Italian subjects were asked to assess emotional stimuli extracted from Italian and American English movies in the single mode (either video or audio alone) and in the combined audio/video mode. Results showed that anger, fear, and sadness are better perceived than surprise and happiness in both cultural environments (irony, by contrast, depends strongly on the language), that emotional information is affected by the communication mode, and that language plays a role in assessing emotional information. Implications for the implementation of emotionally colored interactive systems are discussed.
Speaker gender identification based on majority vote classifiers
NASA Astrophysics Data System (ADS)
Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri
2017-03-01
Speaker gender identification is considered among the most important tools in several multimedia applications, namely automatic speech recognition, interactive voice response systems and audio browsing systems. The performance of gender identification systems is closely linked to the selected feature set and the employed classification model. Typical techniques are based on selecting the best performing classification method or searching for the optimum tuning of one classifier's parameters through experimentation. In this paper, we consider a relevant and rich set of features involving pitch, MFCCs as well as other temporal and frequency-domain descriptors. Five classification models, including decision tree, discriminant analysis, naive Bayes, support vector machine and k-nearest neighbor, were tested. The three best performing classifiers among the five contribute by majority voting between their scores. Experiments were performed on three different datasets spoken in three languages (English, German and Arabic) in order to validate the language independence of the proposed scheme. Results confirm that the presented system has reached a satisfying accuracy rate and promising classification performance thanks to the discriminating abilities and diversity of the used features combined with mid-level statistics.
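The selection-then-voting scheme described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes voting over hard labels (the abstract mentions voting between scores), and the classifier names, validation accuracies and predictions are all hypothetical.

```python
from collections import Counter


def top_k_by_accuracy(accuracies, k=3):
    """Return the names of the k classifiers with the highest accuracy."""
    return sorted(accuracies, key=accuracies.get, reverse=True)[:k]


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority vote.

    predictions: list of equal-length label lists, one per classifier.
    With three voters and two classes there can be no ties.
    """
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical validation accuracies for the five model families.
accuracies = {"tree": 0.88, "lda": 0.90, "nb": 0.85, "svm": 0.93, "knn": 0.91}

# Hypothetical per-utterance predictions ("F" or "M") for four utterances.
predictions_by_model = {
    "tree": ["M", "M", "F", "F"],
    "lda": ["M", "M", "M", "F"],
    "nb": ["F", "F", "F", "F"],
    "svm": ["F", "M", "M", "F"],
    "knn": ["F", "F", "M", "M"],
}

best_three = top_k_by_accuracy(accuracies, k=3)
final_labels = majority_vote([predictions_by_model[name] for name in best_three])
```

Using an odd number of voters for a two-class problem is a simple way to rule out ties; with more classes or an even committee, a tie-breaking rule would be needed.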
A Multidimensional Approach to the Study of Emotion Recognition in Autism Spectrum Disorders
Xavier, Jean; Vignaud, Violaine; Ruggiero, Rosa; Bodeau, Nicolas; Cohen, David; Chaby, Laurence
2015-01-01
Although deficits in emotion recognition have been widely reported in autism spectrum disorder (ASD), experiments have been restricted to either facial or vocal expressions. Here, we explored multimodal emotion processing in children with ASD (N = 19) and with typical development (TD, N = 19), considering uni (faces and voices) and multimodal (faces/voices simultaneously) stimuli and developmental comorbidities (neuro-visual, language and motor impairments). Compared to TD controls, children with ASD had rather high and heterogeneous emotion recognition scores but showed also several significant differences: lower emotion recognition scores for visual stimuli, for neutral emotion, and a greater number of saccades during visual task. Multivariate analyses showed that: (1) the difficulties they experienced with visual stimuli were partially alleviated with multimodal stimuli. (2) Developmental age was significantly associated with emotion recognition in TD children, whereas it was the case only for the multimodal task in children with ASD. (3) Language impairments tended to be associated with emotion recognition scores of ASD children in the auditory modality. Conversely, in the visual or bimodal (visuo-auditory) tasks, the impact of developmental coordination disorder or neuro-visual impairments was not found. We conclude that impaired emotion processing constitutes a dimension to explore in the field of ASD, as research has the potential to define more homogeneous subgroups and tailored interventions. However, it is clear that developmental age, the nature of the stimuli, and other developmental comorbidities must also be taken into account when studying this dimension. PMID:26733928
ERIC Educational Resources Information Center
Pooley, Robert C.; Golub, Lester S.
Emphasizing the behavioral and social aspects of language as a foundation for instruction, 16 concepts for learning the structure of English in grades 7-9 are outlined in an attempt to set down in logical order the basic concepts involved in the understanding of the English language. The concepts begin with a recognition of the social purposes of…
Identification of related gene/protein names based on an HMM of name variations.
Yeganova, L; Smith, L; Wilbur, W J
2004-04-01
Gene and protein names follow few, if any, true naming conventions and are subject to great variation in different occurrences of the same name. This gives rise to two important problems in natural language processing. First, can one locate the names of genes or proteins in free text, and second, can one determine when two names denote the same gene or protein? The first of these problems is a special case of the problem of named entity recognition, while the second is a special case of the problem of automatic term recognition (ATR). We study the second problem, that of gene or protein name variation. Here we describe a system which, given a query gene or protein name, identifies related gene or protein names in a large list. The system is based on a dynamic programming algorithm for sequence alignment in which the mutation matrix is allowed to vary under the control of a fully trainable hidden Markov model.
Discovering latent commercial networks from online financial news articles
NASA Astrophysics Data System (ADS)
Xia, Yunqing; Su, Weifeng; Lau, Raymond Y. K.; Liu, Yi
2013-08-01
Unlike most online social networks where explicit links among individual users are defined, the relations among commercial entities (e.g. firms) may not be explicitly declared in commercial Web sites. One main contribution of this article is the development of a novel computational model for the discovery of the latent relations among commercial entities from online financial news. More specifically, a CRF model which can exploit both structural and contextual features is applied to commercial entity recognition. In addition, a point-wise mutual information (PMI)-based unsupervised learning method is developed for commercial relation identification. To evaluate the effectiveness of the proposed computational methods, a prototype system called CoNet has been developed. Based on the financial news articles crawled from Google finance, the CoNet system achieves average F-scores of 0.681 and 0.754 in commercial entity recognition and commercial relation identification, respectively. Our experimental results confirm that the proposed shallow natural language processing methods are effective for the discovery of latent commercial networks from online financial news.
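The point-wise mutual information idea mentioned above can be illustrated with a small sketch. It assumes document-level co-occurrence of already-recognized entities, which may differ from the features the authors actually used; the firm names and the toy corpus are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(documents):
    """Compute PMI for every entity pair that co-occurs in a document.

    documents: list of sets of entity names found in each article.
    PMI(a, b) = log2( P(a, b) / (P(a) * P(b)) ), with probabilities
    estimated as document frequencies.
    """
    n = len(documents)
    entity_counts = Counter()
    pair_counts = Counter()
    for entities in documents:
        entity_counts.update(entities)
        pair_counts.update(combinations(sorted(entities), 2))
    scores = {}
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = entity_counts[a] / n
        p_b = entity_counts[b] / n
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Toy corpus: each set holds the (hypothetical) firms named in one article.
articles = [
    {"AlphaCorp", "BetaBank"},
    {"AlphaCorp", "BetaBank"},
    {"AlphaCorp", "GammaLtd"},
    {"GammaLtd"},
]
scores = pmi_scores(articles)
```

Pairs with positive PMI co-occur more often than chance would predict and are candidates for a latent relation; in practice a frequency threshold is usually added because PMI is unstable for rare pairs.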
Rieffe, Carolien; Wiefferink, Carin H
2017-03-01
The capacity for emotion recognition and understanding is crucial for daily social functioning. We examined to what extent this capacity is impaired in young children with a Language Impairment (LI). In typical development, children learn to recognize emotions in faces and situations through social experiences and social learning. Children with LI have less access to these experiences and are therefore expected to fall behind their peers without LI. In this study, 89 preschool children with LI and 202 children without LI (mean age 3 years and 10 months in both groups) were tested on three indices for facial emotion recognition (discrimination, identification, and attribution in emotion evoking situations). Parents reported on their children's emotion vocabulary and ability to talk about their own emotions. Preschoolers with and without LI performed similarly on the non-verbal task for emotion discrimination. Children with LI fell behind their peers without LI on both other tasks for emotion recognition that involved labelling the four basic emotions (happy, sad, angry, fear). The outcomes of these two tasks were also related to children's level of emotion language. These outcomes emphasize the importance of 'emotion talk' at the youngest age possible for children with LI. Copyright © 2017 Elsevier Ltd. All rights reserved.
Fengler, Ineke; Delfau, Pia-Céline; Röder, Brigitte
2018-04-01
It is yet unclear whether congenitally deaf cochlear implant (CD CI) users' visual and multisensory emotion perception is influenced by their history in sign language acquisition. We hypothesized that early-signing CD CI users, relative to late-signing CD CI users and hearing, non-signing controls, show better facial expression recognition and rely more on the facial cues of audio-visual emotional stimuli. Two groups of young adult CD CI users-early signers (ES CI users; n = 11) and late signers (LS CI users; n = 10)-and a group of hearing, non-signing, age-matched controls (n = 12) performed an emotion recognition task with auditory, visual, and cross-modal emotionally congruent and incongruent speech stimuli. On different trials, participants categorized either the facial or the vocal expressions. The ES CI users more accurately recognized affective prosody than the LS CI users in the presence of congruent facial information. Furthermore, the ES CI users, but not the LS CI users, gained more than the controls from congruent visual stimuli when recognizing affective prosody. Both CI groups performed overall worse than the controls in recognizing affective prosody. These results suggest that early sign language experience affects multisensory emotion perception in CD CI users.
Kollmeier, Birger; Schädler, Marc René; Warzybok, Anna; Meyer, Bernd T; Brand, Thomas
2016-09-07
To characterize the individual patient's hearing impairment as obtained with the matrix sentence recognition test, a simulation Framework for Auditory Discrimination Experiments (FADE) is extended here using the Attenuation and Distortion (A+D) approach by Plomp as a blueprint for setting the individual processing parameters. FADE has been shown to predict the outcome of both speech recognition tests and psychoacoustic experiments based on simulations using an automatic speech recognition system requiring only a few assumptions. It builds on the closed-set matrix sentence recognition test, which is advantageous for testing individual speech recognition in a way comparable across languages. Individual predictions of speech recognition thresholds in stationary and in fluctuating noise were derived using the audiogram and an estimate of the internal level uncertainty for modeling the individual Plomp curves fitted to the data with the Attenuation (A-) and Distortion (D-) parameters of the Plomp approach. The "typical" audiogram shapes from Bisgaard et al., with or without a "typical" level uncertainty, and the individual data were used for individual predictions. As a result, the individualization of the level uncertainty was found to be more important than the exact shape of the individual audiogram to accurately model the outcome of the German Matrix test in stationary or fluctuating noise for listeners with hearing impairment. The prediction accuracy of the individualized approach also outperforms the (modified) Speech Intelligibility Index approach, which is based on the individual threshold data only. © The Author(s) 2016.
Object recognition with severe spatial deficits in Williams syndrome: sparing and breakdown.
Landau, Barbara; Hoffman, James E; Kurz, Nicole
2006-07-01
Williams syndrome (WS) is a rare genetic disorder that results in severe visual-spatial cognitive deficits coupled with relative sparing in language, face recognition, and certain aspects of motion processing. Here, we look for evidence of sparing or impairment in another cognitive system: object recognition. Children with WS, normal mental-age (MA) and chronological age-matched (CA) children, and normal adults viewed pictures of a large range of objects briefly presented under various conditions of degradation, including canonical and unusual orientations, and clear or blurred contours. Objects were shown as either full-color views (Experiment 1) or line drawings (Experiment 2). Across both experiments, WS and MA children performed similarly in all conditions, while CA children performed better than both the WS and MA groups with unusual views. This advantage, however, was eliminated when images were also blurred. The error types and relative difficulty of different objects were similar across all participant groups. The results indicate selective sparing of basic mechanisms of object recognition in WS, together with developmental delay or arrest in recognition of objects from unusual viewpoints. These findings are consistent with the growing literature on brain abnormalities in WS, which points to selective impairment in the parietal areas of the brain. As a whole, the results lend further support to the growing literature on the functional separability of object recognition mechanisms from other spatial functions, and raise intriguing questions about the link between genetic deficits and cognition.
Background feature descriptor for offline handwritten numeral recognition
NASA Astrophysics Data System (ADS)
Ming, Delie; Wang, Hao; Tian, Tian; Jie, Feiran; Lei, Bo
2011-11-01
This paper puts forward an offline handwritten numeral recognition method based on a background structural descriptor (a sixteen-value numerical background expression). By encoding the background pixels in the image according to a certain rule, 16 different eigenvalues are generated; these reflect the background condition of every digit and, in turn, the structural features of the digits. Through a pattern language description of images by these features, automatic segmentation of overlapping digits and numeral recognition can be realized. The method is characterized by strong resistance to deformation, high recognition speed, and ease of implementation. Finally, the experimental results and conclusions are presented. Experiments on datasets from various practical application fields indicate that the method achieves good recognition performance.
NASA Astrophysics Data System (ADS)
Kaur, Jaswinder; Jagdev, Gagandeep, Dr.
2018-01-01
Optical character recognition is concerned with the recognition of optically processed characters. The recognition is done offline after the writing or printing has been completed, unlike online recognition, where the computer has to recognize the characters instantly as they are drawn. The performance of character recognition depends upon the quality of the scanned documents. Preprocessing steps are used to remove low-frequency background noise and normalize the intensity of individual scanned documents. Several filters are used to reduce certain image details and enable an easier or faster evaluation. The primary aim of the research work is to recognize handwritten and machine-written characters and differentiate them. The language chosen for the research work is Punjabi Gurmukhi, and the tool used is Matlab.
Real-Time Hand Posture Recognition Using a Range Camera
NASA Astrophysics Data System (ADS)
Lahamy, Herve
The basic goal of human computer interaction is to improve the interaction between users and computers by making computers more usable and receptive to the user's needs. Within this context, the use of hand postures in replacement of traditional devices such as keyboards, mice and joysticks is being explored by many researchers. The goal is to interpret human postures via mathematical algorithms. Hand posture recognition has gained popularity in recent years, and could become the future tool for humans to interact with computers or virtual environments. An exhaustive description of the frequently used methods available in literature for hand posture recognition is provided. It focuses on the different types of sensors and data used, the segmentation and tracking methods, the features used to represent the hand postures as well as the classifiers considered in the recognition process. Those methods are usually presented as highly robust with a recognition rate close to 100%. However, a couple of critical points necessary for a successful real-time hand posture recognition system require major improvement. Those points include the features used to represent the hand segment, the number of postures simultaneously recognizable, the invariance of the features with respect to rotation, translation and scale and also the behavior of the classifiers against non-perfect hand segments for example segments including part of the arm or missing part of the palm. A 3D time-of-flight camera named SR4000 has been chosen to develop a new methodology because of its capability to provide in real-time and at high frame rate 3D information on the scene imaged. This sensor has been described and evaluated for its capability for capturing in real-time a moving hand. A new recognition method that uses the 3D information provided by the range camera to recognize hand postures has been proposed. 
The different steps of this methodology, including the segmentation, the tracking, the hand modeling and finally the recognition process, have been described and evaluated extensively. In addition, the performance of this method has been analyzed against several existing hand posture recognition techniques found in the literature. The proposed system is able to recognize, in real time and with an overall recognition rate of 98%, 18 out of the 33 postures of the American sign language alphabet. This recognition is translation, rotation and scale invariant.
On the Suitability of Mobile Cloud Computing at the Tactical Edge
2014-04-23
geolocation; Facial recognition (photo identification/classification); Intelligence, Surveillance, and Reconnaissance (ISR); and Fusion of Electronic...could benefit most from MCC are those with large processing overhead, low bandwidth requirements, and a need for large database support (e.g., facial ... recognition, language translation). The effect, specifically on the communication links, of supporting these applications at the tactical edge
Mandarin Chinese Tone Identification in Cochlear Implants: Predictions from Acoustic Models
Morton, Kenneth D.; Torrione, Peter A.; Throckmorton, Chandra S.; Collins, Leslie M.
2015-01-01
It has been established that current cochlear implants do not supply adequate spectral information for perception of tonal languages. Comprehension of a tonal language, such as Mandarin Chinese, requires recognition of lexical tones. New strategies of cochlear stimulation, such as variable stimulation rate and current steering, may provide the means of delivering more spectral information and thus may provide the auditory fine structure required for tone recognition. Several cochlear implant signal processing strategies are examined in this study: the continuous interleaved sampling (CIS) algorithm, the frequency amplitude modulation encoding (FAME) algorithm, and the multiple carrier frequency algorithm (MCFA). These strategies provide different types and amounts of spectral information. Pattern recognition techniques can be applied to data from Mandarin Chinese tone recognition tasks using acoustic models as a means of testing the abilities of these algorithms to transmit the changes in fundamental frequency indicative of the four lexical tones. The ability of processed Mandarin Chinese tones to be correctly classified may predict trends in the effectiveness of different signal processing algorithms in cochlear implants. The proposed techniques can predict trends in performance of the signal processing techniques in quiet conditions but fail to do so in noise. PMID:18706497
Distributed cooperating processes in a mobile robot control system
NASA Technical Reports Server (NTRS)
Skillman, Thomas L., Jr.
1988-01-01
A mobile inspection robot has been proposed for the NASA Space Station. It will be a free-flying autonomous vehicle that will leave a berthing unit to accomplish a variety of inspection tasks around the Space Station, and then return to its berth to recharge, refuel, and transfer information. The Flying Eye robot will receive voice communication to change its attitude, move at a constant velocity, and move to a predefined location along a self-generated path. This mobile robot control system requires integration of traditional command and control techniques with a number of AI technologies. Speech recognition, natural language understanding, task and path planning, sensory abstraction and pattern recognition are all required for successful implementation. The interface between the traditional numeric control techniques and the symbolic processing of the AI technologies must be developed, and a distributed computing approach will be needed to meet the real-time computing requirements. To study the integration of the elements of this project, a novel mobile robot control architecture and simulation based on the blackboard architecture was developed. The control system operation and structure are discussed.
Quantity Recognition among Speakers of an Anumeric Language
ERIC Educational Resources Information Center
Everett, Caleb; Madora, Keren
2012-01-01
Recent research has suggested that the Piraha, an Amazonian tribe with a number-less language, are able to match quantities greater than 3 if the matching task does not require recall or spatial transposition. This finding contravenes previous work among the Piraha. In this study, we re-tested the Pirahas' performance in the crucial one-to-one…
ERIC Educational Resources Information Center
Rahimi, Mehrak; Karkami, Fatemeh Hosseini
2015-01-01
This study investigated the role of EFL teachers' classroom discipline strategies in their teaching effectiveness and their students' motivation and achievement in learning English as a foreign language. 1408 junior high-school students expressed their perceptions of the strategies their English teachers used (punishment, recognition/reward,…
ERIC Educational Resources Information Center
Guttinger, Hellen I., Ed.
The reading improvement activities in this handbook are intended for use by middle school language arts teachers. Focusing on study skills, vocabulary development, and comprehension development, the activities include (1) surveying literary materials, (2) outlining, (3) spelling, (4) syllabication, (5) word recognition, (6) using synonyms, (7)…
ERIC Educational Resources Information Center
Diesendruck, Gil
2003-01-01
Drawing on the notion of the domain-specificity of recognition, reviews evidence on the effect of language in classification of and reasoning about categories from different domains. Looks at anthropological, infant classification, and preschool categorization literature. Suggests the causal nature and indicative power of animal categories seem to…
Reading Comprehension in Autism Spectrum Disorders: The Role of Oral Language and Social Functioning
ERIC Educational Resources Information Center
Ricketts, Jessie; Jones, Catherine R. G.; Happe, Francesca; Charman, Tony
2013-01-01
Reading comprehension is an area of difficulty for many individuals with autism spectrum disorders (ASD). According to the Simple View of Reading, word recognition and oral language are both important determinants of reading comprehension ability. We provide a novel test of this model in 100 adolescents with ASD of varying intellectual ability.…
Federal Recognition of the Rights of Minority Language Groups.
ERIC Educational Resources Information Center
Leibowitz, Arnold H.
Federal laws, policies, and court decisions pertaining to the civil rights of minority language groups are reviewed, with an emphasis on political, legal, economic, and educational access. Areas in which progress has been made and those in which access is still limited are identified. It is argued that a continuing federal role is necessary to…
ERIC Educational Resources Information Center
van Jaarsveld, Pieter
2016-01-01
Pre-service secondary mathematics teachers have a poor command of the exact language of mathematics as evidenced in assignments, micro-lessons and practicums. The unrelenting notorious annual South African National Senior Certificate outcomes in mathematics and the recognition by the Department of Basic Education (DBE) that the correct use of…