Science.gov

Sample records for lipreading

  1. Script activation in lipreading.

    PubMed

    Samuelsson, S; Rönnberg, J

    1991-01-01

    In three experiments, the roles of script, basic, and low-level contexts in lipreading (Abbott et al., 1985) were studied. Three scripts were used: a restaurant, a railway station, and a clothing store. Presence or absence of script headers constituted a between-subjects factor. Basic and low-level context, as well as test sentences (referring to the basic and low levels of abstraction), were within-subjects factors. In addition, different kinds of working memory tests were employed to assess potentially important information-processing skills for lipreading. The results show that (a) lipreaders benefit from compatibility between context level and sentence type, (b) the direction of inferences was asymmetric, such that the basic context prevented inferences at the low level of abstraction but not vice versa, (c) low-level context in general was most useful for skilled lipreaders in the script group, (d) skilled lipreaders in the non-script group took most advantage of basic-level sentences, and (e) working memory capacity did not prove to be important for lipreading skill. Guessing and pure-lipreading control experiments established that each of these potentially confounding factors accounted for only about 5% of lipreading performance and thus cannot explain the obtained interactive effects between the visual information conveyed by lip movements and context. Theoretical accounts of inference-making based on prototypicality vs. scripts were critically evaluated, and a predictive script-activation mechanism, relevant to lipreading skill, was suggested. PMID:2068548

  2. Lipreading for the Deaf and Hard of Hearing.

    ERIC Educational Resources Information Center

    New York City Board of Education, Brooklyn, NY. Bureau of Curriculum Development.

    The guide provides fundamental principles of lipreading instruction and 38 sequential formal lessons in lipreading for use at the junior high or older level. It stresses that lipreading training aims to develop the understanding of words, phrases, and sentences rather than the study of exact lip movements. The lipreading ability of the child is…

  3. Recent developments in automated lip-reading

    NASA Astrophysics Data System (ADS)

    Bowden, Richard; Cox, Stephen; Harvey, Richard; Lan, Yuxuan; Ong, Eng-Jon; Owen, Gari; Theobald, Barry-John

    2013-10-01

    Human lip-readers are increasingly being presented as useful in the gathering of forensic evidence but, like all humans, suffer from unreliability. Here we report the results of a long-term study in automatic lip-reading with the objective of converting video-to-text (V2T). The V2T problem is surprising in that some aspects that look tricky, such as real-time tracking of the lips on poor-quality interlaced video from hand-held cameras, prove to be relatively tractable, whereas the problem of speaker-independent lip-reading is very demanding due to unpredictable variations between people. Here we review the problem of automatic lip-reading for crime fighting and identify the critical parts of the problem.

  4. Lipreading in School-Age Children: The Roles of Age, Hearing Status, and Cognitive Ability

    ERIC Educational Resources Information Center

    Tye-Murray, Nancy; Hale, Sandra; Spehar, Brent; Myerson, Joel; Sommers, Mitchell S.

    2014-01-01

    Purpose: The study addressed three research questions: Does lipreading improve between the ages of 7 and 14 years? Does hearing loss affect the development of lipreading? How do individual differences in lipreading relate to other abilities? Method: Forty children with normal hearing (NH) and 24 with hearing loss (HL) were tested using 4…

  5. Lip-reading abilities in a subject with congenital prosopagnosia.

    PubMed

    Wathour, J; Decat, M; Vander Linden, F; Deggouj, N

    2015-01-01

    We present the case of an individual with congenital prosopagnosia, or "face blindness", a disorder in which the ability to recognize faces is impaired. We studied the lip-reading ability and audiovisual perception of this subject using a DVD with four conditions (audiovisual congruent, auditory, visual, and audiovisual incongruent) and compared the results with those of a cohort of normal controls. The patient had no correct responses in the visual lip-reading task, whereas he improved in the audiovisual congruent task. In the audiovisual incongruent task, the patient provided one response, showing that he was able to make use of lip-read (labial) information. This patient perceived only global dynamic facial movements, not fine ones. He made sufficient complementary use of lip-reading in audiovisual tasks, but not in visual-only ones. These data are consistent with abnormal development of the pathways used for visual speech perception, associated with second-order face processing disorders, and normal development of the audiovisual network for speech perception. PMID:26513947

  6. Tactual display of consonant voicing as a supplement to lipreading

    NASA Astrophysics Data System (ADS)

    Yuan, Hanfeng; Reed, Charlotte M.; Durlach, Nathaniel I.

    2005-08-01

    This research is concerned with the development and evaluation of a tactual display of consonant voicing to supplement the information available through lipreading for persons with profound hearing impairment. The voicing cue selected is based on the envelope onset asynchrony derived from two different filtered bands (a low-pass band and a high-pass band) of speech. The amplitude envelope of each of the two bands was used to modulate a different carrier frequency which in turn was delivered to one of the two fingers of a tactual stimulating device. Perceptual evaluations of speech reception through this tactual display included the pairwise discrimination of consonants contrasting voicing and identification of a set of 16 consonants under conditions of the tactual cue alone (T), lipreading alone (L), and the combined condition (L+T). The tactual display was highly effective for discriminating voicing at the segmental level and provided a substantial benefit to lipreading on the consonant-identification task. No such benefits of the tactual cue were observed, however, for lipreading of words in sentences due perhaps to difficulties in integrating the tactual and visual cues and to insufficient training on the more difficult task of connected-speech reception.
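
    The signal path described in this abstract (two filtered bands, envelope extraction, carrier modulation delivered to two fingers, and an envelope-onset-asynchrony voicing cue) can be sketched in a few lines. The code below is an editor's illustrative reconstruction only; the band edges, carrier frequencies, and onset threshold are assumptions, not the parameters used in the study.

      # Editor's sketch (not the study's implementation) of the two-band envelope
      # pipeline described above. Band edges, carrier frequencies, and the onset
      # threshold are assumptions chosen for the example.
      import numpy as np
      from scipy.signal import butter, sosfilt, hilbert

      def band_envelope(x, fs, lo, hi, order=4):
          """Band-pass filter the speech signal and return its amplitude envelope."""
          sos = butter(order, [lo, hi], btype="band", fs=fs, output="sos")
          return np.abs(hilbert(sosfilt(sos, x)))

      def tactual_drive_signals(speech, fs, f_low_carrier=100.0, f_high_carrier=250.0):
          """Two carrier signals, one per finger, amplitude-modulated by the envelopes
          of an assumed low band and an assumed high band of the speech."""
          t = np.arange(len(speech)) / fs
          env_lo = band_envelope(speech, fs, 80.0, 500.0)      # assumed low band
          env_hi = band_envelope(speech, fs, 2000.0, 6000.0)   # assumed high band
          return (env_lo * np.sin(2 * np.pi * f_low_carrier * t),
                  env_hi * np.sin(2 * np.pi * f_high_carrier * t))

      def onset_asynchrony(speech, fs, thresh_ratio=0.2):
          """Envelope onset asynchrony (seconds): high-band onset minus low-band onset,
          the consonant-voicing cue the display is built around."""
          env_lo = band_envelope(speech, fs, 80.0, 500.0)
          env_hi = band_envelope(speech, fs, 2000.0, 6000.0)
          onset = lambda e: np.argmax(e > thresh_ratio * e.max()) / fs
          return onset(env_hi) - onset(env_lo)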

  7. The Story of Lip-Reading; Its Genesis and Development.

    ERIC Educational Resources Information Center

    DeLand, Fred; Montague, Harriet Andrews

    The historical developments of the use of lipreading from 1500 A.D. to 1931 are described. Education of the deaf is traced from its beginnings in Spain to England, Belgium, Holland, and France with the use of quotations from literature and old documents. The lives and works of Charles Michel de l'Epee and Samuel Heinicke, the beginning of…

  8. Recalibration of Phonetic Categories by Lipread Speech: Measuring Aftereffects after a 24-Hour Delay

    ERIC Educational Resources Information Center

    Vroomen, Jean; Baart, Martijn

    2009-01-01

    Listeners hearing an ambiguous speech sound flexibly adjust their phonetic categories in accordance with lipread information telling what the phoneme should be (recalibration). Here, we tested the stability of lipread-induced recalibration over time. Listeners were exposed to an ambiguous sound halfway between /t/ and /p/ that was dubbed onto a…

  9. Cross-modal Informational Masking of Lipreading by Babble.

    PubMed

    Myerson, Joel; Spehar, Brent; Tye-Murray, Nancy; Van Engen, Kristin; Hale, Sandra; Sommers, Mitchell S

    2016-01-01

    Whereas the energetic and informational masking effects of unintelligible babble on auditory speech recognition are well established, the present study is the first to investigate its effects on visual speech recognition. Young and older adults performed two lipreading tasks while simultaneously experiencing either quiet, speech-shaped noise, or 6-talker background babble. Both words at the end of uninformative carrier sentences and key words in everyday sentences were harder to lipread in the presence of babble than in the presence of speech-shaped noise or quiet. Contrary to the inhibitory deficit hypothesis of cognitive aging, babble had equivalent effects on young and older adults. In a follow-up experiment, neither the babble nor the speech-shaped noise stimuli interfered with performance of a face-processing task, indicating that babble selectively interferes with visual speech recognition and not with visual perception tasks per se. The present results demonstrate that babble can produce cross-modal informational masking and suggest a breakdown in audiovisual scene analysis, either because of obligatory monitoring of even uninformative speech sounds or because of obligatory efforts to integrate speech sounds even with uncorrelated mouth movements. PMID:26474981

  10. Experience with a talker can transfer across modalities to facilitate lipreading.

    PubMed

    Sanchez, Kauyumari; Dias, James W; Rosenblum, Lawrence D

    2013-10-01

    Rosenblum, Miller, and Sanchez (Psychological Science, 18, 392-396, 2007) found that subjects first trained to lip-read a particular talker were then better able to perceive the auditory speech of that same talker, as compared with that of a novel talker. This suggests that the talker experience a perceiver gains in one sensory modality can be transferred to another modality to make that speech easier to perceive. An experiment was conducted to examine whether this cross-sensory transfer of talker experience could occur (1) from auditory to lip-read speech, (2) with subjects not screened for adequate lipreading skill, (3) when both a familiar and an unfamiliar talker are presented during lipreading, and (4) for both old (presentation set) and new words. Subjects were first asked to identify a set of words from a talker. They were then asked to perform a lipreading task from two faces, one of which was of the same talker they heard in the first phase of the experiment. Results revealed that subjects who lip-read from the same talker they had heard performed better than those who lip-read a different talker, regardless of whether the words were old or new. These results add further evidence that learning of amodal talker information can facilitate speech perception across modalities and also suggest that this information is not restricted to previously heard words. PMID:23955059

  11. Some observations on computer lip-reading: moving from the dream to the reality

    NASA Astrophysics Data System (ADS)

    Bear, Helen L.; Owen, Gari; Harvey, Richard; Theobald, Barry-John

    2014-10-01

    In the quest for greater computer lip-reading performance there are a number of tacit assumptions which are either present in the datasets (high resolution for example) or in the methods (recognition of spoken visual units called "visemes" for example). Here we review these and other assumptions and show the surprising result that computer lip-reading is not heavily constrained by video resolution, pose, lighting and other practical factors. However, the working assumption that visemes, which are the visual equivalent of phonemes, are the best unit for recognition does need further examination. We conclude that visemes, which were defined over a century ago, are unlikely to be optimal for a modern computer lip-reading system.
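
    As a concrete illustration of why visemes are a lossy unit, the sketch below collapses phonemes into a small set of visually similar classes. The grouping is a generic example chosen by the editor, not the viseme inventory examined in the paper.

      # Editor's illustration: several phonemes that look alike on the lips collapse
      # into one visual class, which is why viseme-based recognition discards
      # information a phoneme-based recognizer could exploit.
      VISEME_CLASSES = {
          "bilabial":    ["p", "b", "m"],
          "labiodental": ["f", "v"],
          "dental":      ["th", "dh"],
          "rounded":     ["w", "r"],
          "alveolar":    ["t", "d", "s", "z", "n", "l"],
          "velar":       ["k", "g", "ng"],
      }
      PHONEME_TO_VISEME = {p: v for v, group in VISEME_CLASSES.items() for p in group}

      def to_visemes(phonemes):
          """Map a phoneme sequence to its viseme sequence (information is lost)."""
          return [PHONEME_TO_VISEME.get(p, "other") for p in phonemes]

      # "pat", "bat", and "mat" become identical at the viseme level:
      print(to_visemes(["p", "ae", "t"]))
      print(to_visemes(["b", "ae", "t"]))
      print(to_visemes(["m", "ae", "t"]))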

  12. Lipreading and covert speech production similarly modulate human auditory-cortex responses to pure tones.

    PubMed

    Kauramäki, Jaakko; Jääskeläinen, Iiro P; Hari, Riitta; Möttönen, Riikka; Rauschecker, Josef P; Sams, Mikko

    2010-01-27

    Watching the lips of a speaker enhances speech perception. At the same time, the 100 ms response to speech sounds is suppressed in the observer's auditory cortex. Here, we used whole-scalp 306-channel magnetoencephalography (MEG) to study whether lipreading modulates human auditory processing already at the level of the most elementary sound features, i.e., pure tones. We further examined the temporal dynamics of the suppression to determine whether the effect is driven by top-down influences. Nineteen subjects were presented with 50 ms tones spanning six octaves (125-8000 Hz) (1) during "lipreading," i.e., when they watched video clips of silent articulations of Finnish vowels /a/, /i/, /o/, and /y/, and reacted to vowels presented twice in a row; (2) during a visual control task; (3) during a still-face passive control condition; and (4) in a separate experiment with a subset of nine subjects, during covert production of the same vowels. Auditory-cortex 100 ms responses (N100m) were equally suppressed in the lipreading and covert-speech-production tasks compared with the visual control and baseline tasks; the effects involved all frequencies and were most prominent in the left hemisphere. Responses to tones presented at different times with respect to the onset of the visual articulation showed significantly increased N100m suppression immediately after the articulatory gesture. These findings suggest that the lipreading-related suppression in the auditory cortex is caused by top-down influences, possibly by an efference copy from the speech-production system, generated during both own speech and lipreading. PMID:20107058

  13. Lipreading sentences with vibrotactile vocoders: performance of normal-hearing and hearing-impaired subjects.

    PubMed

    Bernstein, L E; Demorest, M E; Coulter, D C; O'Connell, M P

    1991-12-01

    Three vibrotactile vocoders were compared in a training study involving several different speech perception tasks. Vocoders were: (1) the Central Institute for the Deaf version of the Queen's University vocoder, with 1/3-oct filter spacing and logarithmic output scaling (CIDLog) [Engebretson and O'Connell, IEEE Trans. Biomed. Eng. BME-33, 712-716 (1986)]; (2) the same vocoder with linear output scaling (CIDLin); and (3) the Gallaudet University vocoder designed with greater resolution in the second formant region, relative to the CID vocoders, and linear output scaling (GULin). Four normal-hearing subjects were assigned to either of two control groups, visual-only control and vocoder control, for which they received the CIDLog vocoder. Five normal-hearing and four hearing-impaired subjects were assigned to the linear vocoders. Results showed that the three vocoders provided equivalent information in word-initial and word-final tactile-only consonant identification. However, GULin was the only vocoder significantly effective in enhancing lipreading of isolated prerecorded sentences. Individual subject analyses showed significantly enhanced lipreading by the three normal-hearing and two hearing-impaired subjects who received the GULin vocoder. Over the entire training period of the experiment, the mean difference between aided and unaided lipreading of sentences by the GULin aided hearing-impaired subjects was approximately 6% words correct. Possible explanations for failure to confirm previous success with the CIDLog vocoder [Weisenberger et al., J. Acoust. Soc. Am. 86, 1764-1775 (1989)] are discussed. PMID:1838561

  14. Recalibration of auditory phonemes by lipread speech is ear-specific.

    PubMed

    Keetels, Mirjam; Pecoraro, Mauro; Vroomen, Jean

    2015-08-01

    Listeners quickly learn to label an ambiguous speech sound if there is lipread information that tells what the sound should be (i.e., phonetic recalibration Bertelson, Vroomen, & de Gelder (2003)). We report the counter-intuitive result that the same ambiguous sound can be simultaneously adapted to two opposing phonemic interpretations if presented in the left and right ear. This is strong evidence against the notion that phonetic recalibration involves an adjustment of abstract phoneme boundaries. It rather supports the idea that phonetic recalibration is closely tied to the sensory specifics of the learning context. PMID:25981732

  15. There Goes the Neighborhood: Lipreading and the Structure of the Mental Lexicon

    PubMed Central

    Feld, Julia; Sommers, Mitchell

    2010-01-01

    A central question in spoken word recognition research is whether words are recognized relationally, in the context of other words in the mental lexicon [1, 2]. The current research evaluated metrics for measuring the influence of the mental lexicon on visually perceived (lipread) spoken word recognition. Lexical competition (the extent to which perceptually similar words influence recognition of a stimulus word) was quantified using metrics that are well-established in the literature, as well as a novel statistical method for calculating perceptual confusability, based on the Phi-square statistic. The Phi-square statistic proved an effective measure for assessing lexical competition and explained significant variance in visual spoken word recognition beyond that accounted for by traditional metrics. Because these values include the influence of all words in the lexicon (rather than only perceptually very similar words), it suggests that even perceptually distant words may receive some activation, and therefore provide competition, during spoken word recognition. This work supports and extends earlier research [3] that proposed a common recognition system underlying auditory and visual spoken word recognition and provides support for the use of the Phi-square statistic for quantifying lexical competition. PMID:21170172
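
    A minimal sketch of how a Phi-square-style confusability measure might be computed from two words' response distributions is given below; the exact formulation and normalization used in the study may differ, and the data structures are the editor's assumptions.

      # Editor's sketch of a Phi-square-style dissimilarity between two words'
      # response (confusion) distributions, and a simple competition sum over a lexicon.
      import numpy as np

      def phi_square(p, q):
          """Dissimilarity between two response-probability distributions over the same
          response categories: 0 for identical distributions, 1 for disjoint ones."""
          p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
          s = p + q
          mask = s > 0                      # avoid division by zero
          return 0.5 * float(np.sum((p[mask] - q[mask]) ** 2 / s[mask]))

      def lexical_competition(target_dist, lexicon_dists):
          """Total perceptual similarity (1 - phi_square) between a target word and every
          word in the lexicon; larger values mean denser visual lexical competition."""
          return float(sum(1.0 - phi_square(target_dist, d) for d in lexicon_dists))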

  16. Lip-reading the BKB sentence lists: corrections for list and practice effects.

    PubMed

    Foster, J R; Summerfield, A Q; Marshall, D H; Palmer, L; Ball, V; Rosen, S

    1993-08-01

    Two groups of 21 adult subjects with normal hearing viewed the video recordings of the Bamford-Kowal-Bench standard sentence lists issued by the EPI Group in 1986. Each subject viewed all of the 21 lists and attempted to write down the words contained in each sentence. One group lip-read the lists with no sound (the LR:alone condition). The other group also heard a sequence of acoustic pulses which were synchronized to the moments when the talker's vocal folds closed (the LR&Lx condition). Performance was assessed both by loose (KW(L)) and by tight (KW(T)) keyword scoring methods. Both scoring methods produced the same pattern of results: performance was better in the LR&Lx condition; performance in both conditions improved linearly with the logarithm of the list presentation order number; subjects who produced higher overall scores also improved more with experience of the lists. The data were described well by a logistic regression model which provided a formula which can be used to compensate for practice effects and for differences in difficulty between lists. Two simpler, but less accurate, methods for compensating for variation in inter-list difficulty are also described. A figure is provided which can be used to assess the significance of the difference between a pair of scores obtained from a single subject in any pair of presentation conditions. PMID:8312846
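
    The kind of correction described in this abstract can be illustrated with a grouped-binomial logistic regression of keywords correct on log presentation order, list identity, and condition. This is an editor's sketch under assumed column names, not the authors' model specification.

      # Editor's sketch of a logistic-regression correction for list difficulty and
      # practice effects. Column names and predictor coding are assumptions.
      import numpy as np
      import pandas as pd
      import statsmodels.api as sm

      def fit_practice_model(df: pd.DataFrame):
          """df: one row per (subject, list) with assumed columns 'correct' (keywords
          right), 'total' (keywords in the list), 'order' (presentation position, 1-21),
          'list_id' (which BKB list), and 'condition' ('LR_alone' or 'LR_Lx')."""
          endog = np.column_stack([df["correct"], df["total"] - df["correct"]])
          exog = pd.get_dummies(df[["list_id", "condition"]].astype(str), drop_first=True)
          exog["log_order"] = np.log(df["order"].to_numpy())   # practice-effect term
          exog = sm.add_constant(exog.astype(float))
          return sm.GLM(endog, exog, family=sm.families.Binomial()).fit()

      # The fitted coefficients can then be used to adjust a raw score for list
      # difficulty and for how late in the session the list was presented.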

  17. Visual words for lip-reading

    NASA Astrophysics Data System (ADS)

    Hassanat, Ahmad B. A.; Jassim, Sabah

    2010-04-01

    In this paper, the automatic lip reading problem is investigated and an innovative approach to solving it is proposed. This new VSR (visual speech recognition) approach depends on the signature of the word itself, obtained from a hybrid feature-extraction method based on geometric, appearance, and image-transform features. The proposed VSR approach is termed "visual words". The visual words approach consists of two main parts: 1) feature extraction/selection, and 2) visual speech feature recognition. After localizing the face and lips, several visual features of the lips were extracted, such as the height and width of the mouth; the mutual information and a quality measure between the DWT of the current ROI and the DWT of the previous ROI; the ratio of vertical to horizontal features taken from the DWT of the ROI; the ratio of vertical edges to horizontal edges of the ROI; the appearance of the tongue; and the appearance of the teeth. Each spoken word is represented by 8 signals, one for each feature. These signals preserve the dynamics of the spoken word, which carry a substantial portion of the information. The system is then trained on these features using KNN and DTW. This approach has been evaluated on a large database of different speakers and large experiment sets. The evaluation has demonstrated the efficiency of the visual words approach and shown that VSR is a speaker-dependent problem.
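
    The recognition back-end named in this abstract (per-word feature signals compared with DTW and classified with KNN) can be sketched as follows; feature extraction is omitted, and the distance and voting details are the editor's assumptions rather than the paper's implementation.

      # Editor's sketch: each word is a multichannel time series of lip-feature signals,
      # compared with dynamic time warping (DTW) and labeled by k-nearest neighbours (KNN).
      import numpy as np
      from collections import Counter

      def dtw_distance(a, b):
          """DTW distance between two feature sequences of shape (T, n_features)."""
          n, m = len(a), len(b)
          D = np.full((n + 1, m + 1), np.inf)
          D[0, 0] = 0.0
          for i in range(1, n + 1):
              for j in range(1, m + 1):
                  cost = np.linalg.norm(a[i - 1] - b[j - 1])
                  D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
          return float(D[n, m])

      def knn_dtw_classify(query, train_signals, train_labels, k=3):
          """Label a query word signal by the majority label of its k DTW-nearest
          training examples."""
          dists = [dtw_distance(query, s) for s in train_signals]
          nearest = np.argsort(dists)[:k]
          return Counter(train_labels[i] for i in nearest).most_common(1)[0][0]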

  18. Study of lip-reading detecting and locating technique

    NASA Astrophysics Data System (ADS)

    Wang, Lirong; Li, Jie; Zhao, Yanyan

    2008-03-01

    With the development of human computer interaction, lip reading technology has become a focus topic in the multimodal technology field. However, detecting and locating the lips accurately is very difficult because of differences in lip contours between people, varied illumination conditions, head movements, and other factors. We propose a method that extracts the lip contour from facial images using an adaptive chromatic filter based on lip color. The method is not sensitive to illumination; an appropriate chromatic lip filter is obtained by analyzing the color of the entire face and the clustering statistics of lip color. The paper also proposes a combined preprocessing step that rotates the face to a canonical angle and improves image contrast, then analyzes the clustering characteristics of skin color and lip color in the lip region to obtain an adaptive chromatic filter that makes the lips prominent in the facial image. This method copes with varied illumination and inclined faces. Experiments showed that coarse detection of the lip region improved the accuracy of lip detection and location, laying a good foundation for subsequent lip-feature extraction and lip tracking.
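
    The general idea of a chromatic lip filter can be illustrated with a simple pseudo-hue score thresholded from face-wide statistics; the sketch below is a hedged stand-in chosen by the editor, not the paper's clustering-based adaptive filter.

      # Editor's illustration: lip pixels are redder relative to green than surrounding
      # skin, so a pseudo-hue score thresholded from face-wide statistics highlights
      # the lips. The transform and threshold here are assumptions.
      import numpy as np

      def lip_score(face_rgb):
          """face_rgb: (H, W, 3) uint8 face image. Returns an (H, W) score in [0, 1]."""
          rgb = face_rgb.astype(float) + 1e-6
          pseudo_hue = rgb[..., 0] / (rgb[..., 0] + rgb[..., 1])   # R / (R + G)
          lo, hi = pseudo_hue.min(), pseudo_hue.max()
          return (pseudo_hue - lo) / (hi - lo + 1e-6)

      def lip_mask(face_rgb, n_std=1.5):
          """Adaptive threshold: keep pixels whose score sits well above the face-wide
          mean, approximating a per-face, illumination-tolerant filter."""
          score = lip_score(face_rgb)
          return score > score.mean() + n_std * score.std()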

  19. Effects of Context Type on Lipreading and Listening Performance and Implications for Sentence Processing

    PubMed Central

    Goebel, Stacey; Tye-Murray, Nancy

    2015-01-01

    Purpose This study compared the use of 2 different types of contextual cues (sentence based and situation based) in 2 different modalities (visual only and auditory only). Method Twenty young adults were tested with the Illustrated Sentence Test (Tye-Murray, Hale, Spehar, Myerson, & Sommers, 2014) and the Speech Perception in Noise Test (Bilger, Nuetzel, Rabinowitz, & Rzeczkowski, 1984; Kalikow, Stevens, & Elliott, 1977) in the 2 modalities. The Illustrated Sentences Test presents sentences with no context and sentences accompanied by picture-based situational context cues. The Speech Perception in Noise Test presents sentences with low sentence-based context and sentences with high sentence-based context. Results Participants benefited from both types of context and received more benefit when testing occurred in the visual-only modality than when it occurred in the auditory-only modality. Participants' use of sentence-based context did not correlate with use of situation-based context. Cue usage did not correlate between the 2 modalities. Conclusions The ability to use contextual cues appears to be dependent on the type of cue and the presentation modality of the target word(s). In a theoretical sense, the results suggest that models of word recognition and sentence processing should incorporate the influence of multiple sources of information and recognize that the 2 types of context have different influences on speech perception. In a clinical sense, the results suggest that aural rehabilitation programs might provide training to optimize use of both kinds of contextual cues. PMID:25863923

  20. Lipreading and audiovisual speech recognition across the adult lifespan: Implications for audiovisual integration.

    PubMed

    Tye-Murray, Nancy; Spehar, Brent; Myerson, Joel; Hale, Sandra; Sommers, Mitchell

    2016-06-01

    In this study of visual (V-only) and audiovisual (AV) speech recognition in adults aged 22-92 years, the rate of age-related decrease in V-only performance was more than twice that in AV performance. Both auditory-only (A-only) and V-only performance were significant predictors of AV speech recognition, but age did not account for additional (unique) variance. Blurring the visual speech signal decreased speech recognition, and in AV conditions involving stimuli associated with equivalent unimodal performance for each participant, speech recognition remained constant from 22 to 92 years of age. Finally, principal components analysis revealed separate visual and auditory factors, but no evidence of an AV integration factor. Taken together, these results suggest that the benefit that comes from being able to see as well as hear a talker remains constant throughout adulthood and that changes in this AV advantage are entirely driven by age-related changes in unimodal visual and auditory speech recognition. PMID:27294718

  1. Effects of Context Type on Lipreading and Listening Performance and Implications for Sentence Processing

    ERIC Educational Resources Information Center

    Spehar, Brent; Goebel, Stacey; Tye-Murray, Nancy

    2015-01-01

    Purpose: This study compared the use of 2 different types of contextual cues (sentence based and situation based) in 2 different modalities (visual only and auditory only). Method: Twenty young adults were tested with the Illustrated Sentence Test (Tye-Murray, Hale, Spehar, Myerson, & Sommers, 2014) and the Speech Perception in Noise Test…

  2. Lip-Reading Aids Word Recognition Most in Moderate Noise: A Bayesian Explanation Using High-Dimensional Feature Space

    PubMed Central

    Ross, Lars A.; Foxe, John J.; Parra, Lucas C.

    2009-01-01

    Watching a speaker's facial movements can dramatically enhance our ability to comprehend words, especially in noisy environments. From a general doctrine of combining information from different sensory modalities (the principle of inverse effectiveness), one would expect that the visual signals would be most effective at the highest levels of auditory noise. In contrast, we find, in accord with a recent paper, that visual information improves performance more at intermediate levels of auditory noise than at the highest levels, and we show that a novel visual stimulus containing only temporal information does the same. We present a Bayesian model of optimal cue integration that can explain these conflicts. In this model, words are regarded as points in a multidimensional space and word recognition is a probabilistic inference process. When the dimensionality of the feature space is low, the Bayesian model predicts inverse effectiveness; when the dimensionality is high, the enhancement is maximal at intermediate auditory noise levels. When the auditory and visual stimuli differ slightly in high noise, the model makes a counterintuitive prediction: as sound quality increases, the proportion of reported words corresponding to the visual stimulus should first increase and then decrease. We confirm this prediction in a behavioral experiment. We conclude that auditory-visual speech perception obeys the same notion of optimality previously observed only for simple multisensory stimuli. PMID:19259259
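
    The model described in this abstract can be illustrated with a toy simulation in which words are points in a d-dimensional feature space and recognition is MAP inference from independent noisy auditory and visual observations. The sketch below is the editor's construction; parameter values are arbitrary and it does not reproduce the paper's model or results.

      # Editor's toy simulation: words as points in a d-dimensional feature space,
      # independent Gaussian noise on the auditory and visual channels, MAP recognition.
      import numpy as np

      rng = np.random.default_rng(0)

      def recognition_rate(d, sigma_a, sigma_v, n_words=200, n_trials=2000):
          """Proportion of trials on which the true word wins MAP inference."""
          lexicon = rng.standard_normal((n_words, d))   # words as points in feature space
          correct = 0
          for _ in range(n_trials):
              w = rng.integers(n_words)
              x_a = lexicon[w] + sigma_a * rng.standard_normal(d)   # auditory observation
              x_v = lexicon[w] + sigma_v * rng.standard_normal(d)   # visual observation
              log_lik = (-np.sum((lexicon - x_a) ** 2, axis=1) / (2 * sigma_a ** 2)
                         - np.sum((lexicon - x_v) ** 2, axis=1) / (2 * sigma_v ** 2))
              correct += int(np.argmax(log_lik) == w)
          return correct / n_trials

      # Audiovisual benefit (AV minus A-only accuracy) across auditory noise levels:
      for sigma_a in (0.5, 1.0, 2.0, 4.0):
          av = recognition_rate(d=30, sigma_a=sigma_a, sigma_v=2.0)
          a_only = recognition_rate(d=30, sigma_a=sigma_a, sigma_v=1e6)  # vision uninformative
          print(f"sigma_a={sigma_a}: AV benefit = {av - a_only:+.3f}")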

  3. Multisensory narrative tracking by a profoundly deaf subject using an electrocutaneous vocoder and a vibrotactile aid.

    PubMed

    Lynch, M P; Eilers, R E; Oller, D K; Urbano, R C; Pero, P J

    1989-06-01

    A congenitally, profoundly deaf adult who had received 41 hours of tactual word recognition training in a previous study was assessed in tracking of connected discourse. This assessment was conducted in three phases. In the first phase, the subject used the Tacticon 1600 electrocutaneous vocoder to track a narrative in three conditions: (a) lipreading and aided hearing (L + H), (b) lipreading and tactual vocoder (L + TV), and (c) lipreading, tactual vocoder, and aided hearing (L + TV + H). Subject performance was significantly better in the L + TV + H condition than in the L + H condition, suggesting that the subject benefitted from the additional information provided by the tactual vocoder. In the second phase, the Tactaid II vibrotactile aid was used in three conditions: (a) lipreading alone, (b) lipreading and tactual aid (L + TA), and (c) lipreading, tactual aid, and aided hearing (L + TA + H). The subject was able to combine cues from the Tactaid II with those from lipreading and aided hearing. In the third phase, both tactual devices were used in six conditions: (a) lipreading alone (L), (b) lipreading and aided hearing (L + H), (c) lipreading and Tactaid II (L + TA), (d) lipreading and Tacticon 1600 (L + TV), (e) lipreading, Tactaid II, and aided hearing (L + TA + H), and (f) lipreading, Tacticon 1600, and aided hearing (L + TV + H). In this phase, only the Tactaid II significantly improved tracking performance over lipreading and aided hearing. Overall, improvement in tracking performance occurred within and across phases of this study. PMID:2739385

  4. 77 FR 24554 - Culturally Significant Objects Imported for Exhibition; Determinations: “Quay Brothers: On...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-04-24

    ... Pharmacist's Prescription for Lip-Reading Puppets'' AGENCY: State Department. ACTION: Notice. SUMMARY: Notice... Lip-Reading Puppets'' imported from abroad for temporary exhibition within the United States, are...

  5. A 12-consonant confusion study on a multiple-channel cochlear implant patient.

    PubMed

    Dowell, R C; Martin, L F; Tong, Y C; Clark, G M; Seligman, P M; Patrick, J F

    1982-12-01

    A consonant confusion study was undertaken on a multiple-channel cochlear implant patient using a wearable speech processing device. This patient suffered from total bilateral deafness acquired postlingually. The consonants /b/, /p/, /m/, /v/, /f/, /d/, /t/, /n/, /z/, /s/, /g/, /k/ were presented in a VCV context with the vowel /a/ as in father by a male and female speaker under three conditions: lipreading alone; electrical stimulation alone using the wearable speech processor and multiple-channel cochlear implant; lipreading in conjunction with electrical stimulation. No significant difference was detected between the results for the male and female speakers. The percentage correct scores for the pooled results of both speakers were lipreading alone--30%; electrical stimulation alone--48%; lipreading with electrical stimulation--70%. Performance was significantly better for lipreading with electrical stimulation than for lipreading alone and for electrical stimulation alone than for lipreading alone. An information transmission analysis demonstrated the effective integration of visual and auditory information for lipreading with electrical stimulation. There was a significant improvement in performance for the electrical stimulation alone condition over the 2 months of the study in contrast to no such improvement for lipreading alone. PMID:6897661

  6. Effects of a Wearable, Tactile Aid on Language Comprehension of Prelingual Profoundly Deaf Children.

    ERIC Educational Resources Information Center

    Proctor, Adele

    Factors influencing the use of nonacoustic aids (such as visual displays and tactile devices) with the hearing impaired are reviewed. The benefits of tactile devices in improving speech reading/lipreading and speech are pointed out. Tactile aids which provide information on rhythm, rate, intensity, and duration of speech increase lipreading and…

  7. Phonetic Recalibration Only Occurs in Speech Mode

    ERIC Educational Resources Information Center

    Vroomen, Jean; Baart, Martijn

    2009-01-01

    Upon hearing an ambiguous speech sound dubbed onto lipread speech, listeners adjust their phonetic categories in accordance with the lipread information (recalibration) that tells what the phoneme should be. Here we used sine wave speech (SWS) to show that this tuning effect occurs if the SWS sounds are perceived as speech, but not if the sounds…

  8. Development of a speech autocuer

    NASA Technical Reports Server (NTRS)

    Bedles, R. L.; Kizakvich, P. N.; Lawson, D. T.; Mccartney, M. L.

    1980-01-01

    A wearable, visually based prosthesis for the deaf based upon the proven method for removing lipreading ambiguity known as cued speech was fabricated and tested. Both software and hardware developments are described, including a microcomputer, display, and speech preprocessor.

  9. What's Hearing Loss?

    MedlinePlus

    ... like these to communicate: speechreading (also called lip-reading), which involves looking closely at a person's lips, ...

  10. Modalities of memory: is reading lips like hearing voices?

    PubMed

    Maidment, David W; Macken, Bill; Jones, Dylan M

    2013-12-01

    Functional similarities in verbal memory performance across presentation modalities (written, heard, lipread) are often taken to point to a common underlying representational form upon which the modalities converge. We show here instead that the pattern of performance depends critically on presentation modality and different mechanisms give rise to superficially similar effects across modalities. Lipread recency is underpinned by different mechanisms to auditory recency, and while the effect of an auditory suffix on an auditory list is due to the perceptual grouping of the suffix with the list, the corresponding effect with lipread speech is due to misidentification of the lexical content of the lipread suffix. Further, while a lipread suffix does not disrupt auditory recency, an auditory suffix does disrupt recency for lipread lists. However, this effect is due to attentional capture ensuing from the presentation of an unexpected auditory event, and is evident both with verbal and nonverbal auditory suffixes. These findings add to a growing body of evidence that short-term verbal memory performance is determined by modality-specific perceptual and motor processes, rather than by the storage and manipulation of phonological representations. PMID:24041834

  11. Clinical results with a hearing aid and a single-channel vibrotactile device for profoundly deaf adults.

    PubMed

    Blamey, P J; Dowell, R C; Brown, A M; Clark, G M

    1985-08-01

    The speech perception of a group of 19 adults with post-lingual profound to total hearing loss was tested with nine closed-set speech tests without lipreading, two open-set tests without lipreading and two open-set speech tests with lipreading. The subjects were all prospective cochlear implant patients participating in a clinical trial of the implant and the results reported here were obtained as part of the pre-operative assessment. They were divided into groups on the basis of their prior experience with the aid(s), their speech detection thresholds with the two aids and their personal preferences. Seven of the subjects used a hand-held single-channel vibrotactile device and the other 12 used a powerful conventional hearing aid. Subjects from each group scored significantly better than chance on the closed set tests without lipreading. Training or regular hearing aid use was correlated with good performance on the closed-set tests. No subject showed a significant improvement of the lipreading score when the aid was used as a supplement. The use of sophisticated wearable tactile devices and extensive training may allow a better result, but in this clinical program, neither a hearing aid nor a single-channel vibrotactile device greatly benefited the postlingually profoundly deaf adults. PMID:4063556

  12. Tracking skill of a deaf person with long-term tactile aid experience: a case study.

    PubMed

    Cholewiak, R W; Sherrick, C E

    1986-04-01

    This paper describes a case study of a single deaf individual who has been using a vibrotactile aid for approximately 13 years. He has acquired the ability to lip-read speakers in three languages, using the speech-analyzing device that he and his collaborators have developed. The report describes his communicative abilities with and without the aid in his native language, which is Russian, and in English and Hebrew. When he was tested with the De Filippo-Scott connected-discourse tracking technique, the aid produced a considerable improvement in performance over that for unaided lipreading. The amount of improvement was a function of several factors, in particular his unaided lipreading rates for the different languages. PMID:3723423

  13. Is automated conversion of video to text a reality?

    NASA Astrophysics Data System (ADS)

    Bowden, Richard; Cox, Stephen J.; Harvey, Richard W.; Lan, Yuxuan; Ong, Eng-Jon; Owen, Gari; Theobald, Barry-John

    2012-10-01

    A recent trend in law enforcement has been the use of forensic lip-readers. Criminal activities are often recorded on CCTV or other video gathering systems. Knowledge of what suspects are saying enriches the evidence gathered, but lip-readers, by their own admission, are fallible. So, based on long-term studies of automated lip-reading, we are investigating the possibilities and limitations of applying this technique under realistic conditions. We have adopted a step-by-step approach and are developing a capability for when prior video information is available for the suspect of interest. We use the terminology video-to-text (V2T) for this technique by analogy with speech-to-text (S2T), which also has applications in security and law enforcement.

  14. READING YOUR OWN LIPS: COMMON CODING THEORY AND VISUAL SPEECH PERCEPTION

    PubMed Central

    Tye-Murray, Nancy; Spehar, Brent P.; Myerson, Joel; Hale, Sandra; Sommers, Mitchell S.

    2012-01-01

    Common coding theory posits: 1) perceiving an action activates the same representations of motor plans that are activated by actually performing that action; 2) because of individual differences in the way actions are performed, observing recordings of one’s own previous behavior activates motor plans to an even greater degree than observing someone else’s behavior. We hypothesized that if observing oneself activates motor plans to a greater degree than observing others, and these activated plans contribute to perception, then people should be able to lipread silent video clips of their own previous utterances more accurately than they can lipread video clips of other talkers. As predicted, two groups of participants were able to lipread video clips of themselves recorded more than two weeks earlier significantly more accurately than video clips of others. These results suggest that visual input activates speech motor activity that links to word representations in the mental lexicon. PMID:23132604

  15. Psychoacoustic aspects of speech pattern coding for the deaf.

    PubMed

    Faulkner, A; Fourcin, A J; Moore, B C

    1990-01-01

    The SiVo aid, which provides a sinusoidal signal indicating voice fundamental frequency and voicing information, was compared, as an aid to lipreading, with a conventional hearing aid having extended low-frequency output. Speech perceptual measures of consonant confusions in lipreading and the identification of a simple intonation contrast were collected from a group of 8 profoundly deaf adults. Audiometric and psychoacoustic measures were also collected. In the intonation task, 6 of the 7 patients tested performed better with the SiVo aid and the seventh scored perfectly with both aids. Four patients (out of 8) preferred to use the SiVo aid rather than the conventional aid, and 2 of these showed a significant advantage with the SiVo aid in the aided lipreading of consonants. The remaining 4 patients preferred the conventional aid, although none of these patients showed a significant advantage with the conventional aid in the lipreading of consonants. The 4 patients who preferred the SiVo aid over the conventional aid had very profound hearing loss and minimal dynamic range at 500 Hz and above, and those receiving the greatest benefit showed no measurable frequency selectivity. In the 2 patients who preferred the SiVo aid, yet showed no significant advantage for the SiVo aid in consonant lipreading, consonant confusions were also measured for aided lipreading, using a prototype aid providing both voice fundamental frequency and voiceless excitation patterns. Both patients showed additional and significant benefit from inclusion of the voiceless pattern element. PMID:2356724

  16. Speech recognition for 40 patients receiving multichannel cochlear implants.

    PubMed

    Dowell, R C; Mecklenburg, D J; Clark, G M

    1986-10-01

    We collected data on 40 patients who received the Nucleus multichannel cochlear implant. Results were reviewed to determine if the coding strategy is effective in transmitting the intended speech features and to assess patient benefit in terms of communication skills. All patients demonstrated significant improvement over preoperative results with a hearing aid for both lipreading enhancement and speech recognition without lipreading. Of the patients, 50% demonstrated ability to understand connected discourse with auditory input only. For the 23 patients who were tested 12 months postoperatively, there was substantial improvement in open-set speech recognition. PMID:3755975

  17. Multisensory Narrative Tracking by a Profoundly Deaf Subject Using an Electrocutaneous Vocoder and a Vibrotactile Aid.

    ERIC Educational Resources Information Center

    Lynch, Michael P.; And Others

    1989-01-01

    The study assessed the ability to track connected discourse by a congenitally profoundly deaf adult using an electrocutaneous vocoder and/or a vibrotactile aid in conjunction with or without lipreading and aided hearing. Overall, improvement in tracking performance occurred within and across phases of the study. (Author/DB)

  18. Visual Cues and Listening Effort: Individual Variability

    ERIC Educational Resources Information Center

    Picou, Erin M.; Ricketts, Todd A; Hornsby, Benjamin W. Y.

    2011-01-01

    Purpose: To investigate the effect of visual cues on listening effort as well as whether predictive variables such as working memory capacity (WMC) and lipreading ability affect the magnitude of listening effort. Method: Twenty participants with normal hearing were tested using a paired-associates recall task in 2 conditions (quiet and noise) and…

  19. SELF TEACHING IN THE DEVELOPMENT OF SPEECHREADING IN DEAF CHILDREN.

    ERIC Educational Resources Information Center

    NEYHUS, ARTHUR I.

    The effectiveness of motion picture films as a teaching device in the development of lipreading skills and the use of a cartridge-load, self-winding eight millimeter projector as a teaching tool were studied. It was hypothesized that deaf and hard of hearing children would learn prescribed vocabulary more quickly by autoinstructional film methods…

  20. A FEASIBILITY STUDY TO INVESTIGATE THE INSTRUMENTATION, ESTABLISHMENT, AND OPERATION OF A LEARNING LABORATORY FOR HARD-OF-HEARING CHILDREN, FINAL REPORT.

    ERIC Educational Resources Information Center

    STEPP, ROBERT E.

    Ten children aged 5-8 were selected to test a self-instructional, self-operating system to develop lipreading skills. Their hearing deficiency ranged from hard of hearing to profoundly deaf. The system consisted of three study carrels, an 8-mm cartridge-loading sound motion picture projector, and an observation booth utilizing a one-way mirror.…

  1. TEACHING DEAF CHILDREN TO TALK.

    ERIC Educational Resources Information Center

    EWING, ALEXANDER; EWING, ETHEL C.

    Designed as a text for audiologists and teachers of hearing impaired children, this book presents basic information about spoken language, hearing, and lipreading. Methods and results of evaluating spoken language of aurally handicapped children without using reading or writing are reported. Various types of individual and group hearing aids are…

  2. Services for the Deaf: Check This Out.

    ERIC Educational Resources Information Center

    Lake County Public Library, Merrillville, IN.

    The Lake County (Indiana) Public Library provides a guide to materials for the deaf. The fiction list is annotated. Unannotated lists include materials on the deaf child, books on manual communication, resource books (law, education, directories, guides), poetry, lipreading materials, general information on deafness and the deaf, biographies,…

  3. Speechreading Instruction for Adults: Issues and Approaches.

    ERIC Educational Resources Information Center

    Cherry, Rochelle; Rubinstein, Adrienne

    1988-01-01

    Implementation of a speechreading program for hearing-impaired adults should consider: program clients; presentation of concepts such as differences between lipreading and speechreading and factors affecting speechreading; program structure including client needs assessments, individual/group therapy, length and termination of treatment, etc.;…

  4. Speech imagery recalibrates speech-perception boundaries.

    PubMed

    Scott, Mark

    2016-07-01

    The perceptual boundaries between speech sounds are malleable and can shift after repeated exposure to contextual information. This shift is known as recalibration. To date, the known inducers of recalibration are lexical (including phonotactic) information, lip-read information and reading. The experiments reported here are a proof-of-effect demonstration that speech imagery can also induce recalibration. PMID:27068050

  5. "The Business of Life": Educating Catholic Deaf Children in Late Nineteenth-Century England

    ERIC Educational Resources Information Center

    Mangion, Carmen M.

    2012-01-01

    Much of the debates in late nineteenth-century Britain regarding the education of deaf children revolved around communication. For many Victorians, sign language was unacceptable; many proponents of oralism attempted to "normalise" the hearing impaired by replacing deaf methods of communication with spoken language and lipreading. While debates on…

  6. "All Methods--and Wedded to None": The Deaf Education Methods Debate and Progressive Educational Reform in Toronto, Canada, 1922-1945

    ERIC Educational Resources Information Center

    Ellis, Jason A.

    2014-01-01

    This article is about the deaf education methods debate in the public schools of Toronto, Canada. The author demonstrates how pure oralism (lip-reading and speech instruction to the complete exclusion of sign language) and day school classes for deaf schoolchildren were introduced as a progressive school reform in 1922. Plans for further oralist…

  7. Tones for Profoundly Deaf Tone-Language Speakers.

    ERIC Educational Resources Information Center

    Ching, Teresa

    A study assessed the practical use of the simplified speech pattern approach to teaching lipreading in a tone language by comparing performance using an acoustic hearing-aid and a Sivo-aid in a tone labelling task. After initial assessment, subjects were given training to enhance perception of lexically contrastive tones, then post-tested. The…

  8. The What? When? and How? of Teaching Language to Deaf Children - Preschool and Primary Grades.

    ERIC Educational Resources Information Center

    Hogan, Sister James Lorene

    Three levels of work in language development for preschool and primary age deaf children are presented, along with suggested daily schedules and yearly programs. Skills covered are speech, lipreading, auditory training, and language. Instructions are given for teaching activities in the areas of the various parts of speech and types of sentences.…

  9. Comparison of benefit from vibrotactile aid and cochlear implant for postlinguistically deaf adults.

    PubMed

    Skinner, M W; Binzer, S M; Fredrickson, J M; Smith, P G; Holden, T A; Holden, L K; Juelich, M F; Turner, B A

    1988-10-01

    Four postlinguistically deaf adults were evaluated presurgically with a one- or two-channel vibrotactile aid and postsurgically with a multichannel, multielectrode, intracochlear implant. Although the vibrotactile aid provided awareness of sound and enhanced flow of conversation, benefit to lipreading was small on videotaped tests and speech tracking. Scores on recorded, sound-only speech tests were not significantly above chance except in discrimination of noise from voice. With the cochlear implant, benefit to lipreading was significantly greater than with the vibrotactile aid, and scores on sound-only tests were significantly above chance. Communication was markedly better with the implant than with the vibrotactile aid. In counseling those who get no benefit from a hearing aid, the results of this study provide data on the amount of benefit one- or two-channel vibrotactile aids provide postlinguistically deaf adults who are subsequently implanted. PMID:3172956

  10. Bimodal bilingualism as multisensory training?: Evidence for improved audiovisual speech perception after sign language exposure.

    PubMed

    Williams, Joshua T; Darcy, Isabelle; Newman, Sharlene D

    2016-02-15

    The aim of the present study was to characterize effects of learning a sign language on the processing of a spoken language. Specifically, audiovisual phoneme comprehension was assessed before and after 13 weeks of sign language exposure. L2 American Sign Language (ASL) learners performed this task in the fMRI scanner. Results indicated that the L2 ASL learners' behavioral classification of the speech sounds improved with time compared to hearing nonsigners. Results also indicated increased activation in the supramarginal gyrus (SMG) after sign language exposure, which suggests concomitant increased phonological processing of speech. A multiple regression analysis indicated that learners' ratings of co-sign speech use and lipreading ability were correlated with SMG activation. This pattern of results indicates that the increased use of mouthing and possibly lipreading during sign language acquisition may concurrently improve audiovisual speech processing in budding hearing bimodal bilinguals. PMID:26740404

  11. A multiple-channel cochlear implant and wearable speech-processor: an audiological evaluation.

    PubMed

    Tong, Y C; Clark, G M; Dowell, R C; Martin, L F; Seligman, P M; Patrick, J F

    1981-01-01

    Standard audiological tests were administered to a totally deaf multiple-channel cochlear implant patient with a wearable speech-processor in a monitored sound field under the following conditions: the wearable unit activated alone (WA), lipreading with the wearable unit off (LA), and the wearable unit activated in combination with lipreading (WL). Thresholds obtained for narrow-band noise signals indicated that the wearable unit allowed the patient to detect a variety of sounds at different frequencies. The results obtained in closed-set word tests and open-set word and sentence tests showed significant improvements in word and sentence scores from LA to WL. In the open-set (C.I.D.) sentence test, the patient scored 22% for LA and 76% for WL. The WL score of 76% correlates with a satisfactory performance in understanding connected speech. The patient also scored 30% correct in a test involving the recognition of environmental sounds. PMID:6895683

  12. A multiple-channel cochlear implant: an evaluation using open-set CID sentences.

    PubMed

    Clark, G M; Tong, Y C; Martin, L F

    1981-04-01

    A multiple-channel cochlear implant and speech processor have been used in two postlingually deaf adult patients with a total hearing loss, to enable them to perceive varying degrees of running speech. The results have been confirmed with open-set CID everyday sentence tests. Using the implant alone, the patients obtained 8% and 14% scores with prerecorded material, and 34% and 36% scores for "live" presentations. This was equivalent to the perception of 35% of connected discourse. When the implant was used in conjunction with lipreading, improvements of 188% and 386% were obtained over lipreading alone, and the scores were 68% and 98%, which were equivalent to the perception of 60% and 95% of connected discourse. PMID:7219008

  13. Initial findings with a wearable multichannel vibrotactile aid.

    PubMed

    Osberger, M J; Robbins, A M; Todd, S L; Brown, C J

    1991-01-01

    Data are presented on the speech perception performance of two profoundly hearing-impaired subjects while using a two-channel vibrotactile aid (Tactaid II) or a new, seven-channel instrument. Both subjects, one a profoundly hearing-impaired teenager and one a postlingually deaf adult, are experienced users of tactile aids. The data suggest better recognition of speech features, words, environmental sounds, and enhancement of lipreading skills with the new multichannel instrument than with the two-channel device. PMID:2069179

  14. How is the McGurk effect modulated by Cued Speech in deaf and hearing adults?

    PubMed Central

    Bayard, Clémence; Colin, Cécile; Leybaert, Jacqueline

    2014-01-01

    Speech perception for both hearing and deaf people involves an integrative process between auditory and lip-reading information. In order to disambiguate information from lips, manual cues from Cued Speech may be added. Cued Speech (CS) is a system of manual aids developed to help deaf people to clearly and completely understand speech visually (Cornett, 1967). Within this system, both labial and manual information, as lone input sources, remain ambiguous. Perceivers, therefore, have to combine both types of information in order to get one coherent percept. In this study, we examined how audio-visual (AV) integration is affected by the presence of manual cues and on which form of information (auditory, labial or manual) the CS receptors primarily rely. To address this issue, we designed a unique experiment that implemented the use of AV McGurk stimuli (audio /pa/ and lip-reading /ka/) which were produced with or without manual cues. The manual cue was congruent with either auditory information, lip information or the expected fusion. Participants were asked to repeat the perceived syllable aloud. Their responses were then classified into four categories: audio (when the response was /pa/), lip-reading (when the response was /ka/), fusion (when the response was /ta/) and other (when the response was something other than /pa/, /ka/ or /ta/). Data were collected from hearing impaired individuals who were experts in CS (all of which had either cochlear implants or binaural hearing aids; N = 8), hearing-individuals who were experts in CS (N = 14) and hearing-individuals who were completely naïve of CS (N = 15). Results confirmed that, like hearing-people, deaf people can merge auditory and lip-reading information into a single unified percept. Without manual cues, McGurk stimuli induced the same percentage of fusion responses in both groups. Results also suggest that manual cues can modify the AV integration and that their impact differs between hearing and deaf people

  15. Visual abilities are important for auditory-only speech recognition: evidence from autism spectrum disorder.

    PubMed

    Schelinski, Stefanie; Riedel, Philipp; von Kriegstein, Katharina

    2014-12-01

    In auditory-only conditions, for example when we listen to someone on the phone, it is essential to fast and accurately recognize what is said (speech recognition). Previous studies have shown that speech recognition performance in auditory-only conditions is better if the speaker is known not only by voice, but also by face. Here, we tested the hypothesis that such an improvement in auditory-only speech recognition depends on the ability to lip-read. To test this we recruited a group of adults with autism spectrum disorder (ASD), a condition associated with difficulties in lip-reading, and typically developed controls. All participants were trained to identify six speakers by name and voice. Three speakers were learned by a video showing their face and three others were learned in a matched control condition without face. After training, participants performed an auditory-only speech recognition test that consisted of sentences spoken by the trained speakers. As a control condition, the test also included speaker identity recognition on the same auditory material. The results showed that, in the control group, performance in speech recognition was improved for speakers known by face in comparison to speakers learned in the matched control condition without face. The ASD group lacked such a performance benefit. For the ASD group auditory-only speech recognition was even worse for speakers known by face compared to speakers not known by face. In speaker identity recognition, the ASD group performed worse than the control group independent of whether the speakers were learned with or without face. Two additional visual experiments showed that the ASD group performed worse in lip-reading whereas face identity recognition was within the normal range. The findings support the view that auditory-only communication involves specific visual mechanisms. Further, they indicate that in ASD, speaker-specific dynamic visual information is not available to optimize auditory

  16. Multisensory training can promote or impede visual perceptual learning of speech stimuli: visual-tactile vs. visual-auditory training

    PubMed Central

    Eberhardt, Silvio P.; Auer Jr., Edward T.; Bernstein, Lynne E.

    2014-01-01

    In a series of studies we have been investigating how multisensory training affects unisensory perceptual learning with speech stimuli. Previously, we reported that audiovisual (AV) training with speech stimuli can promote auditory-only (AO) perceptual learning in normal-hearing adults but can impede learning in congenitally deaf adults with late-acquired cochlear implants. Here, impeder and promoter effects were sought in normal-hearing adults who participated in lipreading training. In Experiment 1, visual-only (VO) training on paired associations between CVCVC nonsense word videos and nonsense pictures demonstrated that VO words could be learned to a high level of accuracy even by poor lipreaders. In Experiment 2, visual-auditory (VA) training in the same paradigm but with the addition of synchronous vocoded acoustic speech impeded VO learning of the stimuli in the paired-associates paradigm. In Experiment 3, the vocoded AO stimuli were shown to be less informative than the VO speech. Experiment 4 combined vibrotactile speech stimuli with the visual stimuli during training. Vibrotactile stimuli were shown to promote visual perceptual learning. In Experiment 5, no-training controls were used to show that training with visual speech carried over to consonant identification of untrained CVCVC stimuli but not to lipreading words in sentences. Across this and previous studies, multisensory training effects depended on the functional relationship between pathways engaged during training. Two principles are proposed to account for stimulus effects: (1) Stimuli presented to the trainee’s primary perceptual pathway will impede learning by a lower-rank pathway. (2) Stimuli presented to the trainee’s lower rank perceptual pathway will promote learning by a higher-rank pathway. The mechanisms supporting these principles are discussed in light of multisensory reverse hierarchy theory (RHT). PMID:25400566

  17. The Self-Advantage in Visual Speech Processing Enhances Audiovisual Speech Recognition in Noise

    PubMed Central

    Tye-Murray, Nancy; Spehar, Brent P.; Myerson, Joel; Hale, Sandra; Sommers, Mitchell S.

    2014-01-01

    Individuals lipread themselves more accurately than they lipread others when only the visual speech signal is available (Tye-Murray, Spehar, Myerson, Hale, & Sommers, 2013). This self-advantage for vision-only speech recognition is consistent with the common-coding hypothesis (Prinz, 1997), which posits (1) that observing an action activates the same motor plan representation as actually performing that action and (2) that observing one’s own actions activates motor plan representations more than observing others’ actions does, because of greater congruity between percepts and corresponding motor plans. The present study extends this line of research to audiovisual speech recognition by examining whether there is a self-advantage when the visual signal is added to the auditory signal under poor listening conditions. Participants were assigned to subgroups for round-robin testing in which each participant was paired with every member of their subgroup, including themselves, serving as both talker and listener/observer. On average, the benefit participants obtained from the visual signal when they were the talker was greater than when the talker was someone else and also was greater than the benefit others obtained from observing as well as listening to them. Moreover, the self-advantage in audiovisual speech recognition was significant after statistically controlling for individual differences in both participants’ ability to benefit from a visual speech signal and the extent to which their own visual speech signal benefited others. These findings are consistent with our previous finding of a self-advantage in lipreading and with the hypothesis of a common code for action perception and motor plan representation. PMID:25421408

  18. Tactile perception by the profoundly deaf. Speech and environmental sounds.

    PubMed

    Plant, G L

    1982-11-01

    Four subjects fitted with single-channel vibrotactile aids and provided with training in their use took part in a testing programme aimed at assessing their aided and unaided lipreading performance, their ability to detect segmental and suprasegmental features of speech, and the discrimination of common environmental sounds. The results showed that the vibrotactile aid provided very useful information as to speech and non-speech stimuli with the subjects performing best on those tasks where time/intensity cues provided sufficient information to enable identification. The implications of the study are discussed and a comparison made with those results reported for subjects using cochlear implants. PMID:6897619

  19. Results of a preliminary clinical trial on a multiple channel cochlear prosthesis.

    PubMed

    Dowell, R C; Martin, L F; Clark, G M; Brown, A M

    1985-01-01

    Speech discrimination testing was carried out under clinical trial conditions for eight profoundly postlingually deaf adults to assess the efficacy of a newly developed 22-channel cochlear prosthesis and speech processor. Three months postoperatively, these patients showed significantly better results with the cochlear prosthesis than for preoperative testing with a conventional hearing aid or vibrotactile aid (following a 6-month trial with the aid) on each of a series of tests from the Minimal Auditory Capabilities battery. Assessment of lipreading enhancement using standard speech tests, consonant recognition studies, and speech tracking showed significant improvements for each patient when using the cochlear prosthesis. Six patients showed a significant amount of open set speech discrimination without lipreading at levels which have not been reported for single electrode cochlear prostheses. The two patients who performed poorly on these tests both had restricted multiple channel systems due to their disease, one patient being restricted to virtually a single channel system and the other to only ten of the 22 electrodes. These results indicate that this multiple channel cochlear prosthesis has potential as a treatment for profound postlingual deafness over a wide range of etiologies and ages. PMID:3160295

  20. Effects of aging on audio-visual speech integration.

    PubMed

    Huyse, Aurélie; Leybaert, Jacqueline; Berthommier, Frédéric

    2014-10-01

    This study investigated the impact of aging on audio-visual speech integration. A syllable identification task was presented in auditory-only, visual-only, and audio-visual congruent and incongruent conditions. Visual cues were either degraded or unmodified. Stimuli were embedded in stationary noise alternating with modulated noise. Fifteen young adults and 15 older adults participated in this study. Results showed that older adults had preserved lipreading abilities when the visual input was clear but not when it was degraded. The impact of aging on audio-visual integration also depended on the quality of the visual cues. In the visual clear condition, the audio-visual gain was similar in both groups and analyses in the framework of the fuzzy-logical model of perception confirmed that older adults did not differ from younger adults in their audio-visual integration abilities. In the visual reduction condition, the audio-visual gain was reduced in the older group, but only when the noise was stationary, suggesting that older participants could compensate for the loss of lipreading abilities by using the auditory information available in the valleys of the noise. The fuzzy-logical model of perception confirmed the significant impact of aging on audio-visual integration by showing an increased weight of audition in the older group. PMID:25324091

  1. Video analysis using spatiotemporal descriptor and kernel extreme learning machine for lip reading

    NASA Astrophysics Data System (ADS)

    Lu, Longbin; Zhang, Xinman; Xu, Xuebin; Shang, Dongpeng

    2015-09-01

    Lip-reading techniques have shown bright prospects for speech recognition under noisy environments and for hearing-impaired listeners. We aim to solve two important issues regarding lip reading: (1) how to extract discriminative lip motion features and (2) how to establish a classifier that can provide promising recognition accuracy for lip reading. For the first issue, a projection local spatiotemporal descriptor, which considers the lip appearance and motion information at the same time, is utilized to provide an efficient representation of a video sequence. For the second issue, a kernel extreme learning machine (KELM) based on the single-hidden-layer feedforward neural network is presented to distinguish all kinds of utterances. In general, this method has fast learning speed and great robustness to nonlinear data. Furthermore, quantum-behaved particle swarm optimization with binary encoding is introduced to select the appropriate feature subset and parameters for KELM training. Experiments conducted on the AVLetters and OuluVS databases show that the proposed lip-reading method achieves a superior recognition accuracy compared with two previous methods.
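
    The closed-form KELM training that the paper relies on can be pictured with a short sketch. This is a minimal illustration, assuming an RBF kernel, a fixed regularization constant and random vectors standing in for the projection local spatiotemporal descriptors; it does not reproduce the paper's descriptor extraction or its QPSO-based feature and parameter selection, and all names are ours.

```python
# Minimal kernel extreme learning machine (KELM) classifier sketch.
# X holds precomputed lip-motion feature vectors (n_samples x n_features),
# y holds integer utterance labels; the RBF kernel and C are illustrative.
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    # Pairwise RBF kernel between rows of A and rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

class KELM:
    def __init__(self, C=10.0, gamma=0.1):
        self.C, self.gamma = C, gamma

    def fit(self, X, y):
        self.X_train = X
        T = np.eye(y.max() + 1)[y]                      # one-hot targets
        K = rbf_kernel(X, X, self.gamma)
        # Output weights in closed form: beta = (I/C + K)^-1 T
        self.beta = np.linalg.solve(np.eye(len(X)) / self.C + K, T)
        return self

    def predict(self, X):
        K = rbf_kernel(X, self.X_train, self.gamma)
        return (K @ self.beta).argmax(axis=1)

# Toy usage with random "features" standing in for spatiotemporal descriptors.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(60, 32)), rng.integers(0, 4, size=60)
print(KELM().fit(X, y).predict(X[:5]))
```

    The single matrix solve in `fit` is what gives KELM its fast training compared with iteratively trained neural networks.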

  2. Continuing evaluation of the Queen's University tactile vocoder. I: Identification of open set words.

    PubMed

    Brooks, P L; Frost, B J; Mason, J L; Gibson, D M

    1986-01-01

    Identification of open set words, by an experienced normal hearing subject using the tactile vocoder developed at Queen's University, was examined. The tactile vocoder filters and processes the acoustic waveform into 16 filter channels, each of which controls a vibrator on the skin surface. After acquiring a 250-word vocabulary through the tactile vocoder, the subject was presented with three sets of 1000 different open set words in three reception conditions. The percentages of words correctly identified in the tactile vocoder (TV), lipreading (L), and lipreading plus tactile vocoder (L + TV) conditions were 8.8, 39.4, and 68.7 percent respectively. Phonemic analysis of stimulus/response pairs revealed that 36.7, 64.9 and 85.5 percent of the phonemes were correctly identified in TV, L, and L + TV conditions, respectively, indicating that incorrect-response words often contained many correct phonemes. Also, syllabic stress of stimulus and response words was identical 88 percent of the time in the TV condition. Important information about speech was transmitted through the tactile vocoder. PMID:3958993

  3. Phonetic matching of auditory and visual speech develops during childhood: Evidence from sine-wave speech

    PubMed Central

    Baart, Martijn; Bortfeld, Heather; Vroomen, Jean

    2014-01-01

    The correspondence between auditory speech and lip-read information can be detected based on a combination of temporal and phonetic cross-modal cues. Here, we determined the point in developmental time at which children start to effectively use phonetic information to match a speech sound with one of two articulating faces. We presented 4- to 11-year-old children (N = 77) with three-syllabic sine-wave speech replicas of two pseudo-words that were perceived as non-speech and asked them to match the sounds with the corresponding lip-read video. At first, children had no phonetic knowledge about the sounds, and matching was thus based on temporal cues that are fully retained in sine-wave speech. Next, we trained all children to perceive the phonetic identity of the sine-wave speech and repeated the audiovisual matching task. Only at around 6.5 years of age did the benefit of having phonetic knowledge about the stimuli become apparent, indicating that AV matching based on phonetic cues presumably develops more slowly than AV matching based on temporal cues. PMID:25258018

  4. The role of visual speech cues in reducing energetic and informational masking

    NASA Astrophysics Data System (ADS)

    Helfer, Karen S.; Freyman, Richard L.

    2005-02-01

    Two experiments compared the effect of supplying visual speech information (e.g., lipreading cues) on the ability to hear one female talker's voice in the presence of steady-state noise or a masking complex consisting of two other female voices. In the first experiment intelligibility of sentences was measured in the presence of the two types of maskers with and without perceived spatial separation of target and masker. The second study tested detection of sentences in the same experimental conditions. Results showed that visual cues provided more benefit for both recognition and detection of speech when the masker consisted of other voices (versus steady-state noise). Moreover, visual cues provided greater benefit when the target speech and masker were spatially coincident versus when they appeared to arise from different spatial locations. The data obtained here are consistent with the hypothesis that lipreading cues help to segregate a target voice from competing voices, in addition to the established benefit of supplementing masked phonetic information.

  5. Phi-square Lexical Competition Database (Phi-Lex): an online tool for quantifying auditory and visual lexical competition.

    PubMed

    Strand, Julia F

    2014-03-01

    A widely agreed-upon feature of spoken word recognition is that multiple lexical candidates in memory are simultaneously activated in parallel when a listener hears a word, and that those candidates compete for recognition (Luce, Goldinger, Auer, & Vitevitch, Perception & Psychophysics 62:615-625, 2000; Luce & Pisoni, Ear and Hearing 19:1-36, 1998; McClelland & Elman, Cognitive Psychology 18:1-86, 1986). Because the presence of those competitors influences word recognition, much research has sought to quantify the processes of lexical competition. Metrics that quantify lexical competition continuously are more effective predictors of auditory and visual (lipread) spoken word recognition than are the categorical metrics traditionally used (Feld & Sommers, Speech Communication 53:220-228, 2011; Strand & Sommers, Journal of the Acoustical Society of America 130:1663-1672, 2011). A limitation of the continuous metrics is that they are somewhat computationally cumbersome and require access to existing speech databases. This article describes the Phi-square Lexical Competition Database (Phi-Lex): an online, searchable database that provides access to multiple metrics of auditory and visual (lipread) lexical competition for English words, available at www.juliastrand.com/phi-lex. PMID:23754576

  6. New developments in speech pattern element hearing aids for the profoundly deaf.

    PubMed

    Faulkner, A; Walliker, J R; Howard, I S; Ball, V; Fourcin, A J

    1993-01-01

    Two new developments in speech pattern processing hearing aids will be described. The first development is the use of compound speech pattern coding. Speech information which is invisible to the lipreader was encoded in terms of three acoustic speech factors; the voice fundamental frequency pattern, coded as a sinusoid, the presence of aperiodic excitation, coded as a low-frequency noise, and the wide-band amplitude envelope, coded by amplitude modulation of the sinusoid and noise signals. Each element of the compound stimulus was individually matched in frequency and intensity to the listener's receptive range. Audio-visual speech receptive assessments in five profoundly hearing-impaired listeners were performed to examine the contributions of adding voiceless and amplitude information to the voice fundamental frequency pattern, and to compare these codings to amplified speech. In both consonant recognition and connected discourse tracking (CDT), all five subjects showed an advantage from the addition of amplitude information to the fundamental frequency pattern. In consonant identification, all five subjects showed further improvements in performance when voiceless speech excitation was additionally encoded together with amplitude information, but this effect was not found in CDT. The addition of voiceless information to voice fundamental frequency information did not improve performance in the absence of amplitude information. Three of the subjects performed significantly better in at least one of the compound speech pattern conditions than with amplified speech, while the other two performed similarly with amplified speech and the best compound speech pattern condition. The three speech pattern elements encoded here may represent a near-optimal basis for an acoustic aid to lipreading for this group of listeners. The second development is the use of a trained multi-layer-perceptron (MLP) pattern classification algorithm as the basis for a robust real-time voice

  7. Vibrotactile stimulation: case study with a profoundly deaf child.

    PubMed

    Geers, A E

    1986-01-01

    This case study reports results obtained from a young, profoundly deaf child, M, who was fitted with a single-channel vibrotactile device, the Tactaid I, at 29 months of age. Her progress in speech and language development was evaluated over a 14-month period. During this period, M learned to understand 101 words through lipreading and the Tactaid I, and to produce consistent approximations of 90 words. Her scores on language tests with hearing-impaired norms progressed from below average to above average for her age. M's scores on language tests with hearing norms also reflected significant progress, although she did not achieve normal language development. These results indicate that a single-channel vibrotactile aid may facilitate the acquisition of spoken language in a profoundly deaf child who is unable to benefit from a conventional hearing aid. PMID:3958992

  8. Encephalitis herpes simplex: aural rehabilitation following bilateral deafness.

    PubMed

    Montano, J J; Melley, C C; Karam, D B

    1983-10-01

    Aural rehabilitation is a critical and often neglected aspect of a hearing-impaired patient's total rehabilitation. This case description illustrates the need for implementation of aural rehabilitation services. A 59-year-old woman exhibited bilateral profound sensorineural hearing loss following the onset of encephalitis herpes simplex. Auditory amplification attempts were unsuccessful. Aural rehabilitation was initiated immediately, and she was seen for lipreading and vibrotactile stimulation training. Goals progressed from identification of single words within a category to phonemic recognition. Vibrotactile stimulation was used to facilitate environmental awareness. Therapy goals reflected the patient's increased motivation to communicate within her environment. This patient's communicative status is viewed on a continuum: from success in individual treatment goals, extending to successful communication within the structure of the entire rehabilitation setting, and finally to functional communication within her home environment. PMID:6625883

  9. Speech Analysis Based On Image Information from Lip Movement

    NASA Astrophysics Data System (ADS)

    Talha, Kamil S.; Wan, Khairunizam; Za'ba, S. K.; Mohamad Razlan, Zuradzman; B, Shahriman A.

    2013-12-01

    Deaf and hard of hearing people often have difficulty understanding and lip-reading other people. They often feel left out of conversations and are sometimes simply ignored. There are a variety of ways a hearing-impaired person can communicate and gain access to information. Communication support includes both technical and human aids. Human aids include interpreters, lip-readers and note-takers; interpreters translate sign language and must therefore be qualified. In this paper, a vision system is used to track movements of the lips. In the experiment, the proposed system successfully differentiates 11 types of phonemes and classifies each into its respective viseme group. Using the proposed system, hearing-impaired persons could practise pronunciations by themselves without support from an instructor.
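
    The phoneme-to-viseme grouping that the classification step relies on can be illustrated with a small lookup table. The groups below are a generic, coarse mapping chosen only for illustration; they are not the eleven phonemes or the specific viseme groups used in the paper.

```python
# Illustrative phoneme-to-viseme lookup; the grouping is a commonly cited
# coarse mapping, not the one used in the paper.
PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "t": "alveolar", "d": "alveolar", "s": "alveolar", "z": "alveolar",
    "k": "velar", "g": "velar",
    "ch": "postalveolar", "sh": "postalveolar",
    "w": "rounded", "r": "rounded",
}

def viseme_of(phoneme: str) -> str:
    # Fall back to a catch-all class for phonemes outside the table.
    return PHONEME_TO_VISEME.get(phoneme.lower(), "other")

print(viseme_of("B"), viseme_of("sh"), viseme_of("ng"))
```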

  10. Surgical treatment and rehabilitation of prelingually and perilingually deafened children and adults with the nucleus multichannel cochlear implant.

    PubMed

    García, J M; Barón de Otero, C; García, J; Peñaranda, A; Niño, C; Campos, S

    1994-03-01

    We began our program in September 1992, using the Nucleus 22 Channel Cochlear Implant. To date, we have operated on four patients, one child with congenital hearing loss, two prelinguistically deaf adults and one perilingually deaf adult. Our results have shown a significant increase in auditory and speech reception and perception skills in the child. The perilingually deaf adult is able to understand speech in open set speech discrimination testing and, although we do not expect open set speech discrimination in the prelinguistically deaf adults, to date their results have been satisfactory. The two prelingually deaf adults are in an audiological rehabilitation program. Their response in prosodic aspects of speech and lipreading ability with sound have improved significantly. The only surgical complication was an infection of the flap in the child, but it was treated satisfactorily with i.v. penicillin. PMID:8205978

  11. Visual signal processing, speechreading, and related issues

    NASA Astrophysics Data System (ADS)

    Levitt, Harry

    2003-06-01

    People with hearing loss make use of visual speech cues to supplement the impoverished speech signal. This process, known as speechreading (or lipreading) can be very effective because of the complementary nature of auditory and visual speech cues. Despite the importance of visual speech cues (for both normal-hearing and hearing-impaired people) research on the visual characteristics of speech has lagged behind research on the acoustic characteristics of speech. The field of acoustic phonetics benefited substantially from the availability of powerful techniques for acoustic signal analysis. The substantial, recent advances in optical signal processing have opened up new vistas for visual speech analysis analogous to the way technological innovation revolutionized the field of acoustic phonetics. This paper describes several experiments in the emerging field of optic phonetics.

  12. Phonetic recalibration of speech by text.

    PubMed

    Keetels, Mirjam; Schakel, Lemmy; Bonte, Milene; Vroomen, Jean

    2016-04-01

    Listeners adjust their phonetic categories to cope with variations in the speech signal (phonetic recalibration). Previous studies have shown that lipread speech (and word knowledge) can adjust the perception of ambiguous speech and can induce phonetic adjustments (Bertelson, Vroomen, & de Gelder in Psychological Science, 14(6), 592-597, 2003; Norris, McQueen, & Cutler in Cognitive Psychology, 47(2), 204-238, 2003). We examined whether orthographic information (text) also can induce phonetic recalibration. Experiment 1 showed that after exposure to ambiguous speech sounds halfway between /b/ and /d/ that were combined with text (b or d) participants were more likely to categorize auditory-only test sounds in accordance with the exposed letters. Experiment 2 replicated this effect with a very short exposure phase. These results show that listeners adjust their phonetic boundaries in accordance with disambiguating orthographic information and that these adjustments show a rapid build-up. PMID:26704562

  13. Musicians have enhanced subcortical auditory and audiovisual processing of speech and music

    PubMed Central

    Musacchia, Gabriella; Sams, Mikko; Skoe, Erika; Kraus, Nina

    2007-01-01

    Musical training is known to modify cortical organization. Here, we show that such modifications extend to subcortical sensory structures and generalize to processing of speech. Musicians had earlier and larger brainstem responses than nonmusician controls to both speech and music stimuli presented in auditory and audiovisual conditions, evident as early as 10 ms after acoustic onset. Phase-locking to stimulus periodicity, which likely underlies perception of pitch, was enhanced in musicians and strongly correlated with length of musical practice. In addition, viewing videos of speech (lip-reading) and music (instrument being played) enhanced temporal and frequency encoding in the auditory brainstem, particularly in musicians. These findings demonstrate practice-related changes in the early sensory encoding of auditory and audiovisual information. PMID:17898180

  14. Learning one-to-many mapping functions for audio-visual integrated perception

    NASA Astrophysics Data System (ADS)

    Lim, Jung-Hui; Oh, Do-Kwan; Lee, Soo-Young

    2010-04-01

    In noisy environments, human speech perception utilizes visual lip-reading as well as audio phonetic classification. This audio-visual integration may be done by combining the two sensory features at an early stage. Top-down attention may also integrate the two modalities. For the sensory feature fusion we introduce mapping functions between the audio and visual manifolds. In particular, we present an algorithm that provides a one-to-many mapping function for the video-to-audio mapping. A top-down attention mechanism is also presented that integrates the sensory features and the classification results of both modalities and is able to explain the McGurk effect. Each classifier is implemented separately as a Hidden Markov Model (HMM), but the two classifiers are combined at the top level and interact through top-down attention.
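
    The one-to-many character of the video-to-audio mapping can be pictured with a simple nearest-neighbour sketch over a paired corpus: one lip-feature vector returns several plausible audio-feature candidates, which a later classifier or attention stage could arbitrate between. This is our own simplified illustration under those assumptions, not the authors' algorithm, and all array names are hypothetical.

```python
# Sketch of a one-to-many video-to-audio mapping: one visual (lip) feature
# vector returns the audio features of its k closest visual exemplars in a
# paired training corpus. Purely illustrative; not the paper's method.
import numpy as np

def video_to_audio_candidates(v_query, V_corpus, A_corpus, k=3):
    # V_corpus: (n, dv) visual features; A_corpus: (n, da) paired audio features.
    dists = np.linalg.norm(V_corpus - v_query, axis=1)
    nearest = np.argsort(dists)[:k]
    return A_corpus[nearest]          # k audio candidates for one visual input

rng = np.random.default_rng(1)
V = rng.normal(size=(100, 20))        # stand-in lip-motion features
A = rng.normal(size=(100, 13))        # stand-in audio (e.g., cepstral) features
candidates = video_to_audio_candidates(V[0] + 0.01, V, A, k=3)
print(candidates.shape)               # (3, 13): one visual input, many audio outputs
```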

  15. PERVALE-S: a new cognitive task to assess deaf people’s ability to perceive basic and social emotions

    PubMed Central

    Mestre, José M.; Larrán, Cristina; Herrero, Joaquín; Guil, Rocío; de la Torre, Gabriel G.

    2015-01-01

    A poorly understood aspect of deaf people (DP) is how they process emotional information. Verbal ability is key to improving emotional knowledge, yet DP are unable to perceive the intonation, intensity, and rhythm of spoken language due to lack of hearing. Some DP have acquired both lip-reading abilities and sign language, but others have developed only sign language. PERVALE-S was developed to assess the ability of DP to perceive both social and basic emotions. PERVALE-S presents different sets of visual images of a real deaf person expressing both basic and social emotions, according to the normative standard of emotional expressions in Spanish Sign Language. Emotional expression stimuli were presented at two levels of intensity (1: low; 2: high) because DP do not distinguish an object in the same way as hearing people (HP) do. Participants then had to click on the most suitable emotional expression. PERVALE-S contains video instructions (given by a sign language interpreter) to improve DP’s understanding of how to use the software; DP had to watch the videos before answering the items. To test PERVALE-S, a sample of 56 individuals was recruited (18 signers, 8 lip-readers, and 30 HP). Participants also completed a personality test (an adapted High School Personality Questionnaire) and a fluid intelligence (Gf) measure (RAPM). Moreover, all deaf participants were rated by four teachers of the deaf. Results: there were no significant differences between deaf and HP participants in performance on PERVALE-S. Confusion matrices revealed that embarrassment, envy, and jealousy were the emotions perceived least accurately. Age was related only to social-emotional tasks (but not to basic emotional tasks). Emotional perception ability was related mainly to warmth and consciousness, but negatively related to tension. Meanwhile, Gf was related only to social-emotional tasks. There were no gender differences. PMID:26300828

  16. Cross-cultural adaptation and validation of the Nijmegen Cochlear Implant Questionnaire into Italian.

    PubMed

    Ottaviani, F; Iacona, E; Sykopetrites, V; Schindler, A; Mozzanica, F

    2016-08-01

    The NCIQ is a quantifiable self-assessment health-related quality of life instrument specific to cochlear implant users. The aim of this study was to culturally adapt the NCIQ into Italian (I-NCIQ). A prospective instrument validation study was conducted, and cross-cultural adaptation and validation were accomplished. Cronbach's α was used to test internal consistency in 51 CI users and in a control group composed of 38 post-lingually deaf adults on a waiting list for a CI. The ICC test was used for test-retest reliability analysis. Kruskal-Wallis tests with Mann-Whitney post hoc comparisons were used to compare I-NCIQ scores in CI users before and after cochlear implantation and in control patients. I-NCIQ scores obtained in CI users were compared with the results of the Italian version of disyllabic testing without lip-reading and without masking. Good internal consistency and good test-retest reliability were found. I-NCIQ scores obtained in the 51 CI users after implantation were consistently higher than those obtained before implantation and in the control group. Moreover, on post hoc Mann-Whitney analysis, no differences were found between the I-NCIQ scores obtained in the 51 CI users before implantation and those of the control patients. Positive correlations between I-NCIQ scores and the results of disyllabic testing without lip-reading and without masking were found. The I-NCIQ is a reliable, valid, self-administered questionnaire for the measurement of QOL in CI users; its application is recommended. PMID:26324881

  17. Quantifying lip-read-induced suppression and facilitation of the auditory N1 and P2 reveals peak enhancements and delays.

    PubMed

    Baart, Martijn

    2016-09-01

    Lip-read speech suppresses and speeds up the auditory N1 and P2 peaks, but these effects are not always observed or reported. Here, the robustness of lip-read-induced N1/P2 suppression and facilitation in phonetically congruent audiovisual speech was assessed by analyzing peak values that were taken from published plots and individual data. To determine whether adhering to the additive model of AV integration (i.e., A+V ≠ AV, or AV-V ≠ A) is critical for correct characterization of lip-read-induced effects on the N1 and P2, auditory data was compared to AV and to AV-V. On average, the N1 and P2 were consistently suppressed and sped up by lip-read information, with no indication that AV integration effects were significantly modulated by whether or not V was subtracted from AV. To assess the possibility that variability in observed N1/P2 amplitudes and latencies may explain why N1/P2 suppression and facilitation are not always found, additional correlations between peak values and size of the AV integration effects were computed. These analyses showed that N1/P2 peak values correlated with the size of AV integration effects. However, it also became apparent that a portion of the AV integration effects was characterized by lip-read-induced peak enhancements and delays rather than suppressions and facilitations, which, for the individual data, seemed related to particularly small/early A-only peaks and large/late AV(-V) peaks. PMID:27295181

  18. Audibility and visual biasing in speech perception

    NASA Astrophysics Data System (ADS)

    Clement, Bart Richard

    Although speech perception has been considered a predominantly auditory phenomenon, large benefits from vision in degraded acoustic conditions suggest integration of audition and vision. More direct evidence of this comes from studies of audiovisual disparity that demonstrate vision can bias and even dominate perception (McGurk & MacDonald, 1976). It has been observed that hearing-impaired listeners demonstrate more visual biasing than normally hearing listeners (Walden et al., 1990). It is argued here that stimulus audibility must be equated across groups before true differences can be established. In the present investigation, effects of visual biasing on perception were examined as audibility was degraded for 12 young normally hearing listeners. Biasing was determined by quantifying the degree to which listener identification functions for a single synthetic auditory /ba-da-ga/ continuum changed across two conditions: (1) an auditory-only listening condition; and (2) an auditory-visual condition in which every item of the continuum was synchronized with visual articulations of the consonant-vowel (CV) tokens /ba/ and /ga/, as spoken by each of two talkers. Audibility was altered by presenting the conditions in quiet and in noise at each of three signal-to-noise (S/N) ratios. For the visual-/ba/ context, large effects of audibility were found. As audibility decreased, visual biasing increased. A large talker effect also was found, with one talker eliciting more biasing than the other. An independent lipreading measure demonstrated that this talker was more visually intelligible than the other. For the visual-/ga/ context, audibility and talker effects were less robust, possibly obscured by strong listener effects, which were characterized by marked differences in perceptual processing patterns among participants. Some demonstrated substantial biasing whereas others demonstrated little, indicating a strong reliance on audition even in severely degraded acoustic

  19. Automatic lip reading by using multimodal visual features

    NASA Astrophysics Data System (ADS)

    Takahashi, Shohei; Ohya, Jun

    2013-12-01

    Speech recognition has been studied for a long time, but it does not work well in noisy places such as in a car or on a train. In addition, people who are hearing impaired or have difficulty hearing cannot benefit from speech recognition. To recognize speech automatically, visual information is also important: people understand speech not only from audio information but also from visual information such as temporal changes in lip shape. A vision-based speech recognition method could work well in noisy places and could also be useful for people with hearing disabilities. In this paper, we propose an automatic lip-reading method for recognizing speech by using multimodal visual information, without using any audio information. First, the Active Shape Model (ASM) is used to track and detect the face and lips in a video sequence. Second, the shape, optical flow and spatial frequencies of the lip features are extracted from the lip region detected by the ASM. Next, the extracted multimodal features are ordered chronologically, and a Support Vector Machine is trained to learn and classify the spoken words. Experiments on classifying several words show promising results for the proposed method.
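
    The final classification step, in which chronologically ordered lip features are fed to a Support Vector Machine, can be sketched as follows. Random arrays stand in for the ASM shape, optical-flow and spatial-frequency features, and fixed-length clips are assumed for simplicity; nothing here reproduces the authors' exact pipeline.

```python
# Sketch: classify spoken words from chronologically ordered lip features
# with an SVM. Random arrays stand in for ASM shape, optical-flow and
# spatial-frequency features; fixed-length clips are assumed.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(2)
n_clips, n_frames, feat_per_frame = 120, 15, 12
# Each clip becomes one vector: per-frame features concatenated in temporal order.
X = rng.normal(size=(n_clips, n_frames * feat_per_frame))
y = rng.integers(0, 5, size=n_clips)          # 5 candidate words

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```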

  20. Speechreading development in deaf and hearing children: introducing a new Test of Child Speechreading (ToCS)

    PubMed Central

    Kyle, Fiona Elizabeth; Campbell, Ruth; Mohammed, Tara; Coleman, Mike; MacSweeney, Mairéad

    2016-01-01

    Purpose: We describe the development of a new Test of Child Speechreading (ToCS) specifically designed for use with deaf and hearing children. Speechreading is a skill which is required for deaf children to access the language of the hearing community. ToCS is a deaf-friendly, computer-based test that measures child speechreading (silent lipreading) at three psycholinguistic levels: words, sentences and short stories. The aims of the study were to standardize ToCS with deaf and hearing children and investigate the effects of hearing status, age and linguistic complexity on speechreading ability. Method: Eighty-six severely and profoundly deaf children and 91 hearing children aged between 5 and 14 years participated. The deaf children were from a range of language and communication backgrounds and their preferred mode of communication varied. Results: Speechreading skills significantly improved with age for both deaf and hearing children. There was no effect of hearing status on speechreading ability, and deaf and hearing children showed similar performance across all subtests of ToCS. Conclusions: The Test of Child Speechreading (ToCS) is a valid and reliable assessment of speechreading ability in school-aged children that can be used to measure individual differences in speechreading performance. PMID:23275416

  1. Auditory Midbrain Implant: Research and Development Towards a Second Clinical Trial

    PubMed Central

    Lim, Hubert H.; Lenarz, Thomas

    2015-01-01

    The cochlear implant is considered one of the most successful neural prostheses to date, which was made possible by visionaries who continued to develop the cochlear implant through multiple technological and clinical challenges. However, patients without a functional auditory nerve or implantable cochlea cannot benefit from a cochlear implant. The focus of the paper is to review the development and translation of a new type of central auditory prosthesis for this group of patients, which is known as the auditory midbrain implant (AMI) and is designed for electrical stimulation within the inferior colliculus. The rationale and results for the first AMI clinical study using a multi-site single-shank array will be presented initially. Although the AMI has achieved encouraging results in terms of safety and improvements in lip-reading capabilities and environmental awareness, it has not yet provided sufficient speech perception. Animal and human data will then be presented to show that a two-shank AMI array can potentially improve hearing performance by targeting specific neurons of the inferior colliculus. Modifications to the AMI array design, stimulation strategy, and surgical approach have been made that are expected to improve hearing performance in the patients implanted with a two-shank array in an upcoming clinical trial funded by the National Institutes of Health. Positive outcomes from this clinical trial will motivate new efforts and developments toward improving central auditory prostheses for those who cannot sufficiently benefit from cochlear implants. PMID:25613994

  2. Segmentation of human face using gradient-based approach

    NASA Astrophysics Data System (ADS)

    Baskan, Selin; Bulut, M. Mete; Atalay, Volkan

    2001-04-01

    This paper describes a method for automatic segmentation of facial features such as eyebrows, eyes, nose, mouth and ears in color images. This work is an initial step for a wide range of feature-based applications, such as face recognition, lip-reading, gender estimation, facial expression analysis, etc. The human face can be characterized by its skin color and nearly elliptical shape. For this purpose, face detection is performed using color and shape information. Uniform illumination is assumed. No restrictions on glasses, make-up, beard, etc. are imposed. Facial features are extracted using vertically and horizontally oriented gradient projections. The gradient of a minimum with respect to its neighboring maxima gives the boundaries of a facial feature. Each facial feature has a different horizontal characteristic. These characteristics were derived by extensive experimentation with many face images. Using fuzzy set theory, the similarity between the candidate and the feature characteristic under consideration is calculated. The gradient-based method is supplemented by anthropometrical information for robustness. Ear detection is performed using contour-based shape descriptors. The method detects the facial features and circumscribes each facial feature with the smallest rectangle possible. The AR database is used for testing. The developed method is also suitable for real-time systems.
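
    The vertically and horizontally oriented gradient projections used to localize features can be sketched in a few lines: gradient magnitudes are summed along rows and columns, and pronounced responses in the row projection indicate candidate bands for the eyebrows, eyes, nose and mouth. The minimum/maximum analysis and fuzzy matching described in the paper are omitted here, and all variable names are ours.

```python
# Sketch of gradient-projection feature localization on a grayscale face crop.
# Rows with high summed gradient magnitude are candidate eye/nose/mouth bands.
import numpy as np

def gradient_projections(gray):
    # gray: 2-D array of pixel intensities (already cropped to the face).
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    horizontal_proj = mag.sum(axis=1)   # one value per row
    vertical_proj = mag.sum(axis=0)     # one value per column
    return horizontal_proj, vertical_proj

def candidate_feature_rows(horizontal_proj, n=3):
    # Crude stand-in for the paper's minima/maxima analysis: take the rows
    # with the strongest horizontal gradient response.
    return np.argsort(horizontal_proj)[-n:][::-1]

face = np.random.default_rng(3).integers(0, 256, size=(96, 72))
h_proj, v_proj = gradient_projections(face)
print(candidate_feature_rows(h_proj))
```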

  3. Language access and theory of mind reasoning: evidence from deaf children in bilingual and oralist environments.

    PubMed

    Meristo, Marek; Falkman, Kerstin W; Hjelmquist, Erland; Tedoldi, Mariantonia; Surian, Luca; Siegal, Michael

    2007-09-01

    This investigation examined whether access to sign language as a medium for instruction influences theory of mind (ToM) reasoning in deaf children with similar home language environments. Experiment 1 involved 97 deaf Italian children ages 4-12 years: 56 were from deaf families and had LIS (Italian Sign Language) as their native language, and 41 had acquired LIS as late signers following contact with signers outside their hearing families. Children receiving bimodal/bilingual instruction in LIS together with Sign-Supported and spoken Italian significantly outperformed children in oralist schools in which communication was in Italian and often relied on lipreading. Experiment 2 involved 61 deaf children in Estonia and Sweden ages 6-16 years. On a wide variety of ToM tasks, bilingually instructed native signers in Estonian Sign Language and spoken Estonian succeeded at a level similar to age-matched hearing children. They outperformed bilingually instructed late signers and native signers attending oralist schools. Particularly for native signers, access to sign language in a bilingual environment may facilitate conversational exchanges that promote the expression of ToM by enabling children to monitor others' mental states effectively. PMID:17723042

  4. Auditory midbrain implant: a review.

    PubMed

    Lim, Hubert H; Lenarz, Minoo; Lenarz, Thomas

    2009-09-01

    The auditory midbrain implant (AMI) is a new hearing prosthesis designed for stimulation of the inferior colliculus in deaf patients who cannot sufficiently benefit from cochlear implants. The authors have begun clinical trials in which five patients have been implanted with a single shank AMI array (20 electrodes). The goal of this review is to summarize the development and research that has led to the translation of the AMI from a concept into the first patients. This study presents the rationale and design concept for the AMI as well as a summary of the animal safety and feasibility studies that were required for clinical approval. The authors also present the initial surgical, psychophysical, and speech results from the first three implanted patients. Overall, the results have been encouraging in terms of the safety and functionality of the implant. All patients obtain improvements in hearing capabilities on a daily basis. However, performance varies dramatically across patients depending on the implant location within the midbrain, with the best performer still not able to achieve open set speech perception without lip-reading cues. Stimulation of the auditory midbrain provides a wide range of level, spectral, and temporal cues, all of which are important for speech understanding, but they do not appear to sufficiently fuse together to enable open set speech perception with the currently used stimulation strategies. Finally, several issues and hypotheses for why current patients obtain limited speech perception, along with several feasible solutions for improving AMI implementation, are presented. PMID:19762428

  5. Prototype to product—developing a commercially viable neural prosthesis

    NASA Astrophysics Data System (ADS)

    Seligman, Peter

    2009-12-01

    The Cochlear implant or 'Bionic ear' is a device that enables people who do not get sufficient benefit from a hearing aid to communicate with the hearing world. The Cochlear implant is not an amplifier, but a device that electrically stimulates the auditory nerve in a way that crudely mimics normal hearing, thus providing a hearing percept. Many recipients are able to understand running speech without the help of lipreading. Cochlear implants have reached a stage of maturity where there are now 170 000 recipients implanted worldwide. The commercial development of these devices has occurred over the last 30 years. This development has been multidisciplinary, including audiologists, engineers, both mechanical and electrical, histologists, materials scientists, physiologists, surgeons and speech pathologists. This paper will trace the development of the device we have today, from the engineering perspective. The special challenges of designing an active device that will work in the human body for a lifetime will be outlined. These challenges include biocompatibility, extreme reliability, safety, patient fitting and surgical issues. It is emphasized that the successful development of a neural prosthesis requires the partnership of academia and industry.

  6. Electrically evoked hearing perception by functional neurostimulation of the central auditory system.

    PubMed

    Tatagiba, M; Gharabaghi, A

    2005-01-01

    Perceptional benefits and potential risks of electrical stimulation of the central auditory system are constantly changing due to ongoing developments and technical modifications. Therefore, we would like to introduce current treatment protocols and strategies that might have an impact on functional results of auditory brainstem implants (ABI) in profoundly deaf patients. Patients with bilateral tumours as a result of neurofibromatosis type 2 with complete dysfunction of the eighth cranial nerves are the most frequent candidates for auditory brainstem implants. Worldwide, about 300 patients have already received an ABI through a translabyrinthine or suboccipital approach supported by multimodality electrophysiological monitoring. Patient selection is based on disease course, clinical signs, audiological, radiological and psycho-social criteria. The ABI provides the patients with access to auditory information such as environmental sound awareness together with distinct hearing cues in speech. In addition, this device markedly improves speech reception in combination with lip-reading. Nonetheless, there is only limited open-set speech understanding. Results of hearing function are correlated with electrode design, number of activated electrodes, speech processing strategies, duration of pre-existing deafness and extent of brainstem deformation. Functional neurostimulation of the central auditory system by a brainstem implant is a safe and beneficial procedure, which may considerably improve the quality of life in patients suffering from deafness due to bilateral retrocochlear lesions. The auditory outcome may be improved by a new generation of microelectrodes capable of penetrating the surface of the brainstem to access more directly the auditory neurons. PMID:15986735

  7. The Perceptual Characteristics of Voice-Hallucinations in Deaf People: Insights into the Nature of Subvocal Thought and Sensory Feedback Loops

    PubMed Central

    Atkinson, Joanna R.

    2006-01-01

    The study of voice-hallucinations in deaf individuals, who exploit the visuomotor rather than auditory modality for communication, provides rare insight into the relationship between sensory experience and how “voices” are perceived. Relatively little is known about the perceptual characteristics of voice-hallucinations in congenitally deaf people who use lip-reading or sign language as their preferred means of communication. The existing literature on hallucinations in deaf people is reviewed, alongside consideration of how such phenomena may fit into explanatory subvocal articulation hypotheses proposed for auditory verbal hallucinations in hearing people. It is suggested that a failure in subvocal articulation processes may account for voice-hallucinations in both hearing and deaf people but that the distinct way in which hallucinations are experienced may be due to differences in a sensory feedback component, which is influenced by both auditory deprivation and language modality. This article highlights how the study of deaf people may inform wider understanding of auditory verbal hallucinations and subvocal processes generally. PMID:16510696

  8. Separation of Audio-Visual Speech Sources: A New Approach Exploiting the Audio-Visual Coherence of Speech Stimuli

    NASA Astrophysics Data System (ADS)

    Sodoyer, David; Schwartz, Jean-Luc; Girin, Laurent; Klinkisch, Jacob; Jutten, Christian

    2002-12-01

    We present a new approach to the source separation problem in the case of multiple speech signals. The method is based on the use of automatic lipreading: the objective is to extract an acoustic speech signal from other acoustic signals by exploiting its coherence with the speaker's lip movements. We consider the case of an additive stationary mixture of decorrelated sources, with no further assumptions on independence or non-Gaussian character. Firstly, we present a theoretical framework showing that it is indeed possible to separate a source when some of its spectral characteristics are provided to the system. Then we address the case of audio-visual sources. We show how, if a statistical model of the joint probability of visual and spectral audio input is learnt to quantify the audio-visual coherence, separation can be achieved by maximizing this probability. Finally, we present a number of separation results on a corpus of vowel-plosive-vowel sequences uttered by a single speaker, embedded in a mixture of other voices. We show that separation can be quite good for mixtures of 2, 3, and 5 sources. These results, while very preliminary, are encouraging and are discussed with respect to their potential complementarity with traditional pure audio separation or enhancement techniques.
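
    A much-simplified picture of using visually derived spectral characteristics to pull one voice out of a mixture is a Wiener-style soft mask built from that prediction. The sketch below assumes the visual stream has already been mapped to a per-frame power-spectrum estimate of the target speaker; it illustrates the general idea only and is not the separation criterion derived in the paper, which maximizes a learned audio-visual joint probability.

```python
# Sketch: soft-mask a mixture spectrogram using a power spectrum predicted
# from the speaker's lip movements. Illustrative only; not the paper's method.
import numpy as np

def visually_guided_mask(mix_stft, target_power_est, floor=1e-8):
    # mix_stft: complex (freq, time) mixture; target_power_est: (freq, time)
    # power spectrum predicted from visual features for the target speaker.
    mix_power = np.abs(mix_stft) ** 2
    interference_est = np.maximum(mix_power - target_power_est, floor)
    mask = target_power_est / (target_power_est + interference_est)
    return mask * mix_stft              # estimated target STFT

rng = np.random.default_rng(4)
mix = rng.normal(size=(257, 50)) + 1j * rng.normal(size=(257, 50))
pred = np.abs(rng.normal(size=(257, 50))) ** 2
print(visually_guided_mask(mix, pred).shape)
```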

  9. Auditory Midbrain Implant: A Review

    PubMed Central

    Lim, Hubert H.; Lenarz, Minoo; Lenarz, Thomas

    2009-01-01

    The auditory midbrain implant (AMI) is a new hearing prosthesis designed for stimulation of the inferior colliculus in deaf patients who cannot sufficiently benefit from cochlear implants. The authors have begun clinical trials in which five patients have been implanted with a single shank AMI array (20 electrodes). The goal of this review is to summarize the development and research that has led to the translation of the AMI from a concept into the first patients. This study presents the rationale and design concept for the AMI as well as a summary of the animal safety and feasibility studies that were required for clinical approval. The authors also present the initial surgical, psychophysical, and speech results from the first three implanted patients. Overall, the results have been encouraging in terms of the safety and functionality of the implant. All patients obtain improvements in hearing capabilities on a daily basis. However, performance varies dramatically across patients depending on the implant location within the midbrain, with the best performer still not able to achieve open set speech perception without lip-reading cues. Stimulation of the auditory midbrain provides a wide range of level, spectral, and temporal cues, all of which are important for speech understanding, but they do not appear to sufficiently fuse together to enable open set speech perception with the currently used stimulation strategies. Finally, several issues and hypotheses for why current patients obtain limited speech perception, along with several feasible solutions for improving AMI implementation, are presented. PMID:19762428

  10. Hearing Loss: Communicating With the Patient Who Is Deaf or Hard of Hearing.

    PubMed

    McKee, Michael M; Moreland, Christopher; Atcherson, Samuel R; Zazove, Philip

    2015-07-01

    Hearing loss impairs health care communication and adversely affects patient satisfaction, treatment adherence, and use of health services. Hearing loss is the third most common chronic health condition among older patients after hypertension and arthritis, but only 15% to 18% of older adults are screened for hearing loss during health maintenance examinations. Patients with hearing loss may be reluctant to disclose it because of fear of ageism, perceptions of disability, and vanity. Lipreading and note writing often are ineffective ways to communicate with deaf and hard of hearing (DHH) patients who use American Sign Language; use of medical sign language interpreters is preferred. A variety of strategies can improve the quality of health care communication for DHH patients, such as the physician facing the patient, listening attentively, and using visual tools. Physicians should learn what hearing loss means to the DHH patient. Deaf American Sign Language users may not perceive hearing loss as a disability but as a cultural identity. Patients' preferred communication strategies will vary. Relay services, electronic communication, and other telecommunications methods can be helpful, but family physicians and medical staff should learn from each DHH patient about which communication strategies will work best. PMID:26161525

  11. [Central auditory prosthesis].

    PubMed

    Lenarz, T; Lim, H; Joseph, G; Reuter, G; Lenarz, M

    2009-06-01

    Deaf patients with severe sensory hearing loss can benefit from a cochlear implant (CI), which stimulates the auditory nerve fibers. However, patients who do not have an intact auditory nerve cannot benefit from a CI. The majority of these patients are neurofibromatosis type 2 (NF2) patients who developed neural deafness due to growth or surgical removal of a bilateral acoustic neuroma. The only current solution is the auditory brainstem implant (ABI), which stimulates the surface of the cochlear nucleus in the brainstem. Although the ABI provides improvement in environmental awareness and lip-reading capabilities, only a few NF2 patients have achieved some limited open set speech perception. In the search for alternative procedures our research group in collaboration with Cochlear Ltd. (Australia) developed a human prototype auditory midbrain implant (AMI), which is designed to electrically stimulate the inferior colliculus (IC). The IC has the potential as a new target for an auditory prosthesis as it provides access to neural projections necessary for speech perception as well as a systematic map of spectral information. In this paper the present status of research and development in the field of central auditory prostheses is presented with respect to technology, surgical technique and hearing results as well as the background concepts of ABI and AMI. PMID:19517084

  12. Cochlear implantation in children and adults in Switzerland.

    PubMed

    Brand, Yves; Senn, Pascal; Kompis, Martin; Dillier, Norbert; Allum, John H J

    2014-01-01

    The cochlear implant (CI) is one of the most successful neural prostheses developed to date. It offers artificial hearing to individuals with profound sensorineural hearing loss and with insufficient benefit from conventional hearing aids. The first implants available some 30 years ago provided a limited sensation of sound. The benefit for users of these early systems was mostly a facilitation of lip-reading based communication rather than an understanding of speech. Considerable progress has been made since then. Modern, multichannel implant systems feature complex speech processing strategies, high stimulation rates and multiple sites of stimulation in the cochlea. Equipped with such a state-of-the-art system, the majority of recipients today can communicate orally without visual cues and can even use the telephone. The impact of CIs on deaf individuals and on the deaf community has thus been exceptional. To date, more than 300,000 patients worldwide have received CIs. In Switzerland, the first implantation was performed in 1977 and, as of 2012, over 2,000 systems have been implanted with a current rate of around 150 CIs per year. The primary purpose of this article is to provide a contemporary overview of cochlear implantation, emphasising the situation in Switzerland. PMID:24496729

  13. The deaf utilize phonological representations in visually presented verbal memory tasks.

    PubMed

    Okada, Rieko; Nakagawa, Jun; Takahashi, Muneyoshi; Kanaka, Noriko; Fukamauchi, Fumihiko; Watanabe, Katsumi; Namatame, Miki; Matsuda, Tetsuya

    2015-01-01

    The phonological abilities of congenitally deaf individuals are inferior to those of people who can hear. However, deaf individuals can acquire spoken languages by utilizing orthography and lip-reading. The present study used functional magnetic resonance imaging (fMRI) to show that deaf individuals utilize phonological representations via a mnemonic process. We compared the brain activation of deaf and hearing participants while they memorized serially visually presented Japanese kana letters (Kana), finger alphabets (Finger), and Arabic letters (Arabic). Hearing participants did not know which finger alphabets corresponded to which language sounds, whereas deaf participants did. All of the participants understood the correspondence between Kana and their language sounds. None of the participants knew the correspondence between Arabic and their language sounds, so this condition was used as a baseline. We found that the left superior temporal gyrus (STG) was activated by phonological representations in the deaf group when memorizing both Kana and Finger. Additionally, the brain areas associated with phonological representations for Finger in the deaf group were the same as the areas for Kana in the hearing group. Overall, despite the fact that they are superior in visual information processing, deaf individuals utilize phonological rather than visual representations in visually presented verbal memory. PMID:25498951

  14. From Mimicry to Language: A Neuroanatomically Based Evolutionary Model of the Emergence of Vocal Language

    PubMed Central

    Poliva, Oren

    2016-01-01

    The auditory cortex communicates with the frontal lobe via the middle temporal gyrus (auditory ventral stream; AVS) or the inferior parietal lobule (auditory dorsal stream; ADS). Whereas the AVS is ascribed only with sound recognition, the ADS is ascribed with sound localization, voice detection, prosodic perception/production, lip-speech integration, phoneme discrimination, articulation, repetition, phonological long-term memory and working memory. Previously, I interpreted the juxtaposition of sound localization, voice detection, audio-visual integration and prosodic analysis, as evidence that the behavioral precursor to human speech is the exchange of contact calls in non-human primates. Herein, I interpret the remaining ADS functions as evidence of additional stages in language evolution. According to this model, the role of the ADS in vocal control enabled early Homo (Hominans) to name objects using monosyllabic calls, and allowed children to learn their parents' calls by imitating their lip movements. Initially, the calls were forgotten quickly but gradually were remembered for longer periods. Once the representations of the calls became permanent, mimicry was limited to infancy, and older individuals encoded in the ADS a lexicon for the names of objects (phonological lexicon). Consequently, sound recognition in the AVS was sufficient for activating the phonological representations in the ADS and mimicry became independent of lip-reading. Later, by developing inhibitory connections between acoustic-syllabic representations in the AVS and phonological representations of subsequent syllables in the ADS, Hominans became capable of concatenating the monosyllabic calls for repeating polysyllabic words (i.e., developed working memory). Finally, due to strengthening of connections between phonological representations in the ADS, Hominans became capable of encoding several syllables as a single representation (chunking). Consequently, Hominans began vocalizing and

  15. How can audiovisual pathways enhance the temporal resolution of time-compressed speech in blind subjects?

    PubMed Central

    Hertrich, Ingo; Dietrich, Susanne; Ackermann, Hermann

    2013-01-01

    In blind people, the visual channel cannot assist face-to-face communication via lipreading or visual prosody. Nevertheless, the visual system may enhance the evaluation of auditory information due to its cross-links to (1) the auditory system, (2) supramodal representations, and (3) frontal action-related areas. Apart from feedback or top-down support of, for example, the processing of spatial or phonological representations, experimental data have shown that the visual system can impact auditory perception at more basic computational stages such as temporal signal resolution. For example, blind as compared to sighted subjects are more resistant to backward masking, and this ability appears to be associated with activity in visual cortex. Regarding the comprehension of continuous speech, blind subjects can learn to use accelerated text-to-speech systems for “reading” texts at ultra-fast speaking rates (>16 syllables/s), far exceeding the normal range of 6 syllables/s. A functional magnetic resonance imaging study has shown that this ability significantly covaries with BOLD responses in several brain regions, including bilateral pulvinar, right visual cortex, and left supplementary motor area. Furthermore, magnetoencephalographic measurements revealed a particular component in right occipital cortex phase-locked to the syllable onsets of accelerated speech. In sighted people, the “bottleneck” for understanding time-compressed speech seems related to higher demands for buffering phonological material and is, presumably, linked to frontal brain structures. On the other hand, the neurophysiological correlates of functions overcoming this bottleneck seem to depend upon early visual cortex activity. The present Hypothesis and Theory paper outlines a model that aims at binding these data together, based on early cross-modal pathways that are already known from various audiovisual experiments on cross-modal adjustments during space, time, and object

  17. Auditory midbrain implant: research and development towards a second clinical trial.

    PubMed

    Lim, Hubert H; Lenarz, Thomas

    2015-04-01

    The cochlear implant is considered one of the most successful neural prostheses to date, which was made possible by visionaries who continued to develop the cochlear implant through multiple technological and clinical challenges. However, patients without a functional auditory nerve or implantable cochlea cannot benefit from a cochlear implant. The focus of the paper is to review the development and translation of a new type of central auditory prosthesis for this group of patients that is known as the auditory midbrain implant (AMI) and is designed for electrical stimulation within the inferior colliculus. The rationale and results for the first AMI clinical study using a multi-site single-shank array will be presented initially. Although the AMI has achieved encouraging results in terms of safety and improvements in lip-reading capabilities and environmental awareness, it has not yet provided sufficient speech perception. Animal and human data will then be presented to show that a two-shank AMI array can potentially improve hearing performance by targeting specific neurons of the inferior colliculus. A new two-shank array, stimulation strategy, and surgical approach are planned for the AMI that are expected to improve hearing performance in the patients who will be implanted in an upcoming clinical trial funded by the National Institutes of Health. Positive outcomes from this clinical trial will motivate new efforts and developments toward improving central auditory prostheses for those who cannot sufficiently benefit from cochlear implants. This article is part of a Special Issue entitled . PMID:25613994

  18. [Extensive hearing loss and deafness in adults].

    PubMed

    Laszig, R

    1993-09-01

    Hearing and understanding are two related, yet different processes. Hearing is the perception of sound. It can be of enormous value to patients with severely impaired hearing, as it facilitates acoustic orientation. An understanding of speech, however, remains virtually impossible for most of these patients. Nevertheless, early habituation to their acoustic situation makes lip-reading much easier, thus making conversation possible in good listening environments. Severely impaired patients, however, are still not in a position to follow conversation in larger groups. Even with hearing aids employing the latest technology, sufficient help is not always provided. Before making a decision on the use of these technical aids, the ENT specialist should discuss the needs of the particular individual with the hearing-aid specialist. Provided that residual hearing can be used to understand speech with the help of a hearing aid, intracochlear implantation of an electronic prosthesis is not indicated. A cochlear implant is indicated when there is a complete hearing loss on both sides. Such a profound loss means that sufficient understanding of speech is no longer possible, even with the assistance of the latest hearing aids. For most adults, deafness is a postlingual phenomenon. Adults who were born deaf or who lost their hearing in childhood tend to be unsuitable for cochlear implantation. Up to the age of six years, however, children born deaf can benefit considerably from a cochlear implant. Children who are provided with a cochlear implant shortly after becoming deaf also have a good chance of being capable of learning and understanding speech. PMID:8273025

  19. Visual speech discrimination and identification of natural and synthetic consonant stimuli

    PubMed Central

    Files, Benjamin T.; Tjan, Bosco S.; Jiang, Jintao; Bernstein, Lynne E.

    2015-01-01

    From phonetic features to connected discourse, every level of psycholinguistic structure including prosody can be perceived through viewing the talking face. Yet a longstanding notion in the literature is that visual speech perceptual categories comprise groups of phonemes (referred to as visemes), such as /p, b, m/ and /f, v/, whose internal structure is not informative to the visual speech perceiver. This conclusion has not to our knowledge been evaluated using a psychophysical discrimination paradigm. We hypothesized that perceivers can discriminate the phonemes within typical viseme groups, and that discrimination measured with d-prime (d’) and response latency is related to visual stimulus dissimilarities between consonant segments. In Experiment 1, participants performed speeded discrimination for pairs of consonant-vowel spoken nonsense syllables that were predicted to be same, near, or far in their perceptual distances, and that were presented as natural or synthesized video. Near pairs were within-viseme consonants. Natural within-viseme stimulus pairs were discriminated significantly above chance (except for /k/-/h/). Sensitivity (d’) increased and response times decreased with distance. Discrimination and identification were superior with natural stimuli, which comprised more phonetic information. We suggest that the notion of the viseme as a unitary perceptual category is incorrect. Experiment 2 probed the perceptual basis for visual speech discrimination by inverting the stimuli. Overall reductions in d’ with inverted stimuli but a persistent pattern of larger d’ for far than for near stimulus pairs are interpreted as evidence that visual speech is represented by both its motion and configural attributes. The methods and results of this investigation open up avenues for understanding the neural and perceptual bases for visual and audiovisual speech perception and for development of practical applications such as visual lipreading
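
    The sensitivity measure used in Experiment 1, d', can be illustrated with a short, generic sketch that converts raw hit and false-alarm counts into d'. This is not the authors' analysis code; the simple yes/no (independent-observation) formula and the correction for extreme proportions are assumptions, and same-different data are often analyzed with a differencing model instead.

        # Minimal sketch: d' (d-prime) from response counts in a discrimination task.
        # Illustrative only; the clamping of extreme proportions is one common
        # convention, not necessarily the one used in the study.
        from statistics import NormalDist

        def dprime(hits, misses, false_alarms, correct_rejections):
            """Yes/no (independent-observation) d' from raw response counts."""
            n_signal = hits + misses
            n_noise = false_alarms + correct_rejections
            # Clamp rates away from 0 and 1 to avoid infinite z-scores.
            hit_rate = min(max(hits / n_signal, 0.5 / n_signal), 1 - 0.5 / n_signal)
            fa_rate = min(max(false_alarms / n_noise, 0.5 / n_noise), 1 - 0.5 / n_noise)
            z = NormalDist().inv_cdf
            return z(hit_rate) - z(fa_rate)

        # Hypothetical example: 42 of 50 "different" pairs detected,
        # 12 false alarms on 50 "same" pairs.
        print(round(dprime(42, 8, 12, 38), 2))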

  20. Brain networks engaged in audiovisual integration during speech perception revealed by persistent homology-based network filtration.

    PubMed

    Kim, Heejung; Hahm, Jarang; Lee, Hyekyoung; Kang, Eunjoo; Kang, Hyejin; Lee, Dong Soo

    2015-05-01

    The human brain naturally integrates audiovisual information to improve speech perception. However, in noisy environments, understanding speech is difficult and may require much effort. Although a brain network is presumably engaged in speech perception, it is unclear how speech-related brain regions are connected during natural bimodal audiovisual or unimodal speech perception with irrelevant noise in the counterpart modality. To investigate the topological changes of speech-related brain networks at all possible thresholds, we used a persistent homology framework based on hierarchical clustering (single-linkage distance) to analyze the connected components of the functional network during speech perception using functional magnetic resonance imaging. Bimodal (audio-visual) speech cues or unimodal speech cues with irrelevant noise in the counterpart modality (auditory white noise or visual gum-chewing) were delivered to 15 subjects. In terms of positive relationships, similar connected components were observed in the bimodal and unimodal speech conditions during filtration. However, during speech perception with congruent audiovisual stimuli, tighter couplings of the left anterior temporal gyrus-anterior insula component and of right premotor-visual components were observed than in the auditory-only or visual-only speech cue conditions, respectively. Interestingly, visual speech perceived under white noise was associated with tight negative coupling between the left inferior frontal region and the right anterior cingulate, the left anterior insula, and bilateral visual regions, including right middle temporal gyrus and right fusiform components. In conclusion, the speech brain network is tightly positively or negatively connected and can reflect efficient or effortful processing during natural audiovisual integration or lip-reading, respectively, in speech perception. PMID:25495216
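
    The network filtration described above can be pictured with a small sketch: a dissimilarity matrix is derived from inter-regional correlations, single-linkage clustering provides the 0-dimensional filtration, and connected components are counted as the threshold grows. The toy signals, the 1 - correlation distance, and the thresholds are assumptions for illustration, not the authors' pipeline.

        # Sketch of a single-linkage filtration over a functional connectivity matrix:
        # regions merge into connected components as the distance threshold increases.
        # Toy data and the 1 - correlation distance are assumptions.
        import numpy as np
        from scipy.cluster.hierarchy import linkage, fcluster
        from scipy.spatial.distance import squareform

        rng = np.random.default_rng(0)
        signals = rng.standard_normal((120, 6))      # 120 time points, 6 regions
        signals[:, 1] += 0.8 * signals[:, 0]         # induce coupling between regions 0 and 1
        signals[:, 4] += 0.8 * signals[:, 3]         # ... and between regions 3 and 4

        corr = np.corrcoef(signals, rowvar=False)
        dist = 1.0 - corr                            # dissimilarity between regions

        # Single-linkage dendrogram = 0-dimensional filtration of the network.
        Z = linkage(squareform(dist, checks=False), method="single")

        # Count connected components at a few thresholds along the filtration.
        for thr in (0.2, 0.6, 1.0):
            labels = fcluster(Z, t=thr, criterion="distance")
            print(f"threshold {thr:.1f}: {labels.max()} connected component(s)")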

  2. Using space and time to encode vibrotactile information: toward an estimate of the skin's achievable throughput.

    PubMed

    Novich, Scott D; Eagleman, David M

    2015-10-01

    Touch receptors in the skin can relay various forms of abstract information, such as words (Braille), haptic feedback (cell phones, game controllers, feedback for prosthetic control), and basic visual information such as edges and shape (sensory substitution devices). The skin can support such applications with ease: They are all low bandwidth and do not require a fine temporal acuity. But what of high-throughput applications? We use sound-to-touch conversion as a motivating example, though others abound (e.g., vision, stock market data). In the past, vibrotactile hearing aids have demonstrated improvement in speech perception in the deaf. However, a sound-to-touch sensory substitution device that works with high efficacy and without the aid of lipreading has yet to be developed. Is this because the skin simply does not have the capacity to effectively relay high-throughput streams such as sound? Or is this because the spatial and temporal properties of skin have not been leveraged to full advantage? Here, we begin to address these questions with two experiments. First, we seek to determine the best method of relaying information through the skin using an identification task on the lower back. We find that vibrotactile patterns encoding information in both space and time yield the best overall information transfer estimate. Patterns encoded in space and time or "intensity" (the coupled coding of vibration frequency and force) both far exceed the performance of purely spatially encoded patterns. Next, we determine the vibrotactile two-tacton resolution on the lower back: the distance necessary for resolving two vibrotactile patterns. We find that our vibratory motors conservatively require at least 6 cm of separation to resolve two independent tactile patterns (>80% correct), regardless of stimulus type (e.g., spatiotemporal "sweeps" versus single vibratory pulses). Six centimeters is a greater distance than the inter-motor distance used in Experiment 1 (2.5 cm), which
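
    The information transfer estimate mentioned above is commonly computed as the mutual information of the stimulus-response confusion matrix from the identification task. The sketch below shows that computation on an invented confusion matrix; the counts and the four-pattern design are assumptions, not the study's data.

        # Sketch: information transfer (bits) from a stimulus-response confusion
        # matrix, as is conventional for identification tasks. The matrix is invented.
        import numpy as np

        def information_transfer(confusions):
            """Mutual information (bits) of a stimulus x response count matrix."""
            counts = np.asarray(confusions, dtype=float)
            joint = counts / counts.sum()                # P(stimulus, response)
            p_stim = joint.sum(axis=1, keepdims=True)    # P(stimulus)
            p_resp = joint.sum(axis=0, keepdims=True)    # P(response)
            with np.errstate(divide="ignore", invalid="ignore"):
                terms = joint * np.log2(joint / (p_stim * p_resp))
            return np.nansum(terms)                      # zero cells contribute nothing

        # Hypothetical 4-pattern identification task, 40 trials per pattern.
        confusion = [[32,  4,  2,  2],
                     [ 5, 30,  3,  2],
                     [ 2,  3, 31,  4],
                     [ 1,  3,  4, 32]]
        print(f"estimated information transfer: {information_transfer(confusion):.2f} bits")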

  3. Home-based Early Intervention on Auditory and Speech Development in Mandarin-speaking Deaf Infants and Toddlers with Chronological Aged 7–24 Months

    PubMed Central

    Yang, Ying; Liu, Yue-Hui; Fu, Ming-Fu; Li, Chun-Lin; Wang, Li-Yan; Wang, Qi; Sun, Xi-Bin

    2015-01-01

    Background: Data on early auditory and speech development during home-based early intervention for infants and toddlers with hearing loss younger than 2 years are still sparse in China. This study aimed to observe auditory and speech development in deaf infants and toddlers who were fitted with hearing aids and/or received cochlear implants between the chronological ages of 7 and 24 months, and to analyze the effects of chronological age and recovery time on auditory and speech development in the course of home-based early intervention. Methods: This longitudinal study included 55 hearing-impaired children with severe and profound binaural deafness, who were divided into Group A (7–12 months), Group B (13–18 months), and Group C (19–24 months) based on chronological age. The Categories of Auditory Performance (CAP) and the Speech Intelligibility Rating (SIR) scale were used to evaluate auditory and speech development at baseline and at 3, 6, 9, 12, 18, and 24 months of habilitation. Descriptive statistics were used to summarize demographic features, and outcomes were analyzed by repeated measures analysis of variance. Results: After 24 months of hearing intervention, 78% of the patients were able to understand common phrases and conversation without lip-reading, and 96% were intelligible to a listener. Children in all three groups showed rapid growth trends in each period of habilitation. CAP and SIR scores developed rapidly within 24 months after device fitting in Group A, which showed significantly better auditory and speech abilities than Group B (P < 0.05) and Group C (P < 0.05). Group B achieved better results than Group C, but the difference between Group B and Group C was not significant (P > 0.05). Conclusions: The data suggest that early hearing intervention and home-based habilitation benefit auditory and speech development. Chronological age and recovery time may be major factors for aural-verbal outcomes in hearing-impaired children
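
    The group-by-time comparison described in the Methods can be approximated with a simple longitudinal model. The sketch below fits a linear mixed model with a random intercept per child to simulated CAP scores, standing in for the repeated measures analysis of variance; the simulated values, group effects, and score ceiling are assumptions, not the study's data or code.

        # Sketch: group x time analysis of longitudinal CAP scores with a mixed model.
        # All data are simulated for illustration.
        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(1)
        months = [0, 3, 6, 9, 12, 18, 24]
        rows = []
        for group, offset in (("A", 1.0), ("B", 0.5), ("C", 0.0)):
            for subj in range(18):                       # roughly 55 children in total
                baseline = rng.normal(1.0, 0.5)
                for t in months:
                    cap = baseline + offset + 0.25 * t / 3 + rng.normal(0, 0.5)
                    rows.append({"subject": f"{group}{subj}", "group": group,
                                 "months": t, "CAP": min(cap, 7.0)})   # CAP capped at 7
        df = pd.DataFrame(rows)

        # Fixed effects for group, time, and their interaction; random intercept per child.
        result = smf.mixedlm("CAP ~ months * group", df, groups=df["subject"]).fit()
        print(result.summary())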

  4. The multi-channel cochlear implant: past, present and future perspectives.

    PubMed

    Clark, Graeme

    2009-01-01

    Initial research demonstrated that only low frequencies could be mimicked with the rate of electrical stimulation, and thus multi-channel rather than single-channel stimulation was required for the place coding of the mid-to-high speech frequencies. Place coding of mid-to-high frequencies was best achieved with electrodes inside the cochlea. Furthermore, correct biomechanical properties of a multiple-electrode bundle were required for it to pass around the cochlear spiral to the speech frequency region. Biological studies also showed that intra-cochlear electrodes could be used with minimal trauma, safe electrical stimulus parameters, and methods to prevent inner ear infection and meningitis. The crucial discoveries for coding speech with electrical stimulation were: 1) that the brain processes frequency information along spatial and temporal channels, and 2) that the first patient experienced vowels when stimulating different electrodes corresponding to the places of excitation for single-formant vowels in people with normal hearing. The inaugural and subsequent speech processing strategies extracted the frequencies of special importance for speech intelligibility and transmitted this information along place-coded channels. The voicing frequency and/or amplitude was coded as temporal information across these spatial channels. As a result, a great majority of severely-to-profoundly deaf people with previous hearing can communicate not only when electrical stimulation is combined with lipreading, but also with electrical stimulation alone. In addition, the benefits of binaural hearing with bilateral cochlear implants, or with an implant in one ear and a hearing aid in the other ear, have been realized. Related psychophysical research has discovered the basic perceptual skills that process the complex patterns of brain excitation underlying speech recognition with one ear as well as with bilateral implants. In addition, the development of the perceptual
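
    The place-coding principle summarised above, in which speech is divided into frequency bands whose energy is delivered to tonotopically ordered electrodes, can be sketched as a simple channel-vocoder-style analysis. The band edges, filter orders, envelope smoothing, and test signal below are illustrative assumptions and do not represent any manufacturer's actual processing strategy.

        # Sketch of place coding in a multi-channel implant processor: split speech
        # into frequency bands, extract each band's envelope, and map the envelopes
        # to tonotopically ordered electrodes. Band edges and filters are assumptions.
        import numpy as np
        from scipy.signal import butter, sosfiltfilt, hilbert

        def band_envelopes(speech, fs, edges):
            """Return one smoothed envelope per analysis band (per electrode)."""
            envelopes = []
            for low, high in zip(edges[:-1], edges[1:]):
                band_sos = butter(4, [low, high], btype="bandpass", fs=fs, output="sos")
                band = sosfiltfilt(band_sos, speech)
                env = np.abs(hilbert(band))                  # instantaneous envelope
                smooth_sos = butter(2, 200, btype="lowpass", fs=fs, output="sos")
                envelopes.append(sosfiltfilt(smooth_sos, env))
            return np.array(envelopes)                       # (n_channels, n_samples)

        # Synthetic vowel-like test signal: 120 Hz fundamental plus two formant peaks.
        fs = 16000
        t = np.arange(0, 0.5, 1 / fs)
        speech = (np.sin(2 * np.pi * 120 * t)
                  + 0.5 * np.sin(2 * np.pi * 700 * t)
                  + 0.3 * np.sin(2 * np.pi * 1800 * t))

        edges = [200, 400, 800, 1600, 3200, 6400]            # 5 channels, low to high frequency
        env = band_envelopes(speech, fs, edges)
        print("per-electrode mean envelope:", np.round(env.mean(axis=1), 3))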