Sample records for lipreading

  1. Speech Perception Results: Audition and Lipreading Enhancement.

    ERIC Educational Resources Information Center

    Geers, Ann; Brenner, Chris


    This paper describes changes in speech perception performance of deaf children using cochlear implants, tactile aids, or conventional hearing aids over a three-year period. Eleven of the 13 children with cochlear implants were able to identify words on the basis of auditory consonant cues. Significant lipreading enhancement was also achieved with…

  2. The Development of Generative Lipreading Skills in Deaf Persons Using Cued Speech Training.

    ERIC Educational Resources Information Center

    Neef, Nancy A.; Iwata, Brian A.


    Evaluation of the effects of cued speech on lipreading performance of two deaf males indicated that Ss were able to accurately lipread cued stimuli after cued speech training and that generalization of lipreading skills to novel nonsense syllables occurred. Cued speech training also appeared to facilitate lipreading performance with noncued…

  3. Lipreading in School-Age Children: The Roles of Age, Hearing Status, and Cognitive Ability

    ERIC Educational Resources Information Center

    Tye-Murray, Nancy; Hale, Sandra; Spehar, Brent; Myerson, Joel; Sommers, Mitchell S.


    Purpose: The study addressed three research questions: Does lipreading improve between the ages of 7 and 14 years? Does hearing loss affect the development of lipreading? How do individual differences in lipreading relate to other abilities? Method: Forty children with normal hearing (NH) and 24 with hearing loss (HL) were tested using 4…

  4. The effects of age and gender on lipreading abilities.


    Tye-Murray, Nancy; Sommers, Mitchell S; Spehar, Brent


    Age-related declines for many sensory and cognitive abilities are greater for males than for females. The primary purpose of the present investigation was to consider whether age-related changes in lipreading abilities are similar for men and women by comparing the lipreading abilities of separate groups of younger and older adults. Older females, older males, younger females and younger males completed vision-only speech recognition tests of: (1) 13 consonants in a vocalic /i/-C-/i/ environment; (2) words in a carrier phrase; and (3) meaningful sentences. In addition to percent correct performance, consonant data were analyzed for performance within viseme categories. The results suggest that while older adults do not lipread as well as younger adults, the difference between older and younger participants was comparable across gender. We also found no differences in the lipreading abilities of males and females, regardless of stimulus type (i.e., consonants, words, sentences), a finding that differs from some reports by previous investigators (e.g., Dancer, Krain, Thompson, Davis, & Glenn, 1994).

  5. Lip-Reading by Deaf and Hearing Children

    ERIC Educational Resources Information Center

    Conrad, R.


    A group of profoundly deaf 15-year-old subjects with no other handicap and of average non-verbal intelligence were given a lip-reading test. The same test was given to comparable hearing subjects "deafened" by white noise masking. The difference between the groups was not significant. (Editor)

  6. Lip-reading abilities in a subject with congenital prosopagnosia.


    Wathour, J; Decat, M; Vander Linden, F; Deggouj, N


    We present the case of an individual with congenital prosopagnosia or "face blindness", a disorder where the ability to recognize faces is impaired. We studied the lip-reading ability and audiovisual perception of this subject using a DVD with four conditions (audiovisual congruent, auditory, visual, and audiovisual incongruent) and compared results with a normal patient cohort. The patient had no correct responses in the visual lip-reading task, whereas he improved in the audiovisual congruent task. In the audiovisual incongruent task, the patient provided one response; thus, he was able to lip-read (that is, to use labial information). This patient perceived only global dynamic facial movements, not the fine ones. He had a sufficient complementary use of lip-reading in audiovisual tasks, but not visual ones. These data are consistent with abnormal development of the pathways used for visual speech perception and associated with second-order face processing disorders and normal development of the audiovisual network for speech perception.

  7. The effect of lip-reading on primary stream segregation.


    Devergie, Aymeric; Grimault, Nicolas; Gaudrain, Etienne; Healy, Eric W; Berthommier, Frédéric


    Lip-reading has been shown to improve the intelligibility of speech in multitalker situations, where auditory stream segregation naturally takes place. This study investigated whether the benefit of lip-reading is a result of a primary audiovisual interaction that enhances the obligatory streaming mechanism. Two behavioral experiments were conducted involving sequences of French vowels that alternated in fundamental frequency. In Experiment 1, subjects attempted to identify the order of items in a sequence. In Experiment 2, subjects attempted to detect a disruption to temporal isochrony across alternate items. Both tasks are disrupted by streaming, thus providing a measure of primary or obligatory streaming. Visual lip gestures articulating alternate vowels were synchronized with the auditory sequence. Overall, the results were consistent with the hypothesis that visual lip gestures enhance segregation by affecting primary auditory streaming. Moreover, increases in the naturalness of visual lip gestures and auditory vowels, and corresponding increases in audiovisual congruence may potentially lead to increases in the effect of visual lip gestures on streaming.

  8. Experience with a talker can transfer across modalities to facilitate lipreading.


    Sanchez, Kauyumari; Dias, James W; Rosenblum, Lawrence D


    Rosenblum, Miller, and Sanchez (Psychological Science, 18, 392-396, 2007) found that subjects first trained to lip-read a particular talker were then better able to perceive the auditory speech of that same talker, as compared with that of a novel talker. This suggests that the talker experience a perceiver gains in one sensory modality can be transferred to another modality to make that speech easier to perceive. An experiment was conducted to examine whether this cross-sensory transfer of talker experience could occur (1) from auditory to lip-read speech, (2) with subjects not screened for adequate lipreading skill, (3) when both a familiar and an unfamiliar talker are presented during lipreading, and (4) for both old (presentation set) and new words. Subjects were first asked to identify a set of words from a talker. They were then asked to perform a lipreading task from two faces, one of which was of the same talker they heard in the first phase of the experiment. Results revealed that subjects who lip-read from the same talker they had heard performed better than those who lip-read a different talker, regardless of whether the words were old or new. These results add further evidence that learning of amodal talker information can facilitate speech perception across modalities and also suggest that this information is not restricted to previously heard words.

  9. Visual Cortical Entrainment to Motion and Categorical Speech Features during Silent Lipreading

    PubMed Central

    O’Sullivan, Aisling E.; Crosse, Michael J.; Di Liberto, Giovanni M.; Lalor, Edmund C.


    Speech is a multisensory percept, comprising an auditory and visual component. While the content and processing pathways of audio speech have been well characterized, the visual component is less well understood. In this work, we expand current methodologies using system identification to introduce a framework that facilitates the study of visual speech in its natural, continuous form. Specifically, we use models based on the unheard acoustic envelope (E), the motion signal (M) and categorical visual speech features (V) to predict EEG activity during silent lipreading. Our results show that each of these models performs similarly at predicting EEG in visual regions and that respective combinations of the individual models (EV, MV, EM and EMV) provide an improved prediction of the neural activity over their constituent models. In comparing these different combinations, we find that the model incorporating all three types of features (EMV) outperforms the individual models, as well as both the EV and MV models, while it performs similarly to the EM model. Importantly, EM does not outperform EV and MV, which, considering the higher dimensionality of the V model, suggests that more data is needed to clarify this finding. Nevertheless, the performance of EMV, and comparisons of the subject performances for the three individual models, provides further evidence to suggest that visual regions are involved in both low-level processing of stimulus dynamics and categorical speech perception. This framework may prove useful for investigating modality-specific processing of visual speech under naturalistic conditions. PMID:28123363

  10. Audiovisual Speech Integration and Lipreading in Autism

    ERIC Educational Resources Information Center

    Smith, Elizabeth G.; Bennetto, Loisa


    Background: During speech perception, the ability to integrate auditory and visual information causes speech to sound louder and be more intelligible, and leads to quicker processing. This integration is important in early language development, and also continues to affect speech comprehension throughout the lifespan. Previous research shows that…

  11. Study of lip-reading detecting and locating technique

    NASA Astrophysics Data System (ADS)

    Wang, Lirong; Li, Jie; Zhao, Yanyan


    With the development of human-computer interaction, lip-reading technology has become a focus of research in the multimodal technology field. However, detecting and locating the lips accurately is very difficult because of differences in lip contours between people, varied illumination conditions, head movements, and other factors. Building on existing lip detection and location methods, we propose a method based on lip color that extracts the lip contour from facial images using an adaptive chromatic filter. The filter is not sensitive to illumination; an appropriate chromatic lip filter is derived by analyzing the color of the entire face and the clustering statistics of lip color. The combined method proposed in this paper first preprocesses the face image, including correcting the rotation angle of the face and improving image contrast; it then analyzes the clustering characteristics of skin color and lip color in the lip region to obtain an adaptive chromatic filter that makes the lips stand out in the facial image. This method copes with varied illumination and inclined faces. Experiments showed that it improved the accuracy of detection and location, starting from a rough detection of the lip region. It lays a good foundation for subsequent lip feature extraction and lip tracking.
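
    The abstract above gives no formulas, so the following is only a minimal sketch of the general idea of an adaptive, color-based lip filter and should not be read as the authors' method. It assumes a pseudo-hue measure R / (R + G) as the chromatic feature and a threshold set from face-wide statistics (mean plus k standard deviations); the function names, the parameter k, and the toy pixel values are all hypothetical.

```python
from statistics import mean, pstdev

def pseudo_hue(pixel):
    """Red share R / (R + G) of an (R, G, B) pixel; 0.0 for pixels with no red or green."""
    r, g, _ = pixel
    return r / (r + g) if (r + g) > 0 else 0.0

def adaptive_lip_mask(face_pixels, k=1.0):
    """Mark pixels whose pseudo-hue exceeds mean + k * std computed over the whole face.

    Assumption: lips are redder than the surrounding skin, so they sit in the
    upper tail of the face-wide pseudo-hue distribution.
    """
    hues = [pseudo_hue(p) for p in face_pixels]
    threshold = mean(hues) + k * pstdev(hues)
    return [h > threshold for h in hues]

# Toy face (hypothetical values): six skin-like pixels and two redder, lip-like pixels.
skin = [(200, 150, 130)] * 6
lips = [(180, 60, 70)] * 2
mask = adaptive_lip_mask(skin + lips)
```

    Because the feature is a ratio, scaling R and G by the same factor leaves it unchanged, which is one simple way a chromatic filter can tolerate uniform brightness changes. Deriving the threshold from the face itself, rather than fixing it in advance, is what makes this sketch "adaptive" in the loose sense used here.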

  12. Techniques for Assessing Auditory Speech Perception and Lipreading Enhancement in Young Deaf Children.

    ERIC Educational Resources Information Center

    Geers, Ann


    This paper examines the special considerations involved in selecting a speech perception test battery for young deaf children. The auditory-only tests consisted of closed-set word identification tasks and minimal-pairs syllable tasks. Additional tests included identification of words in sentences, open-set word recognition, and evaluation of…

  13. Effects of Context Type on Lipreading and Listening Performance and Implications for Sentence Processing

    ERIC Educational Resources Information Center

    Spehar, Brent; Goebel, Stacey; Tye-Murray, Nancy


    Purpose: This study compared the use of 2 different types of contextual cues (sentence based and situation based) in 2 different modalities (visual only and auditory only). Method: Twenty young adults were tested with the Illustrated Sentence Test (Tye-Murray, Hale, Spehar, Myerson, & Sommers, 2014) and the Speech Perception in Noise Test…

  14. Impact of Audio-Visual Asynchrony on Lip-Reading Effects -Neuromagnetic and Psychophysical Study-

    PubMed Central

    Yahata, Izumi; Kanno, Akitake; Sakamoto, Shuichi; Takanashi, Yoshitaka; Takata, Shiho; Nakasato, Nobukazu; Kawashima, Ryuta; Katori, Yukio


    The effects of asynchrony between audio and visual (A/V) stimuli on the N100m responses of magnetoencephalography in the left hemisphere were compared with those on the psychophysical responses in 11 participants. The latency and amplitude of N100m were significantly shortened and reduced in the left hemisphere by the presentation of visual speech as long as the temporal asynchrony between A/V stimuli was within 100 ms, but were not significantly affected with audio lags of -500 and +500 ms. However, some small effects were still preserved on average with audio lags of 500 ms, suggesting similar asymmetry of the temporal window to that observed in psychophysical measurements, which tended to be more robust (wider) for audio lags; i.e., the pattern of visual-speech effects as a function of A/V lag observed in the N100m in the left hemisphere grossly resembled that in psychophysical measurements on average, although the individual responses were somewhat varied. The present results suggest that the basic configuration of the temporal window of visual effects on auditory-speech perception could be observed from the early auditory processing stage. PMID:28030631

  15. Effects of Context Type on Lipreading and Listening Performance and Implications for Sentence Processing

    PubMed Central

    Spehar, Brent; Goebel, Stacey; Tye-Murray, Nancy


    Purpose: This study compared the use of 2 different types of contextual cues (sentence based and situation based) in 2 different modalities (visual only and auditory only). Method: Twenty young adults were tested with the Illustrated Sentence Test (Tye-Murray, Hale, Spehar, Myerson, & Sommers, 2014) and the Speech Perception in Noise Test (Bilger, Nuetzel, Rabinowitz, & Rzeczkowski, 1984; Kalikow, Stevens, & Elliott, 1977) in the 2 modalities. The Illustrated Sentence Test presents sentences with no context and sentences accompanied by picture-based situational context cues. The Speech Perception in Noise Test presents sentences with low sentence-based context and sentences with high sentence-based context. Results: Participants benefited from both types of context and received more benefit when testing occurred in the visual-only modality than when it occurred in the auditory-only modality. Participants' use of sentence-based context did not correlate with use of situation-based context. Cue usage did not correlate between the 2 modalities. Conclusions: The ability to use contextual cues appears to be dependent on the type of cue and the presentation modality of the target word(s). In a theoretical sense, the results suggest that models of word recognition and sentence processing should incorporate the influence of multiple sources of information and recognize that the 2 types of context have different influences on speech perception. In a clinical sense, the results suggest that aural rehabilitation programs might provide training to optimize use of both kinds of contextual cues. PMID:25863923

  16. 77 FR 24554 - Culturally Significant Objects Imported for Exhibition; Determinations: “Quay Brothers: On...

    Federal Register 2010, 2011, 2012, 2013, 2014


    ... Pharmacist's Prescription for Lip-Reading Puppets" AGENCY: State Department. ACTION: Notice. SUMMARY: Notice... objects to be included in the exhibition "Quay Brothers: On Deciphering the Pharmacist's Prescription...

  17. Effects of a Wearable, Tactile Aid on Language Comprehension of Prelingual Profoundly Deaf Children.

    ERIC Educational Resources Information Center

    Proctor, Adele

    Factors influencing the use of nonacoustic aids (such as visual displays and tactile devices) with the hearing impaired are reviewed. The benefits of tactile devices in improving speech reading/lipreading and speech are pointed out. Tactile aids which provide information on rhythm, rate, intensity, and duration of speech increase lipreading and…

  18. Tracking Human Faces in Real-Time

    DTIC Science & Technology


    human-computer interactive applications such as lip-reading and gaze tracking. The principle in developing this system can be extended to other tracking problems such as tracking the human hand for gesture recognition.

  19. Development of a speech autocuer

    NASA Technical Reports Server (NTRS)

    Bedles, R. L.; Kizakvich, P. N.; Lawson, D. T.; Mccartney, M. L.


    A wearable, visually based prosthesis for the deaf based upon the proven method for removing lipreading ambiguity known as cued speech was fabricated and tested. Both software and hardware developments are described, including a microcomputer, display, and speech preprocessor.

  20. Perception of the auditory-visual illusion in speech perception by children with phonological disorders.


    Dodd, Barbara; McIntosh, Beth; Erdener, Dogu; Burnham, Denis


    An example of the auditory-visual illusion in speech perception, first described by McGurk and MacDonald, is the perception of [ta] when listeners hear [pa] in synchrony with the lip movements for [ka]. One account of the illusion is that lip-read and heard speech are combined in an articulatory code since people who mispronounce words respond differently from controls on lip-reading tasks. A same-different judgment task assessing perception of the illusion showed no difference in performance between controls and children with speech difficulties. Another experiment compared children with delayed and disordered speech on perception of the illusion. While neither group perceived many illusions, a significant interaction indicated that children with disordered phonology were strongly biased to the auditory component while the delayed group's response was more evenly split between the auditory and visual components of the illusion. These findings suggest that phonological processing, rather than articulation, supports lip-reading ability.

  1. Is automated conversion of video to text a reality?

    NASA Astrophysics Data System (ADS)

    Bowden, Richard; Cox, Stephen J.; Harvey, Richard W.; Lan, Yuxuan; Ong, Eng-Jon; Owen, Gari; Theobald, Barry-John


    A recent trend in law enforcement has been the use of forensic lip-readers. Criminal activities are often recorded on CCTV or other video gathering systems. Knowledge of what suspects are saying enriches the evidence gathered, but lip-readers, by their own admission, are fallible. Drawing on long-term studies of automated lip-reading, we are therefore investigating the possibilities and limitations of applying this technique under realistic conditions. We have adopted a step-by-step approach and are developing a capability for cases where prior video information is available for the suspect of interest. We use the terminology video-to-text (V2T) for this technique by analogy with speech-to-text (S2T), which also has applications in security and law enforcement.

  2. Perception of the Auditory-Visual Illusion in Speech Perception by Children with Phonological Disorders

    ERIC Educational Resources Information Center

    Dodd, Barbara; McIntosh, Beth; Erdener, Dogu; Burnham, Denis


    An example of the auditory-visual illusion in speech perception, first described by McGurk and MacDonald, is the perception of [ta] when listeners hear [pa] in synchrony with the lip movements for [ka]. One account of the illusion is that lip-read and heard speech are combined in an articulatory code since people who mispronounce words respond…

  3. Visual Cues and Listening Effort: Individual Variability

    ERIC Educational Resources Information Center

    Picou, Erin M.; Ricketts, Todd A.; Hornsby, Benjamin W. Y.


    Purpose: To investigate the effect of visual cues on listening effort as well as whether predictive variables such as working memory capacity (WMC) and lipreading ability affect the magnitude of listening effort. Method: Twenty participants with normal hearing were tested using a paired-associates recall task in 2 conditions (quiet and noise) and…

  4. Tones for Profoundly Deaf Tone-Language Speakers.

    ERIC Educational Resources Information Center

    Ching, Teresa

    A study assessed the practical use of the simplified speech pattern approach to teaching lipreading in a tone language by comparing performance using an acoustic hearing-aid and a Sivo-aid in a tone labelling task. After initial assessment, subjects were given training to enhance perception of lexically contrastive tones, then post-tested. The…

  5. "The Business of Life": Educating Catholic Deaf Children in Late Nineteenth-Century England

    ERIC Educational Resources Information Center

    Mangion, Carmen M.


    Much of the debate in late nineteenth-century Britain regarding the education of deaf children revolved around communication. For many Victorians, sign language was unacceptable; many proponents of oralism attempted to "normalise" the hearing impaired by replacing deaf methods of communication with spoken language and lipreading. While…

  6. Effects of English Cued Speech on Speech Perception, Phonological Awareness and Literacy: A Case Study of a 9-Year-Old Deaf Boy Using a Cochlear Implant

    ERIC Educational Resources Information Center

    Rees, Rachel; Bladel, Judith


    Many studies have shown that French Cued Speech (CS) can enhance lipreading and the development of phonological awareness and literacy in deaf children but, as yet, there is little evidence that these findings can be generalized to English CS. This study investigated the possible effects of English CS on the speech perception, phonological…





  8. Breaking the Sound Barrier.

    ERIC Educational Resources Information Center

    Garmon, Linda


    Reviews various methods of communication for hearing-impaired individuals, including American Sign Language (ASL) and a computer system which analyzes speech and flashes appropriate symbols onto a wearer's eyeglass lenses to aid in lipreading. Illustrates how an ASL sign can be changed to create a new word. (Author/JN)

  9. 32 CFR 57.3 - Definitions.

    Code of Federal Regulations, 2010 CFR


    ... language habilitation, auditory training, speech-reading (lip-reading), hearing evaluation, and speech... informed of all information about the activity for which consent is sought in the native language or in... language pathologists and audiologists, occupational therapists, physical therapists, psychologists,...

  10. "All Methods--and Wedded to None": The Deaf Education Methods Debate and Progressive Educational Reform in Toronto, Canada, 1922-1945

    ERIC Educational Resources Information Center

    Ellis, Jason A.


    This article is about the deaf education methods debate in the public schools of Toronto, Canada. The author demonstrates how pure oralism (lip-reading and speech instruction to the complete exclusion of sign language) and day school classes for deaf schoolchildren were introduced as a progressive school reform in 1922. Plans for further oralist…

  11. Speech imagery recalibrates speech-perception boundaries.


    Scott, Mark


    The perceptual boundaries between speech sounds are malleable and can shift after repeated exposure to contextual information. This shift is known as recalibration. To date, the known inducers of recalibration are lexical (including phonotactic) information, lip-read information and reading. The experiments reported here are a proof-of-effect demonstration that speech imagery can also induce recalibration.

  12. Deafness and Interpreting.

    ERIC Educational Resources Information Center

    New Jersey State Dept. of Labor, Trenton. Div. of the Deaf.

    This paper explains how the hearing loss of deaf persons affects communication, describes methods deaf individuals use to communicate, and addresses the role of interpreters in the communication process. The volume covers: communication methods such as speechreading or lipreading, written notes, gestures, or sign language (American Sign Language,…

  13. Bimodal bilingualism as multisensory training?: Evidence for improved audiovisual speech perception after sign language exposure.


    Williams, Joshua T; Darcy, Isabelle; Newman, Sharlene D


    The aim of the present study was to characterize effects of learning a sign language on the processing of a spoken language. Specifically, audiovisual phoneme comprehension was assessed before and after 13 weeks of sign language exposure. L2 ASL learners performed this task in the fMRI scanner. Results indicated that L2 American Sign Language (ASL) learners' behavioral classification of the speech sounds improved with time compared to hearing nonsigners. Results indicated increased activation in the supramarginal gyrus (SMG) after sign language exposure, which suggests concomitant increased phonological processing of speech. A multiple regression analysis indicated that learner's rating on co-sign speech use and lipreading ability was correlated with SMG activation. This pattern of results indicates that the increased use of mouthing and possibly lipreading during sign language acquisition may concurrently improve audiovisual speech processing in budding hearing bimodal bilinguals.

  14. Audio-visual speech in noise perception in dyslexia.


    van Laarhoven, Thijs; Keetels, Mirjam; Schakel, Lemmy; Vroomen, Jean


    Individuals with developmental dyslexia (DD) may experience, besides reading problems, other speech-related processing deficits. Here, we examined the influence of visual articulatory information (lip-read speech) at various levels of background noise on auditory word recognition in children and adults with DD. We found that children with a documented history of DD have deficits in their ability to gain benefit from lip-read information that disambiguates noise-masked speech. We show with another group of adult individuals with DD that these deficits persist into adulthood. These deficits could not be attributed to impairments in unisensory auditory word recognition. Rather, the results indicate a specific deficit in audio-visual speech processing and suggest that impaired multisensory integration might be an important aspect of DD.

  15. Two cortical mechanisms support the integration of visual and auditory speech: a hypothesis and preliminary data.


    Okada, Kayoko; Hickok, Gregory


    Visual speech (lip-reading) influences the perception of heard speech. The literature suggests at least two possible mechanisms for this influence: "direct" sensory-sensory interaction, whereby sensory signals from auditory and visual modalities are integrated directly, likely in the superior temporal sulcus, and "indirect" sensory-motor interaction, whereby visual speech is first mapped onto motor-speech representations in the frontal lobe, which in turn influences sensory perception via sensory-motor integration networks. We hypothesize that both mechanisms exist, and further that previous demonstrations of lip-reading functional activations in Broca's region and the posterior planum temporale reflect the sensory-motor mechanism. We tested one prediction of this hypothesis using fMRI. We assessed whether viewing visual speech (contrasted with facial gestures) activates the same network as a speech sensory-motor integration task (listen to and then silently rehearse speech). Both tasks activated locations within Broca's area, dorsal premotor cortex, and the posterior planum temporale (Spt), and focal regions of the STS, all of which have previously been implicated in sensory-motor integration for speech. This finding is consistent with the view that visual speech influences heard speech via sensory-motor networks. Lip-reading also activated a much wider network in the superior temporal lobe than the sensory-motor task, possibly reflecting a more direct cross-sensory integration network.

  16. Visual abilities are important for auditory-only speech recognition: evidence from autism spectrum disorder.


    Schelinski, Stefanie; Riedel, Philipp; von Kriegstein, Katharina


    In auditory-only conditions, for example when we listen to someone on the phone, it is essential to quickly and accurately recognize what is said (speech recognition). Previous studies have shown that speech recognition performance in auditory-only conditions is better if the speaker is known not only by voice, but also by face. Here, we tested the hypothesis that such an improvement in auditory-only speech recognition depends on the ability to lip-read. To test this we recruited a group of adults with autism spectrum disorder (ASD), a condition associated with difficulties in lip-reading, and typically developed controls. All participants were trained to identify six speakers by name and voice. Three speakers were learned by a video showing their face and three others were learned in a matched control condition without face. After training, participants performed an auditory-only speech recognition test that consisted of sentences spoken by the trained speakers. As a control condition, the test also included speaker identity recognition on the same auditory material. The results showed that, in the control group, performance in speech recognition was improved for speakers known by face in comparison to speakers learned in the matched control condition without face. The ASD group lacked such a performance benefit. For the ASD group auditory-only speech recognition was even worse for speakers known by face compared to speakers not known by face. In speaker identity recognition, the ASD group performed worse than the control group independent of whether the speakers were learned with or without face. Two additional visual experiments showed that the ASD group performed worse in lip-reading whereas face identity recognition was within the normal range. The findings support the view that auditory-only communication involves specific visual mechanisms. Further, they indicate that in ASD, speaker-specific dynamic visual information is not available to optimize auditory…

  17. Phonetic recalibration does not depend on working memory

    PubMed Central

    Baart, Martijn


    Listeners use lipread information to adjust the phonetic boundary between two speech categories (phonetic recalibration, Bertelson et al. 2003). Here, we examined phonetic recalibration while listeners were engaged in a visuospatial or verbal working memory task under different memory load conditions. Phonetic recalibration was, like selective speech adaptation, not affected by a concurrent verbal or visuospatial memory task. This result indicates that phonetic recalibration is a low-level process that does not critically depend on processes used in verbal or visuospatial working memory. PMID:20437168

  18. Video analysis using spatiotemporal descriptor and kernel extreme learning machine for lip reading

    NASA Astrophysics Data System (ADS)

    Lu, Longbin; Zhang, Xinman; Xu, Xuebin; Shang, Dongpeng


    Lip-reading techniques have shown bright prospects for speech recognition under noisy environments and for hearing-impaired listeners. We aim to solve two important issues regarding lip reading: (1) how to extract discriminative lip motion features and (2) how to establish a classifier that can provide promising recognition accuracy for lip reading. For the first issue, a projection local spatiotemporal descriptor, which considers the lip appearance and motion information at the same time, is utilized to provide an efficient representation of a video sequence. For the second issue, a kernel extreme learning machine (KELM) based on the single-hidden-layer feedforward neural network is presented to distinguish all kinds of utterances. In general, this method has fast learning speed and great robustness to nonlinear data. Furthermore, quantum-behaved particle swarm optimization with binary encoding is introduced to select the appropriate feature subset and parameters for KELM training. Experiments conducted on the AVLetters and OuluVS databases show that the proposed lip-reading method achieves a superior recognition accuracy compared with two previous methods.
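
    As a rough illustration of the classifier named in the abstract above, the sketch below implements the standard closed-form kernel extreme learning machine, in which the output weights are beta = (K + I/C)^-1 T for a kernel matrix K and one-hot targets T. It is not the authors' pipeline: the spatiotemporal descriptor and the particle swarm feature selection are omitted, the RBF kernel is an assumption, and the function names, the hyperparameters C and gamma, and the toy data are hypothetical.

```python
import math

def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]  # augmented matrix [A | B]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]

def kelm_train(X, labels, n_classes, C=100.0, gamma=1.0):
    """Return output weights beta = (K + I/C)^-1 T for one-hot targets T."""
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) + (1.0 / C if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    T = [[1.0 if labels[i] == c else 0.0 for c in range(n_classes)]
         for i in range(n)]
    return solve(K, T)

def kelm_predict(x, X, beta, gamma=1.0):
    """Predict the class with the largest output score for sample x."""
    k = [rbf(x, xi, gamma) for xi in X]
    scores = [sum(k[i] * beta[i][c] for i in range(len(X)))
              for c in range(len(beta[0]))]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy data: two well-separated clusters standing in for lip-motion feature vectors.
X = [[0.0, 0.0], [0.1, 0.2], [3.0, 3.0], [3.2, 2.9]]
y = [0, 0, 1, 1]
beta = kelm_train(X, y, n_classes=2)
```

    Training reduces to a single linear solve with no iterative weight updates, which is consistent with the fast learning speed the abstract attributes to this family of models. A new sample is classified with kelm_predict(sample, X, beta), which needs the training set because the kernel is evaluated against every training vector.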

  19. Phi-square Lexical Competition Database (Phi-Lex): an online tool for quantifying auditory and visual lexical competition.


    Strand, Julia F


    A widely agreed-upon feature of spoken word recognition is that multiple lexical candidates in memory are simultaneously activated in parallel when a listener hears a word, and that those candidates compete for recognition (Luce, Goldinger, Auer, & Vitevitch, Perception 62:615-625, 2000; Luce & Pisoni, Ear and Hearing 19:1-36, 1998; McClelland & Elman, Cognitive Psychology 18:1-86, 1986). Because the presence of those competitors influences word recognition, much research has sought to quantify the processes of lexical competition. Metrics that quantify lexical competition continuously are more effective predictors of auditory and visual (lipread) spoken word recognition than are the categorical metrics traditionally used (Feld & Sommers, Speech Communication 53:220-228, 2011; Strand & Sommers, Journal of the Acoustical Society of America 130:1663-1672, 2011). A limitation of the continuous metrics is that they are somewhat computationally cumbersome and require access to existing speech databases. This article describes the Phi-square Lexical Competition Database (Phi-Lex): an online, searchable database that provides access to multiple metrics of auditory and visual (lipread) lexical competition for English words, available at .

  20. The role of visual speech cues in reducing energetic and informational masking

    NASA Astrophysics Data System (ADS)

    Helfer, Karen S.; Freyman, Richard L.


    Two experiments compared the effect of supplying visual speech information (e.g., lipreading cues) on the ability to hear one female talker's voice in the presence of steady-state noise or a masking complex consisting of two other female voices. In the first experiment intelligibility of sentences was measured in the presence of the two types of maskers with and without perceived spatial separation of target and masker. The second study tested detection of sentences in the same experimental conditions. Results showed that visual cues provided more benefit for both recognition and detection of speech when the masker consisted of other voices (versus steady-state noise). Moreover, visual cues provided greater benefit when the target speech and masker were spatially coincident versus when they appeared to arise from different spatial locations. The data obtained here are consistent with the hypothesis that lipreading cues help to segregate a target voice from competing voices, in addition to the established benefit of supplementing masked phonetic information.

  1. Cochlear implants for congenitally deaf adolescents: is open-set speech perception a realistic expectation?


    Sarant, J Z; Cowan, R S; Blamey, P J; Galvin, K L; Clark, G M


    The prognosis for benefit from use of cochlear implants in congenitally deaf adolescents, who have a long duration of profound deafness prior to implantation, has typically been low. Speech perception results for two congenitally deaf patients implanted as adolescents at the University of Melbourne/Royal Victorian Eye and Ear Hospital Clinic show that, after 12 months of experience, both patients had significant open-set speech discrimination scores without lipreading. These results suggest that although benefits may in general be low for congenitally deaf adolescents, individuals may attain significant benefits to speech perception after a short period of experience. Prospective patients from this group should therefore be considered on an individual basis with regard to prognosis for benefit from cochlear implantation.

  2. Deafness

    PubMed Central

    Weston, T E T


    Dr T E T Weston describes his research into the effect of noise on hearing acuity and of deafness in the aged. He found that presbyacusis is associated with a multiplicity of factors, e.g. smoking, circulatory disturbance, urban domicile, heredity and occupational acoustic trauma. Miss W Galbraith describes the social implications of various degrees of deafness and the ways in which they can be overcome by such measures as lipreading, hearing aids and rehabilitation. Sir Terence Cawthorne discusses otosclerosis, nearly 1% of the population being affected by this type of deafness. He describes the modern operation of insertion of an artificial piston through the stapes and states that 90% of cases submitted to this operation will show immediate improvement, whilst 85% should still have retained this improvement at the end of two years. PMID:14341856

  3. Speech Analysis Based On Image Information from Lip Movement

    NASA Astrophysics Data System (ADS)

    Talha, Kamil S.; Wan, Khairunizam; Za'ba, S. K.; Mohamad Razlan, Zuradzman; B, Shahriman A.


    Deaf and hard of hearing people often have difficulty understanding and lip-reading other people. They often feel left out of conversation and are sometimes actually ignored by others. There are a variety of ways in which a hearing-impaired person can communicate and gain access to information. Communication support includes both technical and human aids. Human aids include interpreters, lip-readers and note-takers. Interpreters translate the Sign Language and must therefore be qualified. In this paper, a vision system is used to track movements of the lips. In the experiment, the proposed system successfully differentiated 11 types of phonemes and then classified them into their respective viseme groups. Using the proposed system, hearing-impaired persons could practise pronunciation by themselves without support from an instructor.

  4. Adaptation of neuromagnetic N1 responses to phonetic stimuli by visual speech in humans.


    Jääskeläinen, Iiro P; Ojanen, Ville; Ahveninen, Jyrki; Auranen, Toni; Levänen, Sari; Möttönen, Riikka; Tarnanen, Iina; Sams, Mikko


    The technique of 306-channel magnetoencephalography (MEG) was used in eight healthy volunteers to test whether silent lip-reading modulates auditory-cortex processing of phonetic sounds. Auditory test stimuli (either Finnish vowel /ae/ or /ø/) were preceded, at a 500 ms lag, by either another auditory stimulus (/ae/, /ø/ or the second-formant midpoint between /ae/ and /ø/) or a silent movie of a person articulating /ae/ or /ø/. Compared with N1 responses to auditory /ae/ and /ø/ when presented without a preceding stimulus, the amplitudes of left-hemisphere N1 responses to the test stimuli were significantly suppressed when preceded by either auditory or visual stimuli, this effect being significantly stronger with preceding auditory stimuli. This suggests that seeing articulatory gestures of a speaker influences auditory speech perception by modulating the responsiveness of auditory-cortex neurons.

  5. Cross-cultural adaptation and validation of the Nijmegen Cochlear Implant Questionnaire into Italian.


    Ottaviani, F; Iacona, E; Sykopetrites, V; Schindler, A; Mozzanica, F


    The NCIQ is a quantifiable self-assessment health-related quality of life instrument specific for cochlear implant users. The aim of this study was to culturally adapt the NCIQ into Italian (I-NCIQ). A prospective instrument validation study was conducted. Cross-cultural adaptation and validation were accomplished. Cronbach α was used to test internal consistency in 51 CI users and in a control group composed of 38 post-lingually deaf adults on a waiting list for a CI. The ICC was used for test-retest reliability analysis. The Kruskal-Wallis test with Mann-Whitney post hoc tests was used to compare the I-NCIQ scores in CI users before and after cochlear implantation and in control patients. I-NCIQ scores obtained in CI users were compared with the results of the Italian version of disyllabic testing without lip-reading and without masking. Good internal consistency and good test-retest reliability were found. I-NCIQ scores obtained in the 51 CI users after implantation were consistently higher than those obtained before implantation and in the control group. Moreover, no differences were found between the I-NCIQ results obtained in the group of 51 CI users before implantation and those in the group of control patients on post hoc Mann-Whitney analysis. Positive correlations between I-NCIQ scores and the results of disyllabic testing without lip-reading and without masking were found. The I-NCIQ is a reliable, valid, self-administered questionnaire for the measurement of QOL in CI users; its application is recommended.
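
    For readers unfamiliar with the internal-consistency statistic named in this abstract, the following is a minimal sketch of how Cronbach's α is conventionally computed: α = k/(k-1) · (1 - Σ item variances / variance of total scores), for k questionnaire items. The function name and the response data are made up for illustration and have nothing to do with the I-NCIQ data set.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: 5 respondents x 3 items on a 1-5 scale.
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
    [3, 3, 2],
    [1, 2, 1],
]
alpha_value = cronbach_alpha(responses)
```

    Values closer to 1 indicate that the items vary together across respondents; when every item gives identical scores for each respondent, the formula yields exactly 1.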

  6. PERVALE-S: a new cognitive task to assess deaf people’s ability to perceive basic and social emotions

    PubMed Central

    Mestre, José M.; Larrán, Cristina; Herrero, Joaquín; Guil, Rocío; de la Torre, Gabriel G.


    A poorly understood aspect of deaf people (DP) is how their emotional information is processed. Verbal ability is key to improve emotional knowledge in people. Nevertheless, DP are unable to distinguish intonation, intensity, and the rhythm of language due to lack of hearing. Some DP have acquired both lip-reading abilities and sign language, but others have developed only sign language. PERVALE-S was developed to assess the ability of DP to perceive both social and basic emotions. PERVALE-S presents different sets of visual images of a real deaf person expressing both basic and social emotions, according to the normative standard of emotional expressions in Spanish Sign Language. Emotional expression stimuli were presented at two different levels of intensity (1: low; and 2: high) because DP do not distinguish an object in the same way as hearing people (HP) do. Then, participants had to click on the more suitable emotional expression. PERVALE-S contains video instructions (given by a sign language interpreter) to improve DP’s understanding about how to use the software. DP had to watch the videos before answering the items. To test PERVALE-S, a sample of 56 individuals was recruited (18 signers, 8 lip-readers, and 30 HP). Participants also performed a personality test (High School Personality Questionnaire adapted) and a fluid intelligence (Gf) measure (RAPM). Moreover, all deaf participants were rated by four teachers for the deaf. Results: there were no significant differences between deaf and HP in performance in PERVALE-S. Confusion matrices revealed that embarrassment, envy, and jealousy were worse perceived. Age was just related to social-emotional tasks (but not in basic emotional tasks). Emotional perception ability was related mainly to warmth and consciousness, but negatively related to tension. Meanwhile, Gf was related to only social-emotional tasks. There were no gender differences. PMID:26300828

  7. Audibility and visual biasing in speech perception

    NASA Astrophysics Data System (ADS)

    Clement, Bart Richard

    Although speech perception has been considered a predominantly auditory phenomenon, large benefits from vision in degraded acoustic conditions suggest integration of audition and vision. More direct evidence of this comes from studies of audiovisual disparity that demonstrate vision can bias and even dominate perception (McGurk & MacDonald, 1976). It has been observed that hearing-impaired listeners demonstrate more visual biasing than normally hearing listeners (Walden et al., 1990). It is argued here that stimulus audibility must be equated across groups before true differences can be established. In the present investigation, effects of visual biasing on perception were examined as audibility was degraded for 12 young normally hearing listeners. Biasing was determined by quantifying the degree to which listener identification functions for a single synthetic auditory /ba-da-ga/ continuum changed across two conditions: (1) an auditory-only listening condition; and (2) an auditory-visual condition in which every item of the continuum was synchronized with visual articulations of the consonant-vowel (CV) tokens /ba/ and /ga/, as spoken by each of two talkers. Audibility was altered by presenting the conditions in quiet and in noise at each of three signal-to-noise (S/N) ratios. For the visual-/ba/ context, large effects of audibility were found. As audibility decreased, visual biasing increased. A large talker effect also was found, with one talker eliciting more biasing than the other. An independent lipreading measure demonstrated that this talker was more visually intelligible than the other. For the visual-/ga/ context, audibility and talker effects were less robust, possibly obscured by strong listener effects, which were characterized by marked differences in perceptual processing patterns among participants. Some demonstrated substantial biasing whereas others demonstrated little, indicating a strong reliance on audition even in severely degraded acoustic

  8. Language access and theory of mind reasoning: evidence from deaf children in bilingual and oralist environments.


    Meristo, Marek; Falkman, Kerstin W; Hjelmquist, Erland; Tedoldi, Mariantonia; Surian, Luca; Siegal, Michael


    This investigation examined whether access to sign language as a medium for instruction influences theory of mind (ToM) reasoning in deaf children with similar home language environments. Experiment 1 involved 97 deaf Italian children ages 4-12 years: 56 were from deaf families and had LIS (Italian Sign Language) as their native language, and 41 had acquired LIS as late signers following contact with signers outside their hearing families. Children receiving bimodal/bilingual instruction in LIS together with Sign-Supported and spoken Italian significantly outperformed children in oralist schools in which communication was in Italian and often relied on lipreading. Experiment 2 involved 61 deaf children in Estonia and Sweden ages 6-16 years. On a wide variety of ToM tasks, bilingually instructed native signers in Estonian Sign Language and spoken Estonian succeeded at a level similar to age-matched hearing children. They outperformed bilingually instructed late signers and native signers attending oralist schools. Particularly for native signers, access to sign language in a bilingual environment may facilitate conversational exchanges that promote the expression of ToM by enabling children to monitor others' mental states effectively.

  9. Hearing Loss: Communicating With the Patient Who Is Deaf or Hard of Hearing.


    McKee, Michael M; Moreland, Christopher; Atcherson, Samuel R; Zazove, Philip


    Hearing loss impairs health care communication and adversely affects patient satisfaction, treatment adherence, and use of health services. Hearing loss is the third most common chronic health condition among older patients after hypertension and arthritis, but only 15% to 18% of older adults are screened for hearing loss during health maintenance examinations. Patients with hearing loss may be reluctant to disclose it because of fear of ageism, perceptions of disability, and vanity. Lipreading and note writing often are ineffective ways to communicate with deaf and hard of hearing (DHH) patients who use American Sign Language; use of medical sign language interpreters is preferred. A variety of strategies can improve the quality of health care communication for DHH patients, such as the physician facing the patient, listening attentively, and using visual tools. Physicians should learn what hearing loss means to the DHH patient. Deaf American Sign Language users may not perceive hearing loss as a disability but as a cultural identity. Patients' preferred communication strategies will vary. Relay services, electronic communication, and other telecommunications methods can be helpful, but family physicians and medical staff should learn from each DHH patient about which communication strategies will work best.

  10. Preparing for communication interactions: the value of anticipatory strategies for adults with hearing impairment.


    Tye-Murray, N


    Some people with hearing impairment may use anticipatory strategies to prepare for an upcoming communication interaction, such as a doctor's appointment. They may consider vocabulary and statements that might occur, and they may practice speechreading a partner saying the items. Experiment 1 evaluated the effectiveness of two types of anticipatory strategies: workbook activities and situation-specific lipreading practice. Two groups of normal-hearing subjects were asked to prepare for a communication interaction in a bank setting where they would be required to recognize speech using only the visual signal. Each group was assigned to one type of anticipatory strategy. A third group served as a control group. Experiment 2 evaluated whether multifaceted anticipatory practice improved cochlear implant users' ability to recognize statements and words audiovisually that might occur in a doctor's office, bank, movie theater, and gas station. One group of implanted subjects received 4 days of training, 1 day for each setting, and a second group served as a control group. In both experiments, subjects who used anticipatory strategies did not improve their performance on situation-specific sentence tests more than the control subjects.

  11. Auditory Midbrain Implant: A Review

    PubMed Central

    Lim, Hubert H.; Lenarz, Minoo; Lenarz, Thomas


    The auditory midbrain implant (AMI) is a new hearing prosthesis designed for stimulation of the inferior colliculus in deaf patients who cannot sufficiently benefit from cochlear implants. The authors have begun clinical trials in which five patients have been implanted with a single shank AMI array (20 electrodes). The goal of this review is to summarize the development and research that has led to the translation of the AMI from a concept into the first patients. This study presents the rationale and design concept for the AMI as well as a summary of the animal safety and feasibility studies that were required for clinical approval. The authors also present the initial surgical, psychophysical, and speech results from the first three implanted patients. Overall, the results have been encouraging in terms of the safety and functionality of the implant. All patients obtain improvements in hearing capabilities on a daily basis. However, performance varies dramatically across patients depending on the implant location within the midbrain with the best performer still not able to achieve open set speech perception without lip-reading cues. Stimulation of the auditory midbrain provides a wide range of level, spectral, and temporal cues, all of which are important for speech understanding, but they do not appear to sufficiently fuse together to enable open set speech perception with the currently used stimulation strategies. Finally, several issues and hypotheses for why current patients obtain limited speech perception along with several feasible solutions for improving AMI implementation are presented. PMID:19762428

  12. Prototype to product—developing a commercially viable neural prosthesis

    NASA Astrophysics Data System (ADS)

    Seligman, Peter


    The Cochlear implant or 'Bionic ear' is a device that enables people who do not get sufficient benefit from a hearing aid to communicate with the hearing world. The Cochlear implant is not an amplifier, but a device that electrically stimulates the auditory nerve in a way that crudely mimics normal hearing, thus providing a hearing percept. Many recipients are able to understand running speech without the help of lipreading. Cochlear implants have reached a stage of maturity where there are now 170 000 recipients implanted worldwide. The commercial development of these devices has occurred over the last 30 years. This development has been multidisciplinary, including audiologists, engineers, both mechanical and electrical, histologists, materials scientists, physiologists, surgeons and speech pathologists. This paper will trace the development of the device we have today, from the engineering perspective. The special challenges of designing an active device that will work in the human body for a lifetime will be outlined. These challenges include biocompatibility, extreme reliability, safety, patient fitting and surgical issues. It is emphasized that the successful development of a neural prosthesis requires the partnership of academia and industry.

  13. Effects of phonetic context on audio-visual intelligibility of French.


    Benoît, C; Mohamadi, T; Kandel, S


    Bimodal perception leads to better speech understanding than auditory perception alone. We evaluated the overall benefit of lip-reading on natural utterances of French produced by a single speaker. Eighteen French subjects with good audition and vision were administered a closed set identification test of VCVCV nonsense words consisting of three vowels [i, a, y] and six consonants [b, v, z, 3, R, l]. Stimuli were presented under both auditory and audio-visual conditions with white noise added at various signal-to-noise ratios. Identification scores were higher in the bimodal condition than in the auditory-alone condition, especially in situations where acoustic information was reduced. The auditory and audio-visual intelligibility of the three vowels [i, a, y] averaged over the six consonantal contexts was evaluated as well. Two different hierarchies of intelligibility were found. Auditorily, [a] was most intelligible, followed by [i] and then by [y]; whereas visually [y] was most intelligible, followed by [a] and [i]. We also quantified the contextual effects of the three vowels on the auditory and audio-visual intelligibility of the consonants. Both the auditory and the audio-visual intelligibility of surrounding consonants was highest in the [a] context, followed by the [i] context and lastly the [y] context.

  14. Electrically evoked hearing perception by functional neurostimulation of the central auditory system.


    Tatagiba, M; Gharabaghi, A


    Perceptional benefits and potential risks of electrical stimulation of the central auditory system are constantly changing due to ongoing developments and technical modifications. Therefore, we would like to introduce current treatment protocols and strategies that might have an impact on functional results of auditory brainstem implants (ABI) in profoundly deaf patients. Patients with bilateral tumours as a result of neurofibromatosis type 2 with complete dysfunction of the eighth cranial nerves are the most frequent candidates for auditory brainstem implants. Worldwide, about 300 patients have already received an ABI through a translabyrinthine or suboccipital approach supported by multimodality electrophysiological monitoring. Patient selection is based on disease course, clinical signs, audiological, radiological and psycho-social criteria. The ABI provides the patients with access to auditory information such as environmental sound awareness together with distinct hearing cues in speech. In addition, this device markedly improves speech reception in combination with lip-reading. Nonetheless, there is only limited open-set speech understanding. Results of hearing function are correlated with electrode design, number of activated electrodes, speech processing strategies, duration of pre-existing deafness and extent of brainstem deformation. Functional neurostimulation of the central auditory system by a brainstem implant is a safe and beneficial procedure, which may considerably improve the quality of life in patients suffering from deafness due to bilateral retrocochlear lesions. The auditory outcome may be improved by a new generation of microelectrodes capable of penetrating the surface of the brainstem to access more directly the auditory neurons.

  15. Auditory Midbrain Implant: Research and Development Towards a Second Clinical Trial

    PubMed Central

    Lim, Hubert H.; Lenarz, Thomas


    The cochlear implant is considered one of the most successful neural prostheses to date, which was made possible by visionaries who continued to develop the cochlear implant through multiple technological and clinical challenges. However, patients without a functional auditory nerve or implantable cochlea cannot benefit from a cochlear implant. The focus of the paper is to review the development and translation of a new type of central auditory prosthesis for this group of patients, which is known as the auditory midbrain implant (AMI) and is designed for electrical stimulation within the inferior colliculus. The rationale and results for the first AMI clinical study using a multi-site single-shank array will be presented initially. Although the AMI has achieved encouraging results in terms of safety and improvements in lip-reading capabilities and environmental awareness, it has not yet provided sufficient speech perception. Animal and human data will then be presented to show that a two-shank AMI array can potentially improve hearing performance by targeting specific neurons of the inferior colliculus. Modifications to the AMI array design, stimulation strategy, and surgical approach have been made that are expected to improve hearing performance in the patients implanted with a two-shank array in an upcoming clinical trial funded by the National Institutes of Health. Positive outcomes from this clinical trial will motivate new efforts and developments toward improving central auditory prostheses for those who cannot sufficiently benefit from cochlear implants. PMID:25613994

  16. Silent speechreading in the absence of scanner noise: an event-related fMRI study.


    MacSweeney, M; Amaro, E; Calvert, G A; Campbell, R; David, A S; McGuire, P; Williams, S C; Woll, B; Brammer, M J


    In a previous study we used functional magnetic resonance imaging (fMRI) to demonstrate activation in auditory cortex during silent speechreading. Since image acquisition during fMRI generates acoustic noise, this pattern of activation could have reflected an interaction between background scanner noise and the visual lip-read stimuli. In this study we employed an event-related fMRI design which allowed us to measure activation during speechreading in the absence of acoustic scanner noise. In the experimental condition, hearing subjects were required to speechread random numbers from a silent speaker. In the control condition subjects watched a static image of the same speaker with mouth closed and were required to subvocally count an intermittent visual cue. A single volume of images was collected to coincide with the estimated peak of the blood oxygen level dependent (BOLD) response to these stimuli across multiple baseline and experimental trials. Silent speechreading led to greater activation in lateral temporal cortex relative to the control condition. This indicates that activation of auditory areas during silent speechreading is not a function of acoustic scanner noise and confirms that silent speechreading engages similar regions of auditory cortex as listening to speech.

  17. Automatic lip reading by using multimodal visual features

    NASA Astrophysics Data System (ADS)

    Takahashi, Shohei; Ohya, Jun


    Speech recognition has long been researched, but it does not work well in noisy places such as in a car or on a train. In addition, people with hearing impairments or hearing difficulties cannot benefit from speech recognition. To recognize speech automatically, visual information is also important. People understand speech not only from audio information but also from visual information such as temporal changes in the lip shape. A vision-based speech recognition method could work well in noisy places and could also be useful for people with hearing disabilities. In this paper, we propose an automatic lip-reading method that recognizes speech using multimodal visual information, without using any audio information as conventional speech recognition does. First, the ASM (Active Shape Model) is used to detect and track the face and lips in a video sequence. Second, the shape, optical flow and spatial frequencies of the lip features are extracted from the lips detected by the ASM. Next, the extracted multimodal features are ordered chronologically so that a Support Vector Machine can learn and classify the spoken words. Experiments on classifying several words show promising results for the proposed method.

  18. Brain networks engaged in audiovisual integration during speech perception revealed by persistent homology-based network filtration.


    Kim, Heejung; Hahm, Jarang; Lee, Hyekyoung; Kang, Eunjoo; Kang, Hyejin; Lee, Dong Soo


    The human brain naturally integrates audiovisual information to improve speech perception. However, in noisy environments, understanding speech is difficult and may require much effort. Although the brain network is supposed to be engaged in speech perception, it is unclear how speech-related brain regions are connected during natural bimodal audiovisual or unimodal speech perception with counterpart irrelevant noise. To investigate the topological changes of speech-related brain networks at all possible thresholds, we used a persistent homological framework through hierarchical clustering, such as single linkage distance, to analyze the connected component of the functional network during speech perception using functional magnetic resonance imaging. For speech perception, bimodal (audio-visual speech cue) or unimodal speech cues with counterpart irrelevant noise (auditory white-noise or visual gum-chewing) were delivered to 15 subjects. In terms of positive relationship, similar connected components were observed in bimodal and unimodal speech conditions during filtration. However, during speech perception by congruent audiovisual stimuli, the tighter couplings of left anterior temporal gyrus-anterior insula component and right premotor-visual components were observed than auditory or visual speech cue conditions, respectively. Interestingly, visual speech is perceived under white noise by tight negative coupling in the left inferior frontal region-right anterior cingulate, left anterior insula, and bilateral visual regions, including right middle temporal gyrus, right fusiform components. In conclusion, the speech brain network is tightly positively or negatively connected, and can reflect efficient or effortful processes during natural audiovisual integration or lip-reading, respectively, in speech perception.

  19. Effects of Visual Speech on Early Auditory Evoked Fields - From the Viewpoint of Individual Variance.


    Yahata, Izumi; Kawase, Tetsuaki; Kanno, Akitake; Hidaka, Hiroshi; Sakamoto, Shuichi; Nakasato, Nobukazu; Kawashima, Ryuta; Katori, Yukio


    The effects of visual speech (the moving image of the speaker's face uttering speech sound) on early auditory evoked fields (AEFs) were examined using a helmet-shaped magnetoencephalography system in 12 healthy volunteers (9 males, mean age 35.5 years). AEFs (N100m) in response to the monosyllabic sound /be/ were recorded and analyzed under three different visual stimulus conditions, the moving image of the same speaker's face uttering /be/ (congruent visual stimuli) or uttering /ge/ (incongruent visual stimuli), and visual noise (still image processed from speaker's face using a strong Gaussian filter: control condition). On average, latency of N100m was significantly shortened in the bilateral hemispheres for both congruent and incongruent auditory/visual (A/V) stimuli, compared to the control A/V condition. However, the degree of N100m shortening was not significantly different between the congruent and incongruent A/V conditions, despite the significant differences in psychophysical responses between these two A/V conditions. Moreover, analysis of the magnitudes of these visual effects on AEFs in individuals showed that the lip-reading effects on AEFs tended to be well correlated between the two different audio-visual conditions (congruent vs. incongruent visual stimuli) in the bilateral hemispheres but were not significantly correlated between right and left hemisphere. On the other hand, no significant correlation was observed between the magnitudes of visual speech effects and psychophysical responses. These results may indicate that the auditory-visual interaction observed on the N100m is a fundamental process which does not depend on the congruency of the visual information.

  20. Activation and Functional Connectivity of the Left Inferior Temporal Gyrus during Visual Speech Priming in Healthy Listeners and Listeners with Schizophrenia

    PubMed Central

    Wu, Chao; Zheng, Yingjun; Li, Juanhua; Zhang, Bei; Li, Ruikeng; Wu, Haibo; She, Shenglin; Liu, Sha; Peng, Hongjun; Ning, Yuping; Li, Liang


    Under a “cocktail-party” listening condition with multiple people talking, compared to healthy people, people with schizophrenia benefit less from the use of visual-speech (lipreading) priming (VSP) cues to improve speech recognition. The neural mechanisms underlying the unmasking effect of VSP remain unknown. This study investigated the brain substrates underlying the unmasking effect of VSP in healthy listeners and the schizophrenia-induced changes in the brain substrates. Using functional magnetic resonance imaging, brain activation and functional connectivity for the contrasts of the VSP listening condition vs. the visual non-speech priming (VNSP) condition were examined in 16 healthy listeners (27.4 ± 8.6 years old, 9 females and 7 males) and 22 listeners with schizophrenia (29.0 ± 8.1 years old, 8 females and 14 males). The results showed that in healthy listeners, but not listeners with schizophrenia, the VSP-induced activation (against the VNSP condition) of the left posterior inferior temporal gyrus (pITG) was significantly correlated with the VSP-induced improvement in target-speech recognition against speech masking. Compared to healthy listeners, listeners with schizophrenia showed significantly lower VSP-induced activation of the left pITG and reduced functional connectivity of the left pITG with the bilateral Rolandic operculum, bilateral STG, and left insula. Thus, the left pITG and its functional connectivity may be the brain substrates related to the unmasking effect of VSP, presumably through enhancing both the processing of target visual-speech signals and the inhibition of masking-speech signals. In people with schizophrenia, the reduced unmasking effect of VSP on speech recognition may be associated with a schizophrenia-related reduction of VSP-induced activation and functional connectivity of the left pITG. PMID:28360829

  1. Effects of Visual Speech on Early Auditory Evoked Fields - From the Viewpoint of Individual Variance

    PubMed Central

    Yahata, Izumi; Kanno, Akitake; Hidaka, Hiroshi; Sakamoto, Shuichi; Nakasato, Nobukazu; Kawashima, Ryuta; Katori, Yukio


    The effects of visual speech (the moving image of the speaker’s face uttering speech sound) on early auditory evoked fields (AEFs) were examined using a helmet-shaped magnetoencephalography system in 12 healthy volunteers (9 males, mean age 35.5 years). AEFs (N100m) in response to the monosyllabic sound /be/ were recorded and analyzed under three different visual stimulus conditions: the moving image of the same speaker’s face uttering /be/ (congruent visual stimuli) or uttering /ge/ (incongruent visual stimuli), and visual noise (a still image processed from the speaker’s face using a strong Gaussian filter: control condition). On average, latency of N100m was significantly shortened in the bilateral hemispheres for both congruent and incongruent auditory/visual (A/V) stimuli, compared to the control A/V condition. However, the degree of N100m shortening was not significantly different between the congruent and incongruent A/V conditions, despite the significant differences in psychophysical responses between these two A/V conditions. Moreover, analysis of the magnitudes of these visual effects on AEFs in individuals showed that the lip-reading effects on AEFs tended to be well correlated between the two different audio-visual conditions (congruent vs. incongruent visual stimuli) in the bilateral hemispheres but were not significantly correlated between the right and left hemispheres. On the other hand, no significant correlation was observed between the magnitudes of visual speech effects and psychophysical responses. These results may indicate that the auditory-visual interaction observed on the N100m is a fundamental process which does not depend on the congruency of the visual information. PMID:28141836

  2. From Mimicry to Language: A Neuroanatomically Based Evolutionary Model of the Emergence of Vocal Language

    PubMed Central

    Poliva, Oren


    The auditory cortex communicates with the frontal lobe via the middle temporal gyrus (auditory ventral stream; AVS) or the inferior parietal lobule (auditory dorsal stream; ADS). Whereas the AVS is ascribed only with sound recognition, the ADS is ascribed with sound localization, voice detection, prosodic perception/production, lip-speech integration, phoneme discrimination, articulation, repetition, phonological long-term memory and working memory. Previously, I interpreted the juxtaposition of sound localization, voice detection, audio-visual integration and prosodic analysis, as evidence that the behavioral precursor to human speech is the exchange of contact calls in non-human primates. Herein, I interpret the remaining ADS functions as evidence of additional stages in language evolution. According to this model, the role of the ADS in vocal control enabled early Homo (Hominans) to name objects using monosyllabic calls, and allowed children to learn their parents' calls by imitating their lip movements. Initially, the calls were forgotten quickly but gradually were remembered for longer periods. Once the representations of the calls became permanent, mimicry was limited to infancy, and older individuals encoded in the ADS a lexicon for the names of objects (phonological lexicon). Consequently, sound recognition in the AVS was sufficient for activating the phonological representations in the ADS and mimicry became independent of lip-reading. Later, by developing inhibitory connections between acoustic-syllabic representations in the AVS and phonological representations of subsequent syllables in the ADS, Hominans became capable of concatenating the monosyllabic calls for repeating polysyllabic words (i.e., developed working memory). Finally, due to strengthening of connections between phonological representations in the ADS, Hominans became capable of encoding several syllables as a single representation (chunking). Consequently, Hominans began vocalizing and

  3. Electrophysiological Validation of a Human Prototype Auditory Midbrain Implant in a Guinea Pig Model

    PubMed Central

    Lenarz, Minoo; Patrick, James F.; Anderson, David J.; Lenarz, Thomas


    The auditory midbrain implant (AMI) is a new treatment for hearing restoration in patients with neural deafness or surgically inaccessible cochleae who cannot benefit from cochlear implants (CI). This includes neurofibromatosis type II (NF2) patients who, due to development and/or removal of vestibular schwannomas, usually experience complete damage of their auditory nerves. Although the auditory brainstem implant (ABI) provides sound awareness and aids lip-reading capabilities for these NF2 patients, it generally only achieves hearing performance levels comparable with a single-channel CI. In collaboration with Cochlear Ltd. (Lane Cove, Australia), we developed a human prototype AMI, which is designed for electrical stimulation along the well-defined tonotopic gradient of the inferior colliculus central nucleus (ICC). Considering that better speech perception and hearing performance has been correlated with a greater number of discriminable frequency channels of information available, the ability of the AMI to effectively activate discrete frequency regions within the ICC may enable better hearing performance than achieved by the ABI. Therefore, the goal of this study was to investigate if our AMI array could achieve low-threshold, frequency-specific activation within the ICC, and whether the levels for ICC activation via AMI stimulation were within safe limits for human application. We electrically stimulated different frequency regions within the ICC via the AMI array and recorded the corresponding neural activity in the primary auditory cortex (A1) using a multisite silicon probe in ketamine-anesthetized guinea pigs. Based on our results, AMI stimulation achieves lower thresholds and more localized, frequency-specific activation than CI stimulation. Furthermore, AMI stimulation achieves cortical activation with current levels that are within safe limits for central nervous system stimulation. This study confirms that our AMI design is sufficient for ensuring