Sample records for multitalker speech understanding

  1. How Autism Affects Speech Understanding in Multitalker Environments

    DTIC Science & Technology

    2015-12-01

    Award Number: W81XWH-12-1-0363. Dates covered: 30 Sep 2012 - 29 Sep 2015. ...that adults with Autism Spectrum Disorders have particular difficulty recognizing speech in acoustically-hostile environments (e.g., Alcantara et al

  2. Ranking Hearing Aid Input-Output Functions for Understanding Low-, Conversational-, and High-Level Speech in Multitalker Babble

    ERIC Educational Resources Information Center

    Chung, King; Killion, Mead C.; Christensen, Laurel A.

    2007-01-01

    Purpose: To determine the rankings of 6 input-output functions for understanding low-level, conversational, and high-level speech in multitalker babble without manipulating volume control for listeners with normal hearing, flat sensorineural hearing loss, and mildly sloping sensorineural hearing loss. Method: Peak clipping, compression limiting,…

  3. Multitalker Speech Perception with Ideal Time-Frequency Segregation: Effects of Voice Characteristics and Number of Talkers

    DTIC Science & Technology

    2009-03-23

    Douglas S. Brungart... Speech perception in multitalker listening environments is limited by two very different types of masking. The first is energetic...

  4. Left Superior Temporal Gyrus Is Coupled to Attended Speech in a Cocktail-Party Auditory Scene.

    PubMed

    Vander Ghinst, Marc; Bourguignon, Mathieu; Op de Beeck, Marc; Wens, Vincent; Marty, Brice; Hassid, Sergio; Choufani, Georges; Jousmäki, Veikko; Hari, Riitta; Van Bogaert, Patrick; Goldman, Serge; De Tiège, Xavier

    2016-02-03

    Using a continuous listening task, we evaluated the coupling between the listener's cortical activity and the temporal envelopes of different sounds in a multitalker auditory scene using magnetoencephalography and corticovocal coherence analysis. Neuromagnetic signals were recorded from 20 right-handed healthy adult humans who listened to five different recorded stories (attended speech streams), one without any multitalker background (No noise) and four mixed with a "cocktail party" multitalker background noise at four signal-to-noise ratios (5, 0, -5, and -10 dB) to produce speech-in-noise mixtures, here referred to as Global scene. Coherence analysis revealed that the modulations of the attended speech stream, presented without multitalker background, were coupled at ∼0.5 Hz to the activity of both superior temporal gyri, whereas the modulations at 4-8 Hz were coupled to the activity of the right supratemporal auditory cortex. In cocktail party conditions, with the multitalker background noise, the coupling at both frequencies was stronger for the attended speech stream than for the unattended Multitalker background. The coupling strengths decreased as the Multitalker background increased. During the cocktail party conditions, the ∼0.5 Hz coupling became left-hemisphere dominant, compared with bilateral coupling without the multitalker background, whereas the 4-8 Hz coupling remained right-hemisphere lateralized in both conditions. The brain activity was not coupled to the multitalker background or to its individual talkers. The results highlight the key role of the listener's left superior temporal gyrus in extracting the slow ∼0.5 Hz modulations, likely reflecting the attended speech stream within a multitalker auditory scene. When people listen to one person in a "cocktail party," their auditory cortex mainly follows the attended speech stream rather than the entire auditory scene.
However, how the brain extracts the attended speech stream from the whole auditory scene, and how increasing background noise corrupts this process, are still debated. In this magnetoencephalography study, subjects had to attend to a speech stream presented with or without multitalker background noise. The results argue for frequency-dependent cortical tracking mechanisms for the attended speech stream. The left superior temporal gyrus tracked the ∼0.5 Hz modulations of the attended speech stream only when the speech was embedded in the multitalker background, whereas the right supratemporal auditory cortex tracked the 4-8 Hz modulations during both noiseless and cocktail-party conditions.
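    The coupling reported above rests on spectral coherence between a speech temporal envelope and a neural signal. A minimal sketch of that computation with synthetic signals (not the authors' MEG pipeline; the sampling rate, modulation depths, and noise level here are all made-up illustration values):

    ```python
    import numpy as np
    from scipy.signal import coherence

    rng = np.random.default_rng(0)
    fs = 100.0                      # envelope sampling rate in Hz (assumed)
    t = np.arange(0, 300, 1 / fs)   # 5 minutes of simulated recording

    # Synthetic speech envelope: slow phrasal (~0.5 Hz) and syllabic (~5 Hz)
    # modulations plus a small broadband component
    envelope = (1 + 0.5 * np.sin(2 * np.pi * 0.5 * t)
                + 0.3 * np.sin(2 * np.pi * 5.0 * t)
                + 0.05 * rng.standard_normal(t.size))

    # Simulated "cortical" signal: attenuated copy of the envelope buried in noise
    cortical = 0.4 * envelope + rng.standard_normal(t.size)

    # Magnitude-squared coherence (Welch estimate)
    f, Cxy = coherence(envelope, cortical, fs=fs, nperseg=4096)

    i_slow = np.argmin(np.abs(f - 0.5))   # phrasal-rate bin
    i_syll = np.argmin(np.abs(f - 5.0))   # syllabic-rate bin
    print(f"coherence at ~0.5 Hz: {Cxy[i_slow]:.2f}, at ~5 Hz: {Cxy[i_syll]:.2f}")
    ```

    Coherence peaks at the modulation rates the two signals share and stays low elsewhere, which is the logic behind attributing ∼0.5 Hz and 4-8 Hz coupling to envelope tracking.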

  5. How Autism Affects Speech Understanding in Multitalker Environments

    DTIC Science & Technology

    2013-10-01

    difficult than will typically-developing children. Knowing whether toddlers with ASD have difficulties processing speech in the presence of acoustic...to separate the speech of different talkers than do their typically-developing peers. We also predict that they will fail to exploit visual cues on...learn language from many settings in which children are typically placed. In addition, one of the cues that typically-developing listeners use to

  6. Spatial and temporal modifications of multitalker speech can improve speech perception in older adults.

    PubMed

    Gygi, Brian; Shafiro, Valeriy

    2014-04-01

    Speech perception in multitalker environments often requires listeners to divide attention among several concurrent talkers before focusing on one talker with pertinent information. Such attentionally demanding tasks are particularly difficult for older adults, due both to age-related hearing loss (presbycusis) and to general declines in attentional processing and associated cognitive abilities. This study investigated two signal-processing techniques that have been suggested as a means of improving the speech perception accuracy of older adults: time stretching and spatial separation of target talkers. Stimuli in each experiment comprised 2-4 fixed-form utterances in which listeners were asked to consecutively (1) detect concurrently spoken keywords in the beginning of the utterance (divided attention) and (2) identify additional keywords from only one talker at the end of the utterance (selective attention). In Experiment 1, the overall tempo of each utterance was unaltered or slowed down by 25%; in Experiment 2, the concurrent utterances were spatially coincident or separated across a 180-degree hemifield. Both manipulations improved performance for elderly adults with age-appropriate hearing on both tasks. Increasing the divided-attention load by attending to more concurrent keywords had a marked negative effect on performance of the selective-attention task only when the target talker was identified by a keyword, but not by spatial location. These findings suggest that the temporal and spatial modifications improved perception of multitalker speech primarily by reducing competition among the cognitive resources required to perform attentionally demanding tasks.

  7. Looking Behavior and Audiovisual Speech Understanding in Children With Normal Hearing and Children With Mild Bilateral or Unilateral Hearing Loss.

    PubMed

    Lewis, Dawna E; Smith, Nicholas A; Spalding, Jody L; Valente, Daniel L

    Visual information from talkers facilitates speech intelligibility for listeners when audibility is challenged by environmental noise and hearing loss. Less is known about how listeners actively process and attend to visual information from different talkers in complex multi-talker environments. This study tracked looking behavior in children with normal hearing (NH), mild bilateral hearing loss (MBHL), and unilateral hearing loss (UHL) in a complex multi-talker environment to examine the extent to which children look at talkers and whether looking patterns relate to performance on a speech-understanding task. It was hypothesized that performance would decrease as perceptual complexity increased and that children with hearing loss would perform more poorly than their peers with NH. Children with MBHL or UHL were expected to demonstrate greater attention to individual talkers during multi-talker exchanges, indicating that they were more likely to attempt to use visual information from talkers to assist in speech understanding in adverse acoustics. It also was of interest to examine whether MBHL, versus UHL, would differentially affect performance and looking behavior. Eighteen children with NH, eight children with MBHL, and 10 children with UHL participated (8-12 years). They followed audiovisual instructions for placing objects on a mat under three conditions: a single talker providing instructions via a video monitor, four possible talkers alternately providing instructions on separate monitors in front of the listener, and the same four talkers providing both target and nontarget information. Multi-talker background noise was presented at a 5 dB signal-to-noise ratio during testing. An eye tracker monitored looking behavior while children performed the experimental task. Behavioral task performance was higher for children with NH than for either group of children with hearing loss. 
There were no differences in performance between children with UHL and children with MBHL. Eye-tracker analysis revealed that children with NH looked more at the screens overall than did children with MBHL or UHL, though individual differences were greater in the groups with hearing loss. Listeners in all groups spent a small proportion of time looking at relevant screens as talkers spoke. Although looking was distributed across all screens, there was a bias toward the right side of the display. There was no relationship between overall looking behavior and performance on the task. The present study examined the processing of audiovisual speech in the context of a naturalistic task. Results demonstrated that children distributed their looking to a variety of sources during the task, but that children with NH were more likely to look at screens than were those with MBHL/UHL. However, all groups looked at the relevant talkers as they were speaking only a small proportion of the time. Despite variability in looking behavior, listeners were able to follow the audiovisual instructions and children with NH demonstrated better performance than children with MBHL/UHL. These results suggest that performance on some challenging multi-talker audiovisual tasks is not dependent on visual fixation to relevant talkers for children with NH or with MBHL/UHL.

  8. Speaker normalization for chinese vowel recognition in cochlear implants.

    PubMed

    Luo, Xin; Fu, Qian-Jie

    2005-07-01

    Because of the limited spectro-temporal resolution associated with cochlear implants, implant patients often have greater difficulty with multitalker speech recognition. The present study investigated whether multitalker speech recognition can be improved by applying speaker normalization techniques to cochlear implant speech processing. Multitalker Chinese vowel recognition was tested with normal-hearing Chinese-speaking subjects listening to a 4-channel cochlear implant simulation, with and without speaker normalization. For each subject, speaker normalization was referenced to the speaker that produced the best recognition performance under conditions without speaker normalization. To match the remaining speakers to this "optimal" output pattern, the overall frequency range of the analysis filter bank was adjusted for each speaker according to the ratio of the mean third formant frequency values between the specific speaker and the reference speaker. Results showed that speaker normalization provided a small but significant improvement in subjects' overall recognition performance. After speaker normalization, subjects' patterns of recognition performance across speakers changed, demonstrating the potential for speaker-dependent effects with the proposed normalization technique.
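    The normalization step described above lends itself to a short sketch. The helper below is hypothetical (the abstract gives the ratio rule but not the code, and the scaling direction is an assumption): it rescales a filter bank's corner frequencies by the ratio of a speaker's mean third-formant frequency (F3) to the reference speaker's mean F3.

    ```python
    def normalize_filterbank(corner_freqs, speaker_f3, reference_f3):
        """Scale analysis filter-bank corner frequencies by the mean-F3 ratio.

        Hypothetical helper: a speaker with a lower mean F3 than the reference
        gets a proportionally lowered analysis range, so that speaker's formants
        land in the same channels as the reference speaker's.
        """
        ratio = speaker_f3 / reference_f3
        return [f * ratio for f in corner_freqs]

    # Hypothetical 4-channel filter bank spanning 200-7000 Hz (5 corner frequencies)
    corners = [200.0, 700.0, 1500.0, 3400.0, 7000.0]

    # Made-up mean F3 values: 2400 Hz for this speaker, 2700 Hz for the reference
    scaled = normalize_filterbank(corners, speaker_f3=2400.0, reference_f3=2700.0)
    print([round(f, 1) for f in scaled])   # each corner scaled by 2400/2700
    ```

    With these made-up values the whole analysis range is compressed by a factor of 8/9, so the lower-F3 speaker's spectrum maps onto the same channel pattern as the reference speaker's.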

  9. The mismatch negativity as a measure of auditory stream segregation in a simulated "cocktail-party" scenario: effect of age.

    PubMed

    Getzmann, Stephan; Näätänen, Risto

    2015-11-01

    With age, the ability to understand speech in multitalker environments usually deteriorates. The central auditory system has to perceptually segregate and group the acoustic input into sequences of distinct auditory objects. The present study used electrophysiological measures to study effects of age on auditory stream segregation in a multitalker scenario. Younger and older adults were presented with streams of short speech stimuli. When a single target stream was presented, the occurrence of a rare (deviant) syllable among frequent (standard) syllables elicited the mismatch negativity (MMN), an electrophysiological correlate of automatic deviance detection. The presence of a second, concurrent stream consisting of the deviant syllable of the target stream reduced the MMN amplitude, especially when located near the target stream. The decrease in MMN amplitude indicates that the rare syllable of the target stream was less readily perceived as deviant, suggesting reduced stream segregation with decreasing stream distance. Moreover, the presence of a concurrent stream increased the MMN peak latency of the older group but not that of the younger group. The results provide neurophysiological evidence for the effects of concurrent speech on auditory processing in older adults, suggesting that older adults need more time for stream segregation in the presence of concurrent speech.

  10. Relationship between Speech Intelligibility and Speech Comprehension in Babble Noise

    ERIC Educational Resources Information Center

    Fontan, Lionel; Tardieu, Julien; Gaillard, Pascal; Woisard, Virginie; Ruiz, Robert

    2015-01-01

    Purpose: The authors investigated the relationship between the intelligibility and comprehension of speech presented in babble noise. Method: Forty participants listened to French imperative sentences (commands for moving objects) in a multitalker babble background for which intensity was experimentally controlled. Participants were instructed to…

  11. How Autism Affects Speech Understanding in Multitalker Environments

    DTIC Science & Technology

    2014-10-01

    ...been distributing these posters at various locations in the community; we have trained laboratory personnel to conduct the various aspects of this

  12. Working Memory and Speech Comprehension in Older Adults with Hearing Impairment

    ERIC Educational Resources Information Center

    Nagaraj, Naveen K.

    2017-01-01

    Purpose: This study examined the relationship between working memory (WM) and speech comprehension in older adults with hearing impairment (HI). It was hypothesized that WM would explain significant variance in speech comprehension measured in multitalker babble (MTB). Method: Twenty-four older (59-73 years) adults with sensorineural HI…

  13. Intelligibility of Clear Speech: Effect of Instruction

    ERIC Educational Resources Information Center

    Lam, Jennifer; Tjaden, Kris

    2013-01-01

    Purpose: The authors investigated how clear speech instructions influence sentence intelligibility. Method: Twelve speakers produced sentences in habitual, clear, hearing impaired, and overenunciate conditions. Stimuli were amplitude normalized and mixed with multitalker babble for orthographic transcription by 40 listeners. The main analysis…

  14. Impact of clear, loud, and slow speech on scaled intelligibility and speech severity in Parkinson's disease and multiple sclerosis.

    PubMed

    Tjaden, Kris; Sussman, Joan E; Wilding, Gregory E

    2014-06-01

    The perceptual consequences of rate reduction, increased vocal intensity, and clear speech were studied in speakers with multiple sclerosis (MS), Parkinson's disease (PD), and healthy controls. Seventy-eight speakers read sentences in habitual, clear, loud, and slow conditions. Sentences were equated for peak amplitude and mixed with multitalker babble for presentation to listeners. Using a computerized visual analog scale, listeners judged intelligibility or speech severity as operationally defined in Sussman and Tjaden (2012). Loud and clear but not slow conditions improved intelligibility relative to the habitual condition. With the exception of the loud condition for the PD group, speech severity did not improve above habitual and was reduced relative to habitual in some instances. Intelligibility and speech severity were strongly related, but relationships for disordered speakers were weaker in clear and slow conditions versus habitual. Both clear and loud speech show promise for improving intelligibility and maintaining or improving speech severity in multitalker babble for speakers with mild dysarthria secondary to MS or PD, at least as these perceptual constructs were defined and measured in this study. Although scaled intelligibility and speech severity overlap, the metrics further appear to have some separate value in documenting treatment-related speech changes.

  15. Tailoring auditory training to patient needs with single and multiple talkers: transfer-appropriate gains on a four-choice discrimination test.

    PubMed

    Barcroft, Joe; Sommers, Mitchell S; Tye-Murray, Nancy; Mauzé, Elizabeth; Schroy, Catherine; Spehar, Brent

    2011-11-01

    Our long-term objective is to develop an auditory training program that will enhance speech recognition in those situations where patients most want improvement. As a first step, the current investigation trained participants using either a single talker or multiple talkers to determine if auditory training leads to transfer-appropriate gains. The experiment implemented a 2 × 2 × 2 mixed design, with training condition as a between-participants variable and testing interval and test version as repeated-measures variables. Participants completed a computerized six-week auditory training program wherein they heard either the speech of a single talker or the speech of six talkers. Training gains were assessed with single-talker and multi-talker versions of the Four-choice discrimination test. Participants in both groups were tested on both versions. Sixty-nine adult hearing-aid users were randomly assigned to either single-talker or multi-talker auditory training. Both groups showed significant gains on both test versions. Participants who trained with multiple talkers showed greater improvement on the multi-talker version whereas participants who trained with a single talker showed greater improvement on the single-talker version. Transfer-appropriate gains occurred following auditory training, suggesting that auditory training can be designed to target specific patient needs.

  16. Use of Adaptive Digital Signal Processing to Improve Speech Communication for Normally Hearing and Hearing-Impaired Subjects.

    ERIC Educational Resources Information Center

    Harris, Richard W.; And Others

    1988-01-01

    A two-microphone adaptive digital noise cancellation technique improved word-recognition ability for 20 normal-hearing and 12 hearing-impaired adults by reducing multitalker speech babble and speech-spectrum noise by 18-22 dB. Word-recognition improvements averaged 37-50 percent for normal-hearing and 27-40 percent for hearing-impaired subjects. Improvement was best…

  17. The Effect of Asymmetrical Signal Degradation on Binaural Speech Recognition in Children and Adults.

    ERIC Educational Resources Information Center

    Rothpletz, Ann M.; Tharpe, Anne Marie; Grantham, D. Wesley

    2004-01-01

    To determine the effect of asymmetrical signal degradation on binaural speech recognition, 28 children and 14 adults were administered a sentence recognition task amidst multitalker babble. There were 3 listening conditions: (a) monaural, with mild degradation in 1 ear; (b) binaural, with mild degradation in both ears (symmetric degradation); and…

  18. Intelligibility for Binaural Speech with Discarded Low-SNR Speech Components.

    PubMed

    Schoenmaker, Esther; van de Par, Steven

    2016-01-01

    Speech intelligibility in multitalker settings improves when the target speaker is spatially separated from the interfering speakers. A factor that may contribute to this improvement is the improved detectability of target-speech components due to binaural interaction, in analogy to the Binaural Masking Level Difference (BMLD). This would allow listeners to hear target speech components within specific time-frequency intervals that have a negative SNR, similar to the improved detectability of a tone in noise when the tone and noise carry disparate interaural difference cues. To investigate whether these negative-SNR target-speech components indeed contribute to speech intelligibility, a stimulus manipulation was performed in which all target components were removed whenever the local SNR was smaller than a criterion value. It can be expected that for sufficiently high criterion values, target speech components that do contribute to speech intelligibility will be removed. For spatially separated speakers, assuming that a BMLD-like detection advantage contributes to intelligibility, degradation in intelligibility is expected even at criterion values below 0 dB SNR. For collocated speakers, however, higher criterion values should be applicable without impairing speech intelligibility. Results show that degradation of intelligibility for separated speakers occurs only at criterion values of 0 dB and above, indicating a negligible contribution of a BMLD-like detection advantage in multitalker settings. These results show that the spatial benefit is related to the spatial separation of speech components at positive local SNRs rather than to a BMLD-like detection improvement for speech components at negative local SNRs.
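    The stimulus manipulation described above — discarding target time-frequency components whose local SNR falls below a criterion — can be sketched with a short-time Fourier transform. The window length, sampling rate, and noise stand-ins for speech below are illustration assumptions, not the authors' exact processing:

    ```python
    import numpy as np
    from scipy.signal import stft, istft

    def discard_low_snr(target, interferer, fs, criterion_db=0.0):
        """Zero out target STFT cells whose local SNR is below criterion_db."""
        _, _, T = stft(target, fs=fs, nperseg=512)
        _, _, I = stft(interferer, fs=fs, nperseg=512)
        local_snr_db = 20.0 * np.log10((np.abs(T) + 1e-12) / (np.abs(I) + 1e-12))
        T_kept = np.where(local_snr_db >= criterion_db, T, 0.0)
        _, y = istft(T_kept, fs=fs, nperseg=512)
        return y

    fs = 16000
    rng = np.random.default_rng(1)
    target = rng.standard_normal(fs)            # 1 s of noise standing in for speech
    interferer = 0.5 * rng.standard_normal(fs)  # weaker interfering signal

    # With a 0 dB criterion, only cells where the target locally dominates survive
    y = discard_low_snr(target, interferer, fs, criterion_db=0.0)
    kept_energy = np.sum(y ** 2) / np.sum(target ** 2)
    print(f"fraction of target energy retained: {kept_energy:.2f}")
    ```

    Raising the criterion removes progressively more of the target, which is the experimental knob the study turns to probe whether the negative-SNR cells matter for intelligibility.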

  19. Unequal effects of speech and nonspeech contexts on the perceptual normalization of Cantonese level tones.

    PubMed

    Zhang, Caicai; Peng, Gang; Wang, William S-Y

    2012-08-01

    Context is important for recovering language information from talker-induced variability in acoustic signals. In tone perception, previous studies reported similar effects of speech and nonspeech contexts in Mandarin, supporting a general perceptual mechanism underlying tone normalization. However, no supportive evidence was obtained in Cantonese, also a tone language. Moreover, no study has compared speech and nonspeech contexts in the multi-talker condition, which is essential for exploring the normalization mechanism of inter-talker variability in speaking F0. The other question is whether a talker's full F0 range and mean F0 equally facilitate normalization. To answer these questions, this study examines the effects of four context conditions (speech/nonspeech × F0 contour/mean F0) in the multi-talker condition in Cantonese. Results show that raising and lowering the F0 of speech contexts change the perception of identical stimuli from mid level tone to low and high level tone, whereas nonspeech contexts only mildly increase the identification preference. It supports the speech-specific mechanism of tone normalization. Moreover, speech context with flattened F0 trajectory, which neutralizes cues of a talker's full F0 range, fails to facilitate normalization in some conditions, implying that a talker's mean F0 is less efficient for minimizing talker-induced lexical ambiguity in tone perception.

  20. Non-native Listeners’ Recognition of High-Variability Speech Using PRESTO

    PubMed Central

    Tamati, Terrin N.; Pisoni, David B.

    2015-01-01

    Background Natural variability in speech is a significant challenge to robust successful spoken word recognition. In everyday listening environments, listeners must quickly adapt and adjust to multiple sources of variability in both the signal and listening environments. High-variability speech may be particularly difficult to understand for non-native listeners, who have less experience with the second language (L2) phonological system and less detailed knowledge of sociolinguistic variation of the L2. Purpose The purpose of this study was to investigate the effects of high-variability sentences on non-native speech recognition and to explore the underlying sources of individual differences in speech recognition abilities of non-native listeners. Research Design Participants completed two sentence recognition tasks involving high-variability and low-variability sentences. They also completed a battery of behavioral tasks and self-report questionnaires designed to assess their indexical processing skills, vocabulary knowledge, and several core neurocognitive abilities. Study Sample Native speakers of Mandarin (n = 25) living in the United States recruited from the Indiana University community participated in the current study. A native comparison group consisted of scores obtained from native speakers of English (n = 21) in the Indiana University community taken from an earlier study. Data Collection and Analysis Speech recognition in high-variability listening conditions was assessed with a sentence recognition task using sentences from PRESTO (Perceptually Robust English Sentence Test Open-Set) mixed in 6-talker multitalker babble. Speech recognition in low-variability listening conditions was assessed using sentences from HINT (Hearing In Noise Test) mixed in 6-talker multitalker babble. Indexical processing skills were measured using a talker discrimination task, a gender discrimination task, and a forced-choice regional dialect categorization task. 
Vocabulary knowledge was assessed with the WordFam word familiarity test, and executive functioning was assessed with the BRIEF-A (Behavioral Rating Inventory of Executive Function – Adult Version) self-report questionnaire. Scores from the non-native listeners on behavioral tasks and self-report questionnaires were compared with scores obtained from native listeners tested in a previous study and were examined for individual differences. Results Non-native keyword recognition scores were significantly lower on PRESTO sentences than on HINT sentences. Non-native listeners’ keyword recognition scores were also lower than native listeners’ scores on both sentence recognition tasks. Differences in performance on the sentence recognition tasks between non-native and native listeners were larger on PRESTO than on HINT, although group differences varied by signal-to-noise ratio. The non-native and native groups also differed in the ability to categorize talkers by region of origin and in vocabulary knowledge. Individual non-native word recognition accuracy on PRESTO sentences in multitalker babble at more favorable signal-to-noise ratios was found to be related to several BRIEF-A subscales and composite scores. However, non-native performance on PRESTO was not related to regional dialect categorization, talker and gender discrimination, or vocabulary knowledge. Conclusions High-variability sentences in multitalker babble were particularly challenging for non-native listeners. Difficulty under high-variability testing conditions was related to lack of experience with the L2, especially L2 sociolinguistic information, compared with native listeners. Individual differences among the non-native listeners were related to weaknesses in core neurocognitive abilities affecting behavioral control in everyday life. PMID:25405842
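    "Mixed in 6-talker multitalker babble" at a given signal-to-noise ratio is a standard operation; a minimal sketch follows (the abstract does not specify the PRESTO/HINT level-calibration details, so the generic RMS-based scaling rule is assumed, with noise stand-ins for the actual recordings):

    ```python
    import numpy as np

    def mix_at_snr(speech, babble, snr_db):
        """Scale babble so the speech-to-babble power ratio equals snr_db, then sum."""
        p_speech = np.mean(speech ** 2)
        p_babble = np.mean(babble ** 2)
        gain = np.sqrt(p_speech / (p_babble * 10.0 ** (snr_db / 10.0)))
        return speech + gain * babble

    rng = np.random.default_rng(0)
    speech = rng.standard_normal(16000)   # 1 s placeholder for a test sentence
    babble = rng.standard_normal(16000)   # placeholder for 6-talker babble

    mix = mix_at_snr(speech, babble, snr_db=5.0)   # a +5 dB SNR condition

    # Verify the achieved SNR from the mixture's noise component
    noise = mix - speech
    achieved = 10.0 * np.log10(np.mean(speech ** 2) / np.mean(noise ** 2))
    print(f"achieved SNR: {achieved:.1f} dB")
    ```

    Sweeping `snr_db` over the tested conditions produces the family of speech-in-babble mixtures whose recognition scores are compared across groups.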

  1. Speech understanding in noise with the Roger Pen, Naida CI Q70 processor, and integrated Roger 17 receiver in a multi-talker network.

    PubMed

    De Ceulaer, Geert; Bestel, Julie; Mülder, Hans E; Goldbeck, Felix; de Varebeke, Sebastien Pierre Janssens; Govaerts, Paul J

    2016-05-01

    Roger is a digital adaptive multi-channel remote microphone technology that wirelessly transmits a speaker's voice directly to a hearing instrument or cochlear implant sound processor. Frequency hopping between channels, in combination with repeated broadcast, avoids the interference issues that limited earlier-generation FM systems. This study evaluated the benefit of the Roger Pen transmitter microphone in a multiple talker network (MTN) for cochlear implant users in a simulated noisy conversation setting. Twelve post-lingually deafened adult Advanced Bionics CII/HiRes 90K recipients were recruited. Subjects used a Naida CI Q70 processor with an integrated Roger 17 receiver. The test environment simulated four people having a meal in a noisy restaurant: the CI user (listener) and three companions (talkers) speaking non-simultaneously in a diffuse field of multi-talker babble. Speech reception thresholds (SRTs) were determined without the Roger Pen, with one Roger Pen, and with three Roger Pens in an MTN. Using three Roger Pens in an MTN improved the SRT by 14.8 dB over using no Roger Pen, and by 13.1 dB over using a single Roger Pen (p < 0.0001). The Roger Pen in an MTN provided a statistically and clinically significant improvement in speech perception in noise for Advanced Bionics cochlear implant recipients. The integrated Roger 17 receiver made it easy for users of the Naida CI Q70 processor to take advantage of the Roger system. The listening advantage and ease of use should encourage more clinicians to recommend and fit Roger in adult cochlear implant patients.

  2. Bilateral Versus Unilateral Cochlear Implantation in Adult Listeners: Speech-On-Speech Masking and Multitalker Localization.

    PubMed

    Rana, Baljeet; Buchholz, Jörg M; Morgan, Catherine; Sharma, Mridula; Weller, Tobias; Konganda, Shivali Appaiah; Shirai, Kyoko; Kawano, Atsushi

    2017-01-01

    Binaural hearing helps normal-hearing listeners localize sound sources and understand speech in noise. However, it is not fully understood how far this is the case for bilateral cochlear implant (CI) users. To determine the potential benefits of bilateral over unilateral CIs, speech comprehension thresholds (SCTs) were measured in seven Japanese bilateral CI recipients using Helen test sentences (translated into Japanese) in a two-talker speech interferer presented from the front (co-located with the target speech), ipsilateral to the first-implanted ear (at +90° or -90°), and spatially symmetric at ±90°. Spatial release from masking was calculated as the difference between co-located and spatially separated SCTs. Localization was assessed in the horizontal plane by presenting either male or female speech or both simultaneously. All measurements were performed bilaterally and unilaterally (with the first-implanted ear) inside a loudspeaker array. Both SCTs and spatial release from masking improved with bilateral CIs, with mean bilateral benefits of 7.5 dB in the spatially asymmetric and 3 dB in the spatially symmetric speech mixtures. Localization performance varied strongly between subjects but was clearly improved with bilateral over unilateral CIs, with the mean localization error reduced by 27°. Surprisingly, adding a second talker had only a negligible effect on localization.

  3. Speech intelligibility in complex acoustic environments in young children

    NASA Astrophysics Data System (ADS)

    Litovsky, Ruth

    2003-04-01

    While the auditory system undergoes tremendous maturation during the first few years of life, it has become clear that in complex scenarios, when multiple sounds occur and when echoes are present, children's performance is significantly worse than that of their adult counterparts. The ability of children (3-7 years of age) to understand speech in a simulated multi-talker environment and to benefit from spatial separation of the target and competing sounds was investigated. In these studies, competing sources varied in number, location, and content (speech, modulated or unmodulated speech-shaped noise, and time-reversed speech). The acoustic spaces were also varied in size and amount of reverberation. Finally, children with chronic otitis media who received binaural training were tested pre- and post-training on a subset of conditions. Results indicated the following. (1) Children experienced significantly more masking than adults, even in the simplest conditions tested. (2) When the target and competing sounds were spatially separated, speech intelligibility improved, but the amount varied with age, type of competing sound, and number of competitors. (3) In a large reverberant classroom there was no benefit of spatial separation. (4) Binaural training improved speech intelligibility performance in children with otitis media. Future work includes similar studies in children with unilateral and bilateral cochlear implants. [Work supported by NIDCD, DRF, and NOHR.]

  4. Extending and Applying the EPIC Architecture for Human Cognition and Performance: Auditory and Spatial Components

    DTIC Science & Technology

    2016-03-01

    manual rather than verbal responses. The coordinate response measure (CRM) task and speech corpus is a highly simplified form of the command and... in multi-talker speech experiments. The CRM corpus is a collection of recorded command utterances in the form of Ready <Callsign> go to <Color... In the two-talker CRM listening task, participants respond to commands by pointing to the appropriate Color/Digit pair on a computer display. A

  5. Spatial release of cognitive load measured in a dual-task paradigm in normal-hearing and hearing-impaired listeners.

    PubMed

    Xia, Jing; Nooraei, Nazanin; Kalluri, Sridhar; Edwards, Brent

    2015-04-01

    This study investigated whether spatial separation between talkers helps reduce cognitive processing load, and how hearing impairment interacts with the cognitive load of individuals listening in multi-talker environments. A dual-task paradigm was used in which performance on a secondary task (visual tracking) served as a measure of the cognitive load imposed by a speech recognition task. Visual tracking performance was measured under four conditions in which the target and the interferers were distinguished by (1) gender and spatial location, (2) gender only, (3) spatial location only, and (4) neither gender nor spatial location. Results showed that when gender cues were available, a 15° spatial separation between talkers reduced the cognitive load of listening even though it did not provide further improvement in speech recognition (Experiment I). Compared to normal-hearing listeners, large individual variability in spatial release of cognitive load was observed among hearing-impaired listeners. Cognitive load was lower when talkers were spatially separated by 60° than when talkers were of different genders, even though speech recognition was comparable in these two conditions (Experiment II). These results suggest that a measure of cognitive load might provide valuable insight into the benefit of spatial cues in multi-talker environments.

  6. Training to Improve Hearing Speech in Noise: Biological Mechanisms

    PubMed Central

    Song, Judy H.; Skoe, Erika; Banai, Karen

    2012-01-01

    We investigated training-related improvements in listening in noise and the biological mechanisms mediating these improvements. Training-related malleability was examined using a program that incorporates cognitively based listening exercises to improve speech-in-noise perception. Before and after training, auditory brainstem responses to a speech syllable were recorded in quiet and multitalker noise from adults who ranged in their speech-in-noise perceptual ability. Controls did not undergo training but were tested at intervals equivalent to the trained subjects. Trained subjects exhibited significant improvements in speech-in-noise perception that were retained 6 months later. Subcortical responses in noise demonstrated training-related enhancements in the encoding of pitch-related cues (the fundamental frequency and the second harmonic), particularly for the time-varying portion of the syllable that is most vulnerable to perceptual disruption (the formant transition region). Subjects with the largest strength of pitch encoding at pretest showed the greatest perceptual improvement. Controls exhibited neither neurophysiological nor perceptual changes. We provide the first demonstration that short-term training can improve the neural representation of cues important for speech-in-noise perception. These results implicate and delineate biological mechanisms contributing to learning success, and they provide a conceptual advance to our understanding of the kind of training experiences that can influence sensory processing in adulthood. PMID:21799207

  7. Bilateral Versus Unilateral Cochlear Implantation in Adult Listeners: Speech-On-Speech Masking and Multitalker Localization

    PubMed Central

    Buchholz, Jörg M.; Morgan, Catherine; Sharma, Mridula; Weller, Tobias; Konganda, Shivali Appaiah; Shirai, Kyoko; Kawano, Atsushi

    2017-01-01

    Binaural hearing helps normal-hearing listeners localize sound sources and understand speech in noise. However, it is not fully understood how far this is the case for bilateral cochlear implant (CI) users. To determine the potential benefits of bilateral over unilateral CIs, speech comprehension thresholds (SCTs) were measured in seven Japanese bilateral CI recipients using Helen test sentences (translated into Japanese) in a two-talker speech interferer presented from the front (co-located with the target speech), ipsilateral to the first-implanted ear (at +90° or −90°), and spatially symmetric at ±90°. Spatial release from masking was calculated as the difference between co-located and spatially separated SCTs. Localization was assessed in the horizontal plane by presenting either male or female speech or both simultaneously. All measurements were performed bilaterally and unilaterally (with the first implanted ear) inside a loudspeaker array. Both SCTs and spatial release from masking were improved with bilateral CIs, demonstrating mean bilateral benefits of 7.5 dB in spatially asymmetric and 3 dB in spatially symmetric speech mixture. Localization performance varied strongly between subjects but was clearly improved with bilateral over unilateral CIs with the mean localization error reduced by 27°. Surprisingly, adding a second talker had only a negligible effect on localization. PMID:28752811
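
    The spatial release from masking (SRM) reported above is, by the study's own definition, the difference between the co-located and spatially separated speech comprehension thresholds. A minimal sketch of that calculation (the threshold values in the example are hypothetical, not the study's data):

```python
def spatial_release_from_masking(colocated_sct_db, separated_sct_db):
    """Spatial release from masking (SRM) in dB: the improvement in the
    speech comprehension threshold (SCT) gained when the interferer is
    moved away from the target. Positive values mean separation helped."""
    return colocated_sct_db - separated_sct_db

# Hypothetical illustration: a listener with an SCT of -2 dB SNR when
# target and interferer are co-located, and -9.5 dB SNR with the
# interferer moved to the side, shows 7.5 dB of spatial release.
print(spatial_release_from_masking(-2.0, -9.5))  # 7.5
```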

  8. Increased vocal intensity due to the Lombard effect in speakers with Parkinson's disease: simultaneous laryngeal and respiratory strategies.

    PubMed

    Stathopoulos, Elaine T; Huber, Jessica E; Richardson, Kelly; Kamphaus, Jennifer; DeCicco, Devan; Darling, Meghan; Fulcher, Katrina; Sussman, Joan E

    2014-01-01

    The objective of the present study was to investigate whether speakers with hypophonia, secondary to Parkinson's disease (PD), would increase their vocal intensity when speaking in a noisy environment (Lombard effect). The other objective was to examine the underlying laryngeal and respiratory strategies used to increase vocal intensity. Thirty-three participants with PD were included in the study. Each participant was fitted with the SpeechVive™ device, which played multi-talker babble noise into one ear during speech. Using acoustic, aerodynamic, and respiratory kinematic techniques, the simultaneous laryngeal and respiratory mechanisms used to regulate vocal intensity were examined. Significant group results showed that most speakers with PD (26/33) successfully increased their vocal intensity when speaking in multi-talker babble noise. They were able to support their increased vocal intensity and subglottal pressure with combined strategies from both the laryngeal and respiratory mechanisms. Individual speaker analysis indicated that the particular laryngeal and respiratory interactions differed among speakers. The SpeechVive™ device elicited higher vocal intensities from patients with PD. Speakers used different combinations of laryngeal and respiratory physiologic mechanisms to increase vocal intensity, suggesting that the disease process does not uniformly affect the speech subsystems. Readers will be able to: (1) identify speech characteristics of people with Parkinson's disease (PD), (2) identify typical respiratory strategies for increasing sound pressure level (SPL), (3) identify typical laryngeal strategies for increasing SPL, and (4) define the Lombard effect. Copyright © 2014 Elsevier Inc. All rights reserved.

  9. A Dynamic Speech Comprehension Test for Assessing Real-World Listening Ability.

    PubMed

    Best, Virginia; Keidser, Gitte; Freeston, Katrina; Buchholz, Jörg M

    2016-07-01

    Many listeners with hearing loss report particular difficulties with multitalker communication situations, but these difficulties are not well predicted using current clinical and laboratory assessment tools. The overall aim of this work is to create new speech tests that capture key aspects of multitalker communication situations and ultimately provide better predictions of real-world communication abilities and the effect of hearing aids. A test of ongoing speech comprehension introduced previously was extended to include naturalistic conversations between multiple talkers as targets, and a reverberant background environment containing competing conversations. In this article, we describe the development of this test and present a validation study. Thirty listeners with normal hearing participated in this study. Speech comprehension was measured for one-, two-, and three-talker passages at three different signal-to-noise ratios (SNRs), and working memory ability was measured using the reading span test. Analyses were conducted to examine passage equivalence, learning effects, and test-retest reliability, and to characterize the effects of number of talkers and SNR. Although we observed differences in difficulty across passages, it was possible to group the passages into four equivalent sets. Using this grouping, we achieved good test-retest reliability and observed no significant learning effects. Comprehension performance was sensitive to the SNR but did not decrease as the number of talkers increased. Individual performance showed associations with age and reading span score. This new dynamic speech comprehension test appears to be valid and suitable for experimental purposes. Further work will explore its utility as a tool for predicting real-world communication ability and hearing aid benefit. American Academy of Audiology.

  10. Formant discrimination in noise for isolated vowels

    NASA Astrophysics Data System (ADS)

    Liu, Chang; Kewley-Port, Diane

    2004-11-01

    Formant discrimination for isolated vowels presented in noise was investigated for normal-hearing listeners. Discrimination thresholds for F1 and F2, for the seven American English vowels /i, ɪ, ɛ, æ, ʌ, ɑ, u/, were measured under two types of noise, long-term speech-shaped noise (LTSS) and multitalker babble, and also under quiet listening conditions. Signal-to-noise ratios (SNR) varied from -4 to +4 dB in steps of 2 dB. All three factors, formant frequency, signal-to-noise ratio, and noise type, had significant effects on vowel formant discrimination. Significant interactions among the three factors showed that threshold-frequency functions depended on SNR and noise type. The thresholds at the lowest levels of SNR were highly elevated, by a factor of about 3, compared to those in quiet. The masking functions (threshold vs SNR) were well described by a negative exponential over F1 and F2 for both LTSS and babble noise. Speech-shaped noise was a slightly more effective masker than multitalker babble, presumably reflecting small benefits (1.5 dB) due to the temporal variation of the babble.
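
    The "negative exponential" masking function described above can be sketched as a threshold that is strongly elevated at low SNRs and decays toward the quiet-condition threshold as SNR increases. The parameter values below are illustrative placeholders, not values fitted to the paper's data:

```python
import math

def formant_threshold(snr_db, a=10.0, b=0.35, t_quiet=20.0):
    """Discrimination threshold (arbitrary Hz units) modeled as a
    negative exponential of SNR, asymptoting at the quiet-condition
    threshold t_quiet. Parameters a, b, t_quiet are illustrative."""
    return a * math.exp(-b * snr_db) + t_quiet

# With these placeholder parameters, the threshold at -4 dB SNR is
# roughly 3x the quiet threshold, mirroring the elevation-by-a-factor-
# of-about-3 pattern the abstract reports, while at +4 dB SNR the
# threshold is close to the quiet value.
print(round(formant_threshold(-4.0) / 20.0, 2))  # ~3.03
```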

  11. The effects of early auditory-based intervention on adult bilateral cochlear implant outcomes.

    PubMed

    Lim, Stacey R

    2017-09-01

    The goal of this exploratory study was to determine the types of improvement that sequentially implanted auditory-verbal and auditory-oral adults with prelingual and childhood hearing loss received in bilateral listening conditions, compared to their best unilateral listening condition. Five auditory-verbal adults and five auditory-oral adults were recruited for this study. Participants were seated in the center of a 6-loudspeaker array. BKB-SIN sentences were presented from 0° azimuth, while multi-talker babble was presented from various loudspeakers. BKB-SIN scores in bilateral and the best unilateral listening conditions were compared to determine the amount of improvement gained. As a group, the participants had improved speech understanding scores in the bilateral listening condition. Although not statistically significant, the auditory-verbal group tended to have greater speech understanding at higher levels of competing background noise than the auditory-oral participants. Bilateral cochlear implantation provides individuals with prelingual and childhood hearing loss with improved speech understanding in noise. A greater emphasis on auditory development during the critical language-development years may contribute to increased speech understanding in adulthood. However, other demographic factors such as age or device characteristics must also be considered. Although both auditory-verbal and auditory-oral approaches emphasize spoken language development, they emphasize auditory development to different degrees. This may affect cochlear implant (CI) outcomes. Further consideration should be made in future auditory research to determine whether these differences contribute to performance outcomes. 
Additional investigation with a larger participant pool, controlled for effects of age and CI devices and processing strategies, would be necessary to determine whether language learning approaches are associated with different levels of speech understanding performance.

  12. Neural Timing is Linked to Speech Perception in Noise

    PubMed Central

    Anderson, Samira; Skoe, Erika; Chandrasekaran, Bharath; Kraus, Nina

    2010-01-01

    Understanding speech in background noise is challenging for every listener, including those with normal peripheral hearing. This difficulty is due in part to the disruptive effects of noise on neural synchrony, resulting in degraded representation of speech at cortical and subcortical levels as reflected by electrophysiological responses. These problems are especially pronounced in clinical populations such as children with learning impairments. Given the established effects of noise on evoked responses, we hypothesized that listening-in-noise problems are associated with degraded processing of timing information at the brainstem level. Participants (66 children, ages 8 to 14 years, 22 females) were divided into groups based on their performance on clinical measures of speech-in-noise perception (SIN) and reading. We compared brainstem responses to speech syllables between top and bottom SIN and reading groups in the presence and absence of competing multi-talker babble. In the quiet condition, neural response timing was equivalent between groups. In noise, however, the bottom groups exhibited greater neural delays relative to the top groups. Group-specific timing delays occurred exclusively in response to the noise-vulnerable formant transition, not to the more perceptually-robust, steady-state portion of the stimulus. These results demonstrate that neural timing is disrupted by background noise and that greater disruptions are associated with the inability to perceive speech in challenging listening conditions. PMID:20371812

  13. Effects of utterance length and vocal loudness on speech breathing in older adults.

    PubMed

    Huber, Jessica E

    2008-12-31

    Age-related reductions in pulmonary elastic recoil and respiratory muscle strength can affect how older adults generate subglottal pressure required for speech production. The present study examined age-related changes in speech breathing by manipulating utterance length and loudness during a connected speech task (monologue). Twenty-three older adults and twenty-eight young adults produced a monologue at comfortable loudness and pitch and with multi-talker babble noise playing in the room to elicit louder speech. Dependent variables included sound pressure level, speech rate, and lung volume initiation, termination, and excursion. Older adults produced shorter utterances than young adults overall. Age-related effects were larger for longer utterances. Older adults demonstrated very different lung volume adjustments for loud speech than young adults. These results suggest that older adults have a more difficult time when the speech system is being taxed by both utterance length and loudness. The data were consistent with the hypothesis that both young and older adults use utterance length in premotor speech planning processes.

  14. Only Behavioral But Not Self-Report Measures of Speech Perception Correlate with Cognitive Abilities.

    PubMed

    Heinrich, Antje; Henshaw, Helen; Ferguson, Melanie A

    2016-01-01

    Good speech perception and communication skills in everyday life are crucial for participation and well-being, and are therefore an overarching aim of auditory rehabilitation. Both behavioral and self-report measures can be used to assess these skills. However, correlations between behavioral and self-report speech perception measures are often low. One possible explanation is that there is a mismatch between the specific situations used in the assessment of these skills in each method, and a more careful matching across situations might improve consistency of results. The role that cognition plays in specific speech situations may also be important for understanding communication, as speech perception tests vary in their cognitive demands. In this study, the role of executive function, working memory (WM) and attention in behavioral and self-report measures of speech perception was investigated. Thirty existing hearing aid users with mild-to-moderate hearing loss aged between 50 and 74 years completed a behavioral test battery with speech perception tests ranging from phoneme discrimination in modulated noise (easy) to words in multi-talker babble (medium) and keyword perception in a carrier sentence against a distractor voice (difficult). In addition, a self-report measure of aided communication, residual disability from the Glasgow Hearing Aid Benefit Profile, was obtained. Correlations between speech perception tests and self-report measures were higher when specific speech situations across both were matched. Cognition correlated with behavioral speech perception test results but not with self-report. Only the most difficult speech perception test, keyword perception in a carrier sentence with a competing distractor voice, engaged executive functions in addition to WM. In conclusion, any relationship between behavioral and self-report speech perception is not mediated by a shared correlation with cognition.

  15. Only Behavioral But Not Self-Report Measures of Speech Perception Correlate with Cognitive Abilities

    PubMed Central

    Heinrich, Antje; Henshaw, Helen; Ferguson, Melanie A.

    2016-01-01

    Good speech perception and communication skills in everyday life are crucial for participation and well-being, and are therefore an overarching aim of auditory rehabilitation. Both behavioral and self-report measures can be used to assess these skills. However, correlations between behavioral and self-report speech perception measures are often low. One possible explanation is that there is a mismatch between the specific situations used in the assessment of these skills in each method, and a more careful matching across situations might improve consistency of results. The role that cognition plays in specific speech situations may also be important for understanding communication, as speech perception tests vary in their cognitive demands. In this study, the role of executive function, working memory (WM) and attention in behavioral and self-report measures of speech perception was investigated. Thirty existing hearing aid users with mild-to-moderate hearing loss aged between 50 and 74 years completed a behavioral test battery with speech perception tests ranging from phoneme discrimination in modulated noise (easy) to words in multi-talker babble (medium) and keyword perception in a carrier sentence against a distractor voice (difficult). In addition, a self-report measure of aided communication, residual disability from the Glasgow Hearing Aid Benefit Profile, was obtained. Correlations between speech perception tests and self-report measures were higher when specific speech situations across both were matched. Cognition correlated with behavioral speech perception test results but not with self-report. Only the most difficult speech perception test, keyword perception in a carrier sentence with a competing distractor voice, engaged executive functions in addition to WM. In conclusion, any relationship between behavioral and self-report speech perception is not mediated by a shared correlation with cognition. PMID:27242564

  16. Attentional Gain Control of Ongoing Cortical Speech Representations in a “Cocktail Party”

    PubMed Central

    Kerlin, Jess R.; Shahin, Antoine J.; Miller, Lee M.

    2010-01-01

    Normal listeners possess the remarkable perceptual ability to select a single speech stream among many competing talkers. However, few studies of selective attention have addressed the unique nature of speech as a temporally extended and complex auditory object. We hypothesized that sustained selective attention to speech in a multi-talker environment would act as gain control on the early auditory cortical representations of speech. Using high-density electroencephalography and a template-matching analysis method, we found selective gain to the continuous speech content of an attended talker, greatest at a frequency of 4–8 Hz, in auditory cortex. In addition, the difference in alpha power (8–12 Hz) at parietal sites across hemispheres indicated the direction of auditory attention to speech, as has been previously found in visual tasks. The strength of this hemispheric alpha lateralization, in turn, predicted an individual’s attentional gain of the cortical speech signal. These results support a model of spatial speech stream segregation, mediated by a supramodal attention mechanism, enabling selection of the attended representation in auditory cortex. PMID:20071526
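
    The hemispheric alpha lateralization described above is commonly quantified as a normalized difference of band-limited power across hemispheres. The study's exact index is not specified here, so the following is the generic normalized-difference form, offered only as an illustration:

```python
def alpha_lateralization_index(alpha_power_left, alpha_power_right):
    """Normalized hemispheric difference of alpha-band (8-12 Hz) power
    at parietal sites. Positive values indicate more alpha power over
    the left hemisphere; the index is bounded in [-1, +1]. This is the
    generic form, not necessarily the exact index used in the study."""
    return (alpha_power_left - alpha_power_right) / (
        alpha_power_left + alpha_power_right)

# Example: 3 units of alpha power over left parietal sites vs 1 unit
# over right yields an index of 0.5; reversing them yields -0.5.
print(alpha_lateralization_index(3.0, 1.0))  # 0.5
```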

  17. The Nationwide Speech Project: A multi-talker multi-dialect speech corpus

    NASA Astrophysics Data System (ADS)

    Clopper, Cynthia G.; Pisoni, David B.

    2004-05-01

    Most research on regional phonological variation relies on field recordings of interview speech. Recent research on the perception of dialect variation by naive listeners, however, has relied on read sentence materials in order to control for phonological and lexical content and syntax. The Nationwide Speech Project corpus was designed to obtain a large amount of speech from a number of talkers representing different regional varieties of American English. Five male and five female talkers from each of six different dialect regions in the United States were recorded reading isolated words, sentences, and passages, and in conversations with the experimenter. The talkers ranged in age from 18 to 25 years old and were all monolingual native speakers of American English. They had lived their entire lives in one dialect region and both of their parents were raised in the same region. Results of an acoustic analysis of the vowel spaces of the talkers included in the Nationwide Speech Project will be presented. [Work supported by NIH.]

  18. Peripheral hearing loss reduces the ability of children to direct selective attention during multi-talker listening.

    PubMed

    Holmes, Emma; Kitterick, Padraig T; Summerfield, A Quentin

    2017-07-01

    Restoring normal hearing requires knowledge of how peripheral and central auditory processes are affected by hearing loss. Previous research has focussed primarily on peripheral changes following sensorineural hearing loss, whereas consequences for central auditory processing have received less attention. We examined the ability of hearing-impaired children to direct auditory attention to a voice of interest (based on the talker's spatial location or gender) in the presence of a common form of background noise: the voices of competing talkers (i.e. during multi-talker, or "Cocktail Party" listening). We measured brain activity using electro-encephalography (EEG) when children prepared to direct attention to the spatial location or gender of an upcoming target talker who spoke in a mixture of three talkers. Compared to normally-hearing children, hearing-impaired children showed significantly less evidence of preparatory brain activity when required to direct spatial attention. This finding is consistent with the idea that hearing-impaired children have a reduced ability to prepare spatial attention for an upcoming talker. Moreover, preparatory brain activity was not restored when hearing-impaired children listened with their acoustic hearing aids. An implication of these findings is that steps to improve auditory attention alongside acoustic hearing aids may be required to improve the ability of hearing-impaired children to understand speech in the presence of competing talkers. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. Working Memory and Speech Comprehension in Older Adults With Hearing Impairment.

    PubMed

    Nagaraj, Naveen K

    2017-10-17

    This study examined the relationship between working memory (WM) and speech comprehension in older adults with hearing impairment (HI). It was hypothesized that WM would explain significant variance in speech comprehension measured in multitalker babble (MTB). Twenty-four older (59-73 years) adults with sensorineural HI participated. WM capacity (WMC) was measured using 3 complex span tasks. Speech comprehension was assessed using multiple passages, and speech identification ability was measured using recall of sentence-final words and key words. Speech measures were performed in quiet and in the presence of MTB at +5 dB signal-to-noise ratio. Results suggested that participants' speech identification was poorer in MTB, but their ability to comprehend discourse in MTB was at least as good as in quiet. WMC did not explain significant variance in speech comprehension before or after controlling for age and audibility. However, WMC explained significant variance in identification of key words in low-context sentences in MTB. These results suggest that WMC plays an important role in identifying low-context sentences in MTB, but not when comprehending semantically rich discourse passages. In general, the data did not support individual variability in WMC as a factor that predicts speech comprehension ability in older adults with HI.

  20. Neural tracking of attended versus ignored speech is differentially affected by hearing loss.

    PubMed

    Petersen, Eline Borch; Wöstmann, Malte; Obleser, Jonas; Lunner, Thomas

    2017-01-01

    Hearing loss manifests as a reduced ability to understand speech, particularly in multitalker situations. In these situations, younger normal-hearing listeners' brains are known to track attended speech through phase-locking of neural activity to the slow-varying envelope of the speech. This study investigates how hearing loss, compensated by hearing aids, affects the neural tracking of the speech-onset envelope in elderly participants with varying degrees of hearing loss (n = 27, 62-86 yr; hearing thresholds 11-73 dB hearing level). In an active listening task, a to-be-attended audiobook (signal) was presented either in quiet or against a competing to-be-ignored audiobook (noise) presented at three individualized signal-to-noise ratios (SNRs). The neural tracking of the to-be-attended and to-be-ignored speech was quantified through the cross-correlation of the electroencephalogram (EEG) and the temporal envelope of speech. We primarily investigated the effects of hearing loss and SNR on the neural envelope tracking. First, we found that elderly hearing-impaired listeners' neural responses reliably track the envelope of to-be-attended speech more than to-be-ignored speech. Second, hearing loss relates to the neural tracking of to-be-ignored speech, resulting in a weaker differential neural tracking of to-be-attended vs. to-be-ignored speech in listeners with worse hearing. Third, neural tracking of to-be-attended speech increased with decreasing background noise. Critically, the beneficial effect of reduced noise on neural speech tracking decreased with stronger hearing loss. In sum, our results show that a common sensorineural processing deficit, i.e., hearing loss, interacts with central attention mechanisms and reduces the differential tracking of attended and ignored speech. The present study investigates the effect of hearing loss in older listeners on the neural tracking of competing speech. 
Interestingly, we observed that whereas internal degradation (hearing loss) relates to the neural tracking of ignored speech, external sound degradation (ratio between attended and ignored speech; signal-to-noise ratio) relates to tracking of attended speech. This provides the first evidence for hearing loss affecting the ability to neurally track speech. Copyright © 2017 the American Physiological Society.
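
    The envelope cross-correlation measure described above can be sketched with NumPy: z-score an EEG channel and the speech envelope, then correlate them over a range of positive lags (EEG lagging the stimulus). The signals below are synthetic; a real pipeline would use the Hilbert envelope of the recorded speech and preprocessed EEG:

```python
import numpy as np

def envelope_tracking(eeg, envelope, fs, max_lag_s=0.2):
    """Cross-correlate one EEG channel with the speech temporal envelope
    over lags 0..max_lag_s and return the peak correlation and the lag
    (in seconds) at which it occurs. Both signals are z-scored, so the
    values are Pearson-style correlations."""
    eeg = (eeg - eeg.mean()) / eeg.std()
    env = (envelope - envelope.mean()) / envelope.std()
    max_lag = int(max_lag_s * fs)
    lags = np.arange(max_lag + 1)
    r = np.array([np.mean(env[:len(env) - lag] * eeg[lag:]) for lag in lags])
    best = int(np.argmax(r))
    return r[best], lags[best] / fs

# Synthetic demo: an "EEG" trace that follows a 4 Hz envelope with a
# 100 ms neural delay plus noise; the cross-correlation peaks near
# lag = 0.1 s with a high correlation value.
fs = 100
t = np.arange(0, 10, 1 / fs)
env = 1 + np.sin(2 * np.pi * 4 * t)   # slow-varying "speech envelope"
delay = int(0.1 * fs)                 # 100 ms neural delay
rng = np.random.default_rng(0)
eeg = np.roll(env, delay) + 0.3 * rng.normal(size=t.size)
peak_r, peak_lag = envelope_tracking(eeg, env, fs)
```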

  1. Prior Knowledge Guides Speech Segregation in Human Auditory Cortex.

    PubMed

    Wang, Yuanye; Zhang, Jianfeng; Zou, Jiajie; Luo, Huan; Ding, Nai

    2018-05-18

    Segregating concurrent sound streams is a computationally challenging task that requires integrating bottom-up acoustic cues (e.g. pitch) and top-down prior knowledge about sound streams. In a multi-talker environment, the brain can segregate different speakers in about 100 ms in auditory cortex. Here, we used magnetoencephalographic (MEG) recordings to investigate the temporal and spatial signature of how the brain utilizes prior knowledge to segregate 2 speech streams from the same speaker, which can hardly be separated based on bottom-up acoustic cues. In a primed condition, the participants know the target speech stream in advance, while in an unprimed condition no such prior knowledge is available. Neural encoding of each speech stream is characterized by the MEG responses tracking the speech envelope. We demonstrate that the speech-tracking effect in bilateral superior temporal gyrus and superior temporal sulcus is much stronger in the primed condition than in the unprimed condition. Priming effects are observed at about 100 ms latency and last more than 600 ms. Interestingly, prior knowledge about the target stream facilitates speech segregation mainly by suppressing the neural tracking of the non-target speech stream. In sum, prior knowledge leads to reliable speech segregation in auditory cortex, even in the absence of reliable bottom-up speech segregation cues.

  2. Comparing Binaural Pre-processing Strategies I: Instrumental Evaluation.

    PubMed

    Baumgärtel, Regina M; Krawczyk-Becker, Martin; Marquardt, Daniel; Völker, Christoph; Hu, Hongmei; Herzke, Tobias; Coleman, Graham; Adiloğlu, Kamil; Ernst, Stephan M A; Gerkmann, Timo; Doclo, Simon; Kollmeier, Birger; Hohmann, Volker; Dietz, Mathias

    2015-12-30

    In a collaborative research project, several monaural and binaural noise reduction algorithms have been comprehensively evaluated. In this article, eight selected noise reduction algorithms were assessed using instrumental measures, with a focus on the instrumental evaluation of speech intelligibility. Four distinct, reverberant scenarios were created to reflect everyday listening situations: a stationary speech-shaped noise, a multitalker babble noise, a single interfering talker, and a realistic cafeteria noise. Three instrumental measures were employed to assess predicted speech intelligibility and predicted sound quality: the intelligibility-weighted signal-to-noise ratio, the short-time objective intelligibility measure, and the perceptual evaluation of speech quality. The results show substantial improvements in predicted speech intelligibility as well as sound quality for the proposed algorithms. The evaluated coherence-based noise reduction algorithm was able to provide improvements in predicted audio signal quality. For the tested single-channel noise reduction algorithm, improvements in intelligibility-weighted signal-to-noise ratio were observed in all but the nonstationary cafeteria ambient noise scenario. Binaural minimum variance distortionless response beamforming algorithms performed particularly well in all noise scenarios. © The Author(s) 2015.
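
    The intelligibility-weighted signal-to-noise ratio used as an instrumental measure above is, in essence, a band-importance-weighted average of per-band SNRs. A simplified sketch follows; the band-importance weights and the ±15 dB clipping range are illustrative assumptions in the style of SII-based measures, not the exact values used in the study:

```python
import numpy as np

def intelligibility_weighted_snr(band_snr_db, band_importance):
    """Weighted average of per-band SNRs (dB), where the weights express
    each frequency band's assumed contribution to intelligibility.
    Weights are normalized to sum to 1, and per-band SNRs are clipped
    to +/-15 dB (an SII-style convention, assumed here)."""
    w = np.asarray(band_importance, dtype=float)
    w = w / w.sum()
    snr = np.clip(np.asarray(band_snr_db, dtype=float), -15.0, 15.0)
    return float(np.sum(w * snr))

# Illustrative 4-band example with placeholder weights that favor the
# speech-critical mid frequencies. The extreme bands are clipped to
# +/-15 dB before averaging: 0.1*15 + 0.4*5 + 0.4*0 + 0.1*(-15) = 2.0.
print(intelligibility_weighted_snr([20.0, 5.0, 0.0, -20.0],
                                   [0.1, 0.4, 0.4, 0.1]))  # 2.0
```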

  3. Relationship Between Speech Intelligibility and Speech Comprehension in Babble Noise.

    PubMed

    Fontan, Lionel; Tardieu, Julien; Gaillard, Pascal; Woisard, Virginie; Ruiz, Robert

    2015-06-01

    The authors investigated the relationship between the intelligibility and comprehension of speech presented in babble noise. Forty participants listened to French imperative sentences (commands for moving objects) in a multitalker babble background for which intensity was experimentally controlled. Participants were instructed to transcribe what they heard and obey the commands in an interactive environment set up for this purpose. The former test provided intelligibility scores and the latter provided comprehension scores. Collected data revealed a globally weak correlation between intelligibility and comprehension scores (r = .35, p < .001). The discrepancy tended to grow as noise level increased. An analysis of standard deviations showed that variability in comprehension scores increased linearly with noise level, whereas higher variability in intelligibility scores was found for moderate noise level conditions. These results support the hypothesis that intelligibility scores are poor predictors of listeners' comprehension in real communication situations. Intelligibility and comprehension scores appear to provide different insights, the first measure being centered on speech signal transfer and the second on communicative performance. Both theoretical and practical implications for the use of speech intelligibility tests as indicators of speakers' performances are discussed.
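The reported r = .35 is a Pearson product-moment correlation between paired intelligibility and comprehension scores. A minimal sketch of how such a coefficient is computed, on simulated data (not the study's; the sample size matches the study's N = 40, but the score distributions are invented):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40  # same sample size as the study; the data themselves are simulated

# Simulated paired scores: comprehension only weakly follows intelligibility
intelligibility = rng.uniform(40, 100, n)        # e.g., % words transcribed
comprehension = 0.5 * intelligibility + rng.normal(50, 12, n)  # noisy score

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    x = x - x.mean()
    y = y - y.mean()
    return float((x @ y) / np.sqrt((x @ x) * (y @ y)))

r = pearson_r(intelligibility, comprehension)
print(f"r = {r:.2f}")
```

A weak r, as in the study, means intelligibility explains only a small fraction (r squared) of the variance in comprehension scores.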

  4. Comparing Binaural Pre-processing Strategies I

    PubMed Central

    Krawczyk-Becker, Martin; Marquardt, Daniel; Völker, Christoph; Hu, Hongmei; Herzke, Tobias; Coleman, Graham; Adiloğlu, Kamil; Ernst, Stephan M. A.; Gerkmann, Timo; Doclo, Simon; Kollmeier, Birger; Hohmann, Volker; Dietz, Mathias

    2015-01-01

    In a collaborative research project, several monaural and binaural noise reduction algorithms have been comprehensively evaluated. In this article, eight selected noise reduction algorithms were assessed using instrumental measures, with a focus on the instrumental evaluation of speech intelligibility. Four distinct, reverberant scenarios were created to reflect everyday listening situations: a stationary speech-shaped noise, a multitalker babble noise, a single interfering talker, and a realistic cafeteria noise. Three instrumental measures were employed to assess predicted speech intelligibility and predicted sound quality: the intelligibility-weighted signal-to-noise ratio, the short-time objective intelligibility measure, and the perceptual evaluation of speech quality. The results show substantial improvements in predicted speech intelligibility as well as sound quality for the proposed algorithms. The evaluated coherence-based noise reduction algorithm was able to provide improvements in predicted audio signal quality. For the tested single-channel noise reduction algorithm, improvements in intelligibility-weighted signal-to-noise ratio were observed in all but the nonstationary cafeteria ambient noise scenario. Binaural minimum variance distortionless response beamforming algorithms performed particularly well in all noise scenarios. PMID:26721920

  5. Speech understanding in background noise with the two-microphone adaptive beamformer BEAM in the Nucleus Freedom Cochlear Implant System.

    PubMed

    Spriet, Ann; Van Deun, Lieselot; Eftaxiadis, Kyriaky; Laneau, Johan; Moonen, Marc; van Dijk, Bas; van Wieringen, Astrid; Wouters, Jan

    2007-02-01

    This paper evaluates the benefit of the two-microphone adaptive beamformer BEAM in the Nucleus Freedom cochlear implant (CI) system for speech understanding in background noise by CI users. A double-blind evaluation of the two-microphone adaptive beamformer BEAM and a hardware directional microphone was carried out with five adult Nucleus CI users. The test procedure consisted of a pre- and post-test in the lab and a 2-wk trial period at home. In the pre- and post-test, the speech reception threshold (SRT) with sentences and the percentage correct phoneme scores for CVC words were measured in quiet and in background noise at different signal-to-noise ratios. Performance was assessed for two different noise configurations (with a single noise source and with three noise sources) and two different noise materials (stationary speech-weighted noise and multitalker babble). During the 2-wk trial period at home, the CI users evaluated the noise reduction performance in different listening conditions by means of the Speech, Spatial and Qualities of Hearing Scale (SSQ) questionnaire. In addition to the perceptual evaluation, the noise reduction performance of the beamformer was measured physically as a function of the direction of the noise source. Significant improvements in both the SRT in noise (average improvement of 5-16 dB) and the percentage correct phoneme scores (average improvement of 10-41%) were observed with BEAM compared to the standard hardware directional microphone. In addition, the SSQ questionnaire and subjective evaluation in controlled and real-life scenarios suggested a possible preference for the beamformer in noisy environments. The evaluation demonstrates that the adaptive noise reduction algorithm BEAM in the Nucleus Freedom CI system may significantly improve speech perception by cochlear implant users in noisy listening conditions. This is the first monolateral (adaptive) noise reduction strategy actually implemented in a mainstream commercial CI.
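SRT measurement of the kind used above typically relies on an adaptive staircase: the SNR is made harder after each correct response and easier after each error, so the track converges on the 50%-correct point. A minimal sketch with a simulated listener; the logistic psychometric function, step size, trial count, and reversal-averaging rule are illustrative assumptions, not the procedure used in this study.

```python
import numpy as np

rng = np.random.default_rng(3)

def simulated_trial(snr_db, srt_true=-4.0, spread_db=2.0):
    """Simulated listener: P(correct) is a logistic function of SNR,
    crossing 50% at the true SRT."""
    p = 1.0 / (1.0 + np.exp(-(snr_db - srt_true) / spread_db))
    return rng.random() < p

def adaptive_srt(n_trials=60, start_snr=10.0, step_db=2.0):
    """1-down/1-up staircase: harder after a correct response, easier
    after an error; converges near the 50%-correct point."""
    snr = start_snr
    reversals = []
    last_dir = None
    for _ in range(n_trials):
        correct = simulated_trial(snr)
        direction = -1 if correct else +1
        if last_dir is not None and direction != last_dir:
            reversals.append(snr)  # track changed direction here
        last_dir = direction
        snr += direction * step_db
    # SRT estimate: mean SNR at the last few reversals
    return float(np.mean(reversals[-6:]))

srt = adaptive_srt()
print(f"estimated SRT = {srt:.1f} dB SNR")
```

Variants targeting other percent-correct points (e.g., 2-down/1-up for ~71%) change only the up/down rule.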

  6. Intelligibility of clear speech: effect of instruction.

    PubMed

    Lam, Jennifer; Tjaden, Kris

    2013-10-01

    The authors investigated how clear speech instructions influence sentence intelligibility. Twelve speakers produced sentences in habitual, clear, hearing impaired, and overenunciate conditions. Stimuli were amplitude normalized and mixed with multitalker babble for orthographic transcription by 40 listeners. The main analysis investigated percentage-correct intelligibility scores as a function of the 4 conditions and speaker sex. Additional analyses included listener response variability, individual speaker trends, and an alternate intelligibility measure: proportion of content words correct. Relative to the habitual condition, the overenunciate condition was associated with the greatest intelligibility benefit, followed by the hearing impaired and clear conditions. Ten speakers followed this trend. The results indicated different patterns of clear speech benefit for male and female speakers. Greater listener variability was observed for speakers with inherently low habitual intelligibility compared to speakers with inherently high habitual intelligibility. Stable proportions of content words were observed across conditions. Clear speech instructions affected the magnitude of the intelligibility benefit. The instruction to overenunciate may be most effective in clear speech training programs. The findings may help explain the range of clear speech intelligibility benefit previously reported. Listener variability analyses suggested the importance of obtaining multiple listener judgments of intelligibility, especially for speakers with inherently low habitual intelligibility.

  7. Some factors underlying individual differences in speech recognition on PRESTO: a first report.

    PubMed

    Tamati, Terrin N; Gilbert, Jaimie L; Pisoni, David B

    2013-01-01

    Previous studies investigating speech recognition in adverse listening conditions have found extensive variability among individual listeners. However, little is currently known about the core underlying factors that influence speech recognition abilities. To investigate sensory, perceptual, and neurocognitive differences between good and poor listeners on the Perceptually Robust English Sentence Test Open-set (PRESTO), a new high-variability sentence recognition test under adverse listening conditions. Participants who fell in the upper quartile (HiPRESTO listeners) or lower quartile (LoPRESTO listeners) on key word recognition on sentences from PRESTO in multitalker babble completed a battery of behavioral tasks and self-report questionnaires designed to investigate real-world hearing difficulties, indexical processing skills, and neurocognitive abilities. Young, normal-hearing adults (N = 40) from the Indiana University community participated in the current study. Participants' assessment of their own real-world hearing difficulties was measured with a self-report questionnaire on situational hearing and hearing health history. Indexical processing skills were assessed using a talker discrimination task, a gender discrimination task, and a forced-choice regional dialect categorization task. Neurocognitive abilities were measured with the Auditory Digit Span Forward (verbal short-term memory) and Digit Span Backward (verbal working memory) tests, the Stroop Color and Word Test (attention/inhibition), the WordFam word familiarity test (vocabulary size), the Behavioral Rating Inventory of Executive Function-Adult Version (BRIEF-A) self-report questionnaire on executive function, and two performance subtests of the Wechsler Abbreviated Scale of Intelligence (WASI) Performance Intelligence Quotient (IQ; nonverbal intelligence). Scores on self-report questionnaires and behavioral tasks were tallied and analyzed by listener group (HiPRESTO and LoPRESTO). 
The extreme groups did not differ overall on self-reported hearing difficulties in real-world listening environments. However, an item-by-item analysis of questions revealed that LoPRESTO listeners reported significantly greater difficulty understanding speakers in a public place. HiPRESTO listeners were significantly more accurate than LoPRESTO listeners at gender discrimination and regional dialect categorization, but they did not differ on talker discrimination accuracy or response time, or gender discrimination response time. HiPRESTO listeners also had longer forward and backward digit spans, higher word familiarity ratings on the WordFam test, and lower (better) scores for three individual items on the BRIEF-A questionnaire related to cognitive load. The two groups did not differ on the Stroop Color and Word Test or either of the WASI performance IQ subtests. HiPRESTO listeners and LoPRESTO listeners differed in indexical processing abilities, short-term and working memory capacity, vocabulary size, and some domains of executive functioning. These findings suggest that individual differences in the ability to encode and maintain highly detailed episodic information in speech may underlie the variability observed in speech recognition performance in adverse listening conditions using high-variability PRESTO sentences in multitalker babble. American Academy of Audiology.

  8. Some Factors Underlying Individual Differences in Speech Recognition on PRESTO: A First Report

    PubMed Central

    Tamati, Terrin N.; Gilbert, Jaimie L.; Pisoni, David B.

    2013-01-01

    Background Previous studies investigating speech recognition in adverse listening conditions have found extensive variability among individual listeners. However, little is currently known about the core, underlying factors that influence speech recognition abilities. Purpose To investigate sensory, perceptual, and neurocognitive differences between good and poor listeners on PRESTO, a new high-variability sentence recognition test under adverse listening conditions. Research Design Participants who fell in the upper quartile (HiPRESTO listeners) or lower quartile (LoPRESTO listeners) on key word recognition on sentences from PRESTO in multitalker babble completed a battery of behavioral tasks and self-report questionnaires designed to investigate real-world hearing difficulties, indexical processing skills, and neurocognitive abilities. Study Sample Young, normal-hearing adults (N = 40) from the Indiana University community participated in the current study. Data Collection and Analysis Participants’ assessment of their own real-world hearing difficulties was measured with a self-report questionnaire on situational hearing and hearing health history. Indexical processing skills were assessed using a talker discrimination task, a gender discrimination task, and a forced-choice regional dialect categorization task. Neurocognitive abilities were measured with the Auditory Digit Span Forward (verbal short-term memory) and Digit Span Backward (verbal working memory) tests, the Stroop Color and Word Test (attention/inhibition), the WordFam word familiarity test (vocabulary size), the BRIEF-A self-report questionnaire on executive function, and two performance subtests of the WASI Performance IQ (non-verbal intelligence). Scores on self-report questionnaires and behavioral tasks were tallied and analyzed by listener group (HiPRESTO and LoPRESTO). 
Results The extreme groups did not differ overall on self-reported hearing difficulties in real-world listening environments. However, an item-by-item analysis of questions revealed that LoPRESTO listeners reported significantly greater difficulty understanding speakers in a public place. HiPRESTO listeners were significantly more accurate than LoPRESTO listeners at gender discrimination and regional dialect categorization, but they did not differ on talker discrimination accuracy or response time, or gender discrimination response time. HiPRESTO listeners also had longer forward and backward digit spans, higher word familiarity ratings on the WordFam test, and lower (better) scores for three individual items on the BRIEF-A questionnaire related to cognitive load. The two groups did not differ on the Stroop Color and Word Test or either of the WASI performance IQ subtests. Conclusions HiPRESTO listeners and LoPRESTO listeners differed in indexical processing abilities, short-term and working memory capacity, vocabulary size, and some domains of executive functioning. These findings suggest that individual differences in the ability to encode and maintain highly detailed episodic information in speech may underlie the variability observed in speech recognition performance in adverse listening conditions using high-variability PRESTO sentences in multitalker babble. PMID:24047949

  9. The effect of simultaneous text on the recall of noise-degraded speech.

    PubMed

    Grossman, Irina; Rajan, Ramesh

    2017-05-01

    Written and spoken language utilize the same processing system, enabling text to modulate speech processing. We investigated how simultaneously presented text affected speech recall in babble noise using a retrospective recall task. Participants were presented with text-speech sentence pairs in multitalker babble noise and then prompted to recall what they heard or what they read. In Experiment 1, sentence pairs were either congruent or incongruent, and they were presented in silence or at 1 of 4 noise levels. Audio and Visual control groups were also tested with sentences presented in only 1 modality. Congruent text facilitated accurate recall of degraded speech; incongruent text had no effect. Text and speech were seldom confused for each other. An analysis of language background showed that monolingual English speakers outperformed early multilinguals at recalling degraded speech; however, the effects of text on speech processing were analogous across groups. Experiment 2 examined whether the benefit provided by matching text was maintained when the congruency of text and speech became more ambiguous, owing to the addition of partially mismatching text-speech sentence pairs that differed only in their final keyword and to the use of low signal-to-noise ratios. The experiment focused on monolingual English speakers; the results showed that even though participants commonly confused text for speech during incongruent text-speech pairings, these confusions could not fully account for the benefit provided by matching text. Thus, we uniquely demonstrate that congruent text benefits the recall of noise-degraded speech. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  10. Using auditory pre-information to solve the cocktail-party problem: electrophysiological evidence for age-specific differences.

    PubMed

    Getzmann, Stephan; Lewald, Jörg; Falkenstein, Michael

    2014-01-01

    Speech understanding in complex and dynamic listening environments requires (a) auditory scene analysis, namely auditory object formation and segregation, and (b) allocation of the attentional focus to the talker of interest. There is evidence that pre-information is actively used to facilitate these two aspects of the so-called "cocktail-party" problem. Here, a simulated multi-talker scenario was combined with electroencephalography to study scene analysis and allocation of attention in young and middle-aged adults. Sequences of short words (combinations of brief company names and stock-price values) from four talkers at different locations were simultaneously presented, and the detection of target names and the discrimination between critical target values were assessed. Immediately prior to speech sequences, auditory pre-information was provided via cues that either prepared auditory scene analysis or attentional focusing, or non-specific pre-information was given. While performance was generally better in younger than older participants, both age groups benefited from auditory pre-information. The analysis of the cue-related event-related potentials revealed age-specific differences in the use of pre-cues: Younger adults showed a pronounced N2 component, suggesting early inhibition of concurrent speech stimuli; older adults exhibited a stronger late P3 component, suggesting increased resource allocation to process the pre-information. In sum, the results argue for an age-specific utilization of auditory pre-information to improve listening in complex dynamic auditory environments.

  11. Using auditory pre-information to solve the cocktail-party problem: electrophysiological evidence for age-specific differences

    PubMed Central

    Getzmann, Stephan; Lewald, Jörg; Falkenstein, Michael

    2014-01-01

    Speech understanding in complex and dynamic listening environments requires (a) auditory scene analysis, namely auditory object formation and segregation, and (b) allocation of the attentional focus to the talker of interest. There is evidence that pre-information is actively used to facilitate these two aspects of the so-called “cocktail-party” problem. Here, a simulated multi-talker scenario was combined with electroencephalography to study scene analysis and allocation of attention in young and middle-aged adults. Sequences of short words (combinations of brief company names and stock-price values) from four talkers at different locations were simultaneously presented, and the detection of target names and the discrimination between critical target values were assessed. Immediately prior to speech sequences, auditory pre-information was provided via cues that either prepared auditory scene analysis or attentional focusing, or non-specific pre-information was given. While performance was generally better in younger than older participants, both age groups benefited from auditory pre-information. The analysis of the cue-related event-related potentials revealed age-specific differences in the use of pre-cues: Younger adults showed a pronounced N2 component, suggesting early inhibition of concurrent speech stimuli; older adults exhibited a stronger late P3 component, suggesting increased resource allocation to process the pre-information. In sum, the results argue for an age-specific utilization of auditory pre-information to improve listening in complex dynamic auditory environments. PMID:25540608

  12. The Spatial Release of Cognitive Load in Cocktail Party Is Determined by the Relative Levels of the Talkers.

    PubMed

    Andéol, Guillaume; Suied, Clara; Scannella, Sébastien; Dehais, Frédéric

    2017-06-01

    In a multi-talker situation, spatial separation between talkers reduces cognitive processing load: this is the "spatial release of cognitive load". The present study investigated the role played by the relative levels of the talkers in this spatial release of cognitive load. During the experiment, participants had to report the speech emitted by a target talker in the presence of a concurrent masker talker. The spatial separation (0° and 120° angular distance in azimuth) and the relative levels of the talkers (adverse, intermediate, and favorable target-to-masker ratios) were manipulated. Cognitive load was assessed with prefrontal functional near-infrared spectroscopy (fNIRS). Data from 14 young normal-hearing listeners revealed that the target-to-masker ratio had a direct impact on the spatial release of cognitive load. Spatial separation significantly reduced prefrontal activity only for the intermediate target-to-masker ratio and had no effect on prefrontal activity for the favorable and adverse target-to-masker ratios. Therefore, the relative levels of the talkers might be a key point in determining the spatial release of cognitive load and, more specifically, the prefrontal activity induced by spatial cues in multi-talker situations.

  13. Accuracy of cochlear implant recipients in speech reception in the presence of background music.

    PubMed

    Gfeller, Kate; Turner, Christopher; Oleson, Jacob; Kliethermes, Stephanie; Driscoll, Virginia

    2012-12-01

    This study examined speech recognition abilities of cochlear implant (CI) recipients in the spectrally complex listening condition of 3 contrasting types of background music, and compared performance based upon listener groups: CI recipients using conventional long-electrode devices, Hybrid CI recipients (acoustic plus electric stimulation), and normal-hearing adults. We tested 154 long-electrode CI recipients using varied devices and strategies, 21 Hybrid CI recipients, and 49 normal-hearing adults on closed-set recognition of spondees presented in 3 contrasting forms of background music (piano solo, large symphony orchestra, vocal solo with small combo accompaniment) in an adaptive test. Signal-to-noise ratio thresholds for speech in music were examined in relation to measures of speech recognition in background noise and multitalker babble, pitch perception, and music experience. The signal-to-noise ratio thresholds for speech in music varied as a function of category of background music, group membership (long-electrode, Hybrid, normal-hearing), and age. The thresholds for speech in background music were significantly correlated with measures of pitch perception and thresholds for speech in background noise; auditory status was an important predictor. Evidence suggests that speech reception thresholds in background music change as a function of listener age (with more advanced age being detrimental), structural characteristics of different types of music, and hearing status (residual hearing). These findings have implications for everyday listening conditions such as communicating in social or commercial situations in which there is background music.

  14. Tuning in and tuning out: Speech perception in native- and foreign-talker babble

    NASA Astrophysics Data System (ADS)

    van Heukelem, Kristin; Bradlow, Ann R.

    2005-09-01

    Studies on speech perception in multitalker babble have revealed asymmetries in the effects of noise on native versus foreign-accented speech intelligibility for native listeners [Rogers et al., Lang Speech 47(2), 139-154 (2004)] and on sentence-in-noise perception by native versus non-native listeners [Mayo et al., J. Speech Lang. Hear. Res., 40, 686-693 (1997)], suggesting that the linguistic backgrounds of talkers and listeners contribute to the effects of noise on speech perception. However, little attention has been paid to the language of the babble. This study tested whether the language of the noise also has asymmetrical effects on listeners. Replicating previous findings [e.g., Bronkhorst and Plomp, J. Acoust. Soc. Am., 92, 3132-3139 (1992)], the results showed poorer English sentence recognition by native English listeners in six-talker babble than in two-talker babble regardless of the language of the babble, demonstrating the effect of increased psychoacoustic/energetic masking. In addition, the results showed that in the two-talker babble condition, native English listeners were more adversely affected by English than Chinese babble. These findings demonstrate informational/cognitive masking on sentence-in-noise recognition in the form of linguistic competition. Whether this competition is at the lexical or sublexical level and whether it is modulated by the phonetic similarity between the target and noise languages remains to be determined.
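Babble conditions like those above are constructed by scaling the masker so that the target-to-babble level difference hits the desired SNR. A sketch of the standard RMS-based scaling, using white-noise stand-ins for the target sentence and the babble (a real experiment would use recorded speech):

```python
import numpy as np

def mix_at_snr(target, babble, snr_db):
    """Scale babble so the mixture has the requested speech-to-babble SNR."""
    p_t = np.mean(target ** 2)               # target power
    p_b = np.mean(babble ** 2)               # babble power
    gain = np.sqrt(p_t / (p_b * 10 ** (snr_db / 10)))
    return target + gain * babble

rng = np.random.default_rng(4)
fs = 16000
target = rng.standard_normal(fs)             # stand-in for a target sentence
babble = 2.0 * rng.standard_normal(fs)       # stand-in for two-talker babble

mix = mix_at_snr(target, babble, snr_db=5.0)

# Verify the realized SNR of the mixture
noise_part = mix - target
realized = 10 * np.log10(np.mean(target ** 2) / np.mean(noise_part ** 2))
print(f"realized SNR = {realized:.2f} dB")
```

Long-term RMS scaling like this is the usual convention; studies sometimes instead equalize levels over the speech-active portions only, which changes the effective SNR for fluctuating maskers.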

  15. Accuracy of Cochlear Implant Recipients on Speech Reception in Background Music

    PubMed Central

    Gfeller, Kate; Turner, Christopher; Oleson, Jacob; Kliethermes, Stephanie; Driscoll, Virginia

    2012-01-01

    Objectives This study (a) examined speech recognition abilities of cochlear implant (CI) recipients in the spectrally complex listening condition of three contrasting types of background music, and (b) compared performance based upon listener groups: CI recipients using conventional long-electrode (LE) devices, Hybrid CI recipients (acoustic plus electric stimulation), and normal-hearing (NH) adults. Methods We tested 154 LE CI recipients using varied devices and strategies, 21 Hybrid CI recipients, and 49 NH adults on closed-set recognition of spondees presented in three contrasting forms of background music (piano solo, large symphony orchestra, vocal solo with small combo accompaniment) in an adaptive test. Outcomes Signal-to-noise thresholds for speech in music (SRTM) were examined in relation to measures of speech recognition in background noise and multi-talker babble, pitch perception, and music experience. Results SRTM thresholds varied as a function of category of background music, group membership (LE, Hybrid, NH), and age. Thresholds for speech in background music were significantly correlated with measures of pitch perception and speech in background noise thresholds; auditory status was an important predictor. Conclusions Evidence suggests that speech reception thresholds in background music change as a function of listener age (with more advanced age being detrimental), structural characteristics of different types of music, and hearing status (residual hearing). These findings have implications for everyday listening conditions such as communicating in social or commercial situations in which there is background music. PMID:23342550

  16. Talker and accent variability effects on spoken word recognition

    NASA Astrophysics Data System (ADS)

    Nyang, Edna E.; Rogers, Catherine L.; Nishi, Kanae

    2003-04-01

    A number of studies have shown that words in a list are recognized less accurately in noise and with longer response latencies when they are spoken by multiple talkers, rather than a single talker. These results have been interpreted as support for an exemplar-based model of speech perception, in which it is assumed that detailed information regarding the speaker's voice is preserved in memory and used in recognition, rather than being eliminated via normalization. In the present study, the effects of varying both accent and talker are investigated using lists of words spoken by (a) a single native English speaker, (b) six native English speakers, (c) three native English speakers and three Japanese-accented English speakers. Twelve /hVd/ words were mixed with multi-speaker babble at three signal-to-noise ratios (+10, +5, and 0 dB) to create the word lists. Native English-speaking listeners' percent-correct recognition for words produced by native English speakers across the three talker conditions (single talker native, multi-talker native, and multi-talker mixed native and non-native) and three signal-to-noise ratios will be compared to determine whether sources of speaker variability other than voice alone add to the processing demands imposed by simple (i.e., single accent) speaker variability in spoken word recognition.

  17. Perception of speech in noise: neural correlates.

    PubMed

    Song, Judy H; Skoe, Erika; Banai, Karen; Kraus, Nina

    2011-09-01

    The presence of irrelevant auditory information (other talkers, environmental noises) presents a major challenge to listening to speech. The fundamental frequency (F0) of the target speaker is thought to provide an important cue for the extraction of the speaker's voice from background noise, but little is known about the relationship between speech-in-noise (SIN) perceptual ability and neural encoding of the F0. Motivated by recent findings that music and language experience enhance brainstem representation of sound, we examined the hypothesis that brainstem encoding of the F0 is diminished to a greater degree by background noise in people with poorer perceptual abilities in noise. To this end, we measured speech-evoked auditory brainstem responses to /da/ in quiet and in two multitalker babble conditions (two-talker and six-talker) in native English-speaking young adults who ranged in their ability to perceive and recall SIN. Listeners who were poorer performers on a standardized SIN measure demonstrated greater susceptibility to the degradative effects of noise on the neural encoding of the F0. Particularly diminished was their phase-locked activity to the fundamental frequency in the portion of the syllable known to be most vulnerable to perceptual disruption (i.e., the formant transition period). Our findings suggest that the subcortical representation of the F0 in noise contributes to the perception of speech in noisy conditions.
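Phase-locking to the F0 is often quantified as the spectral amplitude of the response at F0 relative to the surrounding noise floor. The sketch below shows the idea on synthetic signals; the 100 Hz F0, sampling rate, and noise level are illustrative assumptions, not the study's recording parameters.

```python
import numpy as np

fs = 2000.0
t = np.arange(0, 1, 1 / fs)
f0 = 100.0  # illustrative fundamental frequency

rng = np.random.default_rng(5)
response_quiet = np.sin(2 * np.pi * f0 * t)
# "In noise": a weaker phase-locked component buried in broadband activity
response_noise = 0.4 * response_quiet + rng.standard_normal(t.size)

def f0_prominence_db(x, fs, f0):
    """Spectral magnitude at F0 relative to the mean of nearby bins."""
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    at_f0 = spec[np.argmin(np.abs(freqs - f0))]
    nearby = (np.abs(freqs - f0) > 5) & (np.abs(freqs - f0) < 30)
    floor = max(spec[nearby].mean(), 1e-12)   # avoid division by ~zero
    return 20 * np.log10(at_f0 / floor)

quiet_db = f0_prominence_db(response_quiet, fs, f0)
noisy_db = f0_prominence_db(response_noise, fs, f0)
print(f"F0 prominence: {quiet_db:.1f} dB in quiet, {noisy_db:.1f} dB in noise")
```

The drop in F0 prominence from quiet to noise is the kind of degradation the study relates to listeners' perceptual ability.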

  18. A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments

    PubMed Central

    Colburn, H. Steven

    2016-01-01

    Spatially separating speech maskers from target speech often leads to a large intelligibility improvement. Modeling this phenomenon has long been of interest to binaural-hearing researchers for uncovering brain mechanisms and for improving signal-processing algorithms in hearing-assistive devices. Much of the previous binaural modeling work focused on the unmasking enabled by binaural cues at the periphery, and little quantitative modeling has been directed toward the grouping or source-separation benefits of binaural processing. In this article, we propose a binaural model that focuses on grouping, specifically on the selection of time-frequency units that are dominated by signals from the direction of the target. The proposed model uses Equalization-Cancellation (EC) processing with a binary decision rule to estimate a time-frequency binary mask. EC processing is carried out to cancel the target signal and the energy change between the EC input and output is used as a feature that reflects target dominance in each time-frequency unit. The processing in the proposed model requires little computational resources and is straightforward to implement. In combination with the Coherence-based Speech Intelligibility Index, the model is applied to predict the speech intelligibility data measured by Marrone et al. The predicted speech reception threshold matches the pattern of the measured data well, even though the predicted intelligibility improvements relative to the colocated condition are larger than some of the measured data, which may reflect the lack of internal noise in this initial version of the model. PMID:27698261

  19. A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments.

    PubMed

    Mi, Jing; Colburn, H Steven

    2016-10-03

    Spatially separating speech maskers from target speech often leads to a large intelligibility improvement. Modeling this phenomenon has long been of interest to binaural-hearing researchers for uncovering brain mechanisms and for improving signal-processing algorithms in hearing-assistive devices. Much of the previous binaural modeling work focused on the unmasking enabled by binaural cues at the periphery, and little quantitative modeling has been directed toward the grouping or source-separation benefits of binaural processing. In this article, we propose a binaural model that focuses on grouping, specifically on the selection of time-frequency units that are dominated by signals from the direction of the target. The proposed model uses Equalization-Cancellation (EC) processing with a binary decision rule to estimate a time-frequency binary mask. EC processing is carried out to cancel the target signal and the energy change between the EC input and output is used as a feature that reflects target dominance in each time-frequency unit. The processing in the proposed model requires little computational resources and is straightforward to implement. In combination with the Coherence-based Speech Intelligibility Index, the model is applied to predict the speech intelligibility data measured by Marrone et al. The predicted speech reception threshold matches the pattern of the measured data well, even though the predicted intelligibility improvements relative to the colocated condition are larger than some of the measured data, which may reflect the lack of internal noise in this initial version of the model. © The Author(s) 2016.
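    The mask-estimation step described in this abstract can be sketched in a few lines. Assuming a frontal target (so the two ear signals are already equalized for the target direction and plain subtraction cancels it), a time-frequency unit is labeled target-dominated when cancellation removes most of its energy. The frame length and threshold below are illustrative assumptions, not the model's published parameters:

```python
import numpy as np

def ec_binary_mask(left, right, frame_len=256, threshold_db=6.0):
    """Sketch of EC-based binary-mask estimation for a frontal target.
    Subtracting the ear signals cancels the (already equalized) target;
    a time-frequency unit is marked target-dominated (mask = 1) when
    cancellation removes more than threshold_db of its energy."""
    n_frames = len(left) // frame_len
    win = np.hanning(frame_len)

    def stft(x):
        frames = x[:n_frames * frame_len].reshape(n_frames, frame_len) * win
        return np.fft.rfft(frames, axis=1)

    L, R = stft(left), stft(right)
    in_energy = np.abs(L) ** 2 + np.abs(R) ** 2    # energy before cancellation
    out_energy = np.abs(L - R) ** 2                # energy after cancelling the target
    drop_db = 10 * np.log10((in_energy + 1e-12) / (out_energy + 1e-12))
    return (drop_db > threshold_db).astype(int)
```

    A target-only input (identical at the two ears) yields a mask of mostly ones, while decorrelated noise yields mostly zeros, which is the target-dominance feature the model feeds into the intelligibility prediction.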

  20. Effects of sensorineural hearing loss on visually guided attention in a multitalker environment.

    PubMed

    Best, Virginia; Marrone, Nicole; Mason, Christine R; Kidd, Gerald; Shinn-Cunningham, Barbara G

    2009-03-01

    This study asked whether or not listeners with sensorineural hearing loss have an impaired ability to use top-down attention to enhance speech intelligibility in the presence of interfering talkers. Listeners were presented with a target string of spoken digits embedded in a mixture of five spatially separated speech streams. The benefit of providing simple visual cues indicating when and/or where the target would occur was measured in listeners with hearing loss, listeners with normal hearing, and a control group of listeners with normal hearing who were tested at a lower target-to-masker ratio to equate their baseline (no cue) performance with the hearing-loss group. All groups received robust benefits from the visual cues. The magnitude of the spatial-cue benefit, however, was significantly smaller in listeners with hearing loss. Results suggest that reduced utility of selective attention for resolving competition between simultaneous sounds contributes to the communication difficulties experienced by listeners with hearing loss in everyday listening situations.

  1. Single-Sided Deafness: Impact of Cochlear Implantation on Speech Perception in Complex Noise and on Auditory Localization Accuracy.

    PubMed

    Döge, Julia; Baumann, Uwe; Weissgerber, Tobias; Rader, Tobias

    2017-12-01

    To assess auditory localization accuracy and speech reception threshold (SRT) in complex noise conditions in adult patients with acquired single-sided deafness, after intervention with a cochlear implant (CI) in the deaf ear. Nonrandomized, open, prospective patient series. Tertiary referral university hospital. Eleven patients with late-onset single-sided deafness (SSD) and normal hearing in the unaffected ear, who received a CI. All patients were experienced CI users. Unilateral cochlear implantation. Speech perception was tested in a complex multitalker equivalent noise field consisting of multiple sound sources. Speech reception thresholds in noise were determined in aided (with CI) and unaided conditions. Localization accuracy was assessed in complete darkness. Acoustic stimuli were radiated by multiple loudspeakers distributed in the frontal horizontal plane between -60 and +60 degrees. In the aided condition, results show slightly improved speech reception scores compared with the unaided condition in most of the patients. For 8 of the 11 subjects, SRT was improved between 0.37 and 1.70 dB. Three of the 11 subjects showed deteriorations between 1.22 and 3.24 dB SRT. Median localization error decreased significantly by 12.9 degrees compared with the unaided condition. CI in single-sided deafness is an effective treatment to improve the auditory localization accuracy. Speech reception in complex noise conditions is improved to a lesser extent in 73% of the participating CI SSD patients. However, the absence of true binaural interaction effects (summation, squelch) impedes further improvements. The development of speech processing strategies that respect binaural interaction seems to be mandatory to advance speech perception in demanding listening situations in SSD patients.

  2. Availability of binaural cues for pediatric bilateral cochlear implant recipients.

    PubMed

    Sheffield, Sterling W; Haynes, David S; Wanna, George B; Labadie, Robert F; Gifford, René H

    2015-03-01

    Bilateral implant recipients theoretically have access to binaural cues. Research in postlingually deafened adults with cochlear implants (CIs) indicates minimal evidence for true binaural hearing. Congenitally deafened children who experience spatial hearing with bilateral CIs, however, might perceive binaural cues in the CI signal differently. There is limited research examining binaural hearing in children with CIs, and the few published studies are limited by the use of unrealistic speech stimuli and background noise. The purposes of this study were to (1) replicate our previous study of binaural hearing in postlingually deafened adults with AzBio sentences in prelingually deafened children with the pediatric version of the AzBio sentences, and (2) replicate previous studies of binaural hearing in children with CIs using more open-set sentences and more realistic background noise (i.e., multitalker babble). The study was a within-participant, repeated-measures design. The study sample consisted of 14 children with bilateral CIs with at least 25 mo of listening experience. Speech recognition was assessed using sentences presented in multitalker babble at a fixed signal-to-noise ratio. Test conditions included speech at 0° with noise presented at 0° (S0N0), on the side of the first CI (90° or 270°) (S0N1stCI), and on the side of the second CI (S0N2ndCI) as well as speech presented at 0° with noise presented semidiffusely from eight speakers at 45° intervals. Estimates of summation, head shadow, squelch, and spatial release from masking were calculated. Results of test conditions commonly reported in the literature (S0N0, S0N1stCI, S0N2ndCI) are consistent with results from previous research in adults and children with bilateral CIs, showing minimal summation and squelch but typical head shadow and spatial release from masking. However, bilateral benefit over the better CI with speech at 0° was much larger with semidiffuse noise. 
Congenitally deafened children with CIs have similar availability of binaural hearing cues to postlingually deafened adults with CIs within the same experimental design. It is possible that the use of realistic listening environments, such as semidiffuse background noise as in Experiment II, would reveal greater binaural hearing benefit for bilateral CI recipients. Future research is needed to determine whether (1) availability of binaural cues for children correlates with interaural time and level differences, (2) different listening environments are more sensitive to binaural hearing benefits, and (3) differences exist between pediatric bilateral recipients receiving implants in the same or sequential surgeries. American Academy of Audiology.
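    The summation, head-shadow, squelch, and spatial-release estimates mentioned here are conventionally computed as simple difference scores between listening conditions. A sketch of one widely used scheme (the condition keys are hypothetical, and exact formulations vary across studies):

```python
def binaural_benefits(scores):
    """Difference-score estimates of binaural benefits, in percentage
    points, for a bilateral CI listener. `scores` maps (ears, condition)
    to percent correct, e.g. ('both', 'S0N0') or ('ci1', 'S0Nci2').
    These follow one common scheme; definitions differ between studies."""
    return {
        # Benefit of adding the second ear, speech and noise both frontal
        'summation': scores[('both', 'S0N0')] - scores[('ci1', 'S0N0')],
        # Same ear alone, noise moved from its own side to the far side
        'head_shadow': scores[('ci1', 'S0Nci2')] - scores[('ci1', 'S0Nci1')],
        # Benefit of adding the ear on the noise side
        'squelch': scores[('both', 'S0Nci2')] - scores[('ci1', 'S0Nci2')],
        # Benefit of moving the noise away from the target, both ears on
        'spatial_release': scores[('both', 'S0Nci2')] - scores[('both', 'S0N0')],
    }
```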

  3. A virtual speaker in noisy classroom conditions: supporting or disrupting children's listening comprehension?

    PubMed

    Nirme, Jens; Haake, Magnus; Lyberg Åhlander, Viveka; Brännström, Jonas; Sahlén, Birgitta

    2018-04-05

    Seeing a speaker's face facilitates speech recognition, particularly under noisy conditions. Evidence for how it might affect comprehension of the content of the speech is more sparse. We investigated how children's listening comprehension is affected by multi-talker babble noise, with or without presentation of a digitally animated virtual speaker, and whether successful comprehension is related to performance on a test of executive functioning. We performed a mixed-design experiment with 55 (34 female) participants (8- to 9-year-olds), recruited from Swedish elementary schools. The children were presented with four different narratives, each in one of four conditions: audio-only presentation in a quiet setting, audio-only presentation in noisy setting, audio-visual presentation in a quiet setting, and audio-visual presentation in a noisy setting. After each narrative, the children answered questions on the content and rated their perceived listening effort. Finally, they performed a test of executive functioning. We found significantly fewer correct answers to explicit content questions after listening in noise. This negative effect was only mitigated to a marginally significant degree by audio-visual presentation. Strong executive function only predicted more correct answers in quiet settings. Altogether, our results are inconclusive regarding how seeing a virtual speaker affects listening comprehension. We discuss how methodological adjustments, including modifications to our virtual speaker, can be used to discriminate between possible explanations to our results and contribute to understanding the listening conditions children face in a typical classroom.

  4. Talker- and language-specific effects on speech intelligibility in noise assessed with bilingual talkers: Which language is more robust against noise and reverberation?

    PubMed

    Hochmuth, Sabine; Jürgens, Tim; Brand, Thomas; Kollmeier, Birger

    2015-01-01

    Investigate talker- and language-specific aspects of speech intelligibility in noise and reverberation using highly comparable matrix sentence tests across languages. Matrix sentences spoken by German/Russian and German/Spanish bilingual talkers were recorded. These sentences were used to measure speech reception thresholds (SRTs) with native listeners in the respective languages in different listening conditions (stationary and fluctuating noise, multi-talker babble, reverberated speech-in-noise condition). Four German/Russian and four German/Spanish bilingual talkers; 20 native German-speaking, 10 native Russian-speaking, and 10 native Spanish-speaking listeners. Across-talker SRT differences of up to 6 dB were found for both groups of bilinguals. SRTs of German/Russian bilingual talkers were the same in both languages. SRTs of German/Spanish bilingual talkers were higher when they talked in Spanish than when they talked in German. The benefit from listening in the gaps was similar across all languages. The detrimental effect of reverberation was larger for Spanish than for German and Russian. Within the limitations set by the number and slight accentedness of talkers and other possible confounding factors, talker- and test-condition-dependent differences were isolated from the language effect: Russian and German exhibited similar intelligibility in noise and reverberation, whereas Spanish was more impaired in these situations.
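    SRTs in matrix-sentence studies like this one are measured adaptively, converging on the SNR that yields 50% intelligibility. A minimal 1-up/1-down sketch of such a track (illustrative parameters; actual matrix-test procedures adapt the step size based on the number of words correct per sentence):

```python
import statistics

def track_srt(respond, snr_start=0.0, step_db=2.0, n_trials=30):
    """Minimal 1-up/1-down adaptive track converging on the 50% point.
    `respond(snr)` returns True if the trial at that SNR was correct.
    The SRT estimate is the mean SNR over the second half of the track."""
    snr, history = snr_start, []
    for _ in range(n_trials):
        correct = respond(snr)
        history.append(snr)
        snr += -step_db if correct else step_db   # harder after a success
    return statistics.mean(history[n_trials // 2:])
```

    With a simulated listener whose threshold sits at -5 dB SNR, the track oscillates around that value, and averaging the later reversals (here, simply the later trials) recovers it.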

  5. Temporal and speech processing skills in normal hearing individuals exposed to occupational noise.

    PubMed

    Kumar, U Ajith; Ameenudin, Syed; Sangamanatha, A V

    2012-01-01

    Prolonged exposure to high levels of occupational noise can damage hair cells in the cochlea and result in permanent noise-induced cochlear hearing loss. The consequences of cochlear hearing loss for speech perception and psychophysical abilities have been well documented. The primary goal of this research was to explore temporal processing and speech perception skills in individuals who are exposed to occupational noise of more than 80 dBA but have not yet incurred clinically significant threshold shifts. The contribution of temporal processing skills to speech perception in adverse listening situations was also evaluated. A total of 118 participants took part in this research: three groups of train drivers aged 30-40 (n = 13), 41-50 (n = 9), and 51-60 (n = 6) years, and their non-noise-exposed counterparts (n = 30 in each age group). Participants in all groups, including the train drivers, had hearing sensitivity within 25 dB HL at the octave frequencies between 250 Hz and 8 kHz. Temporal processing was evaluated using gap detection, modulation detection, and duration pattern tests. Speech recognition was tested in the presence of multi-talker babble at -5 dB SNR. Differences between experimental and control groups were analyzed using ANOVA and independent-sample t-tests. Results showed a trend of reduced temporal processing skills in individuals with noise exposure. These deficits were observed despite normal peripheral hearing sensitivity. Speech recognition scores in the presence of noise were also significantly poorer in the noise-exposed group. Furthermore, poor temporal processing skills partially accounted for the speech recognition difficulties exhibited by the noise-exposed individuals. These results suggest that noise can cause significant distortions in the processing of suprathreshold temporal cues, which may add to difficulties in hearing in adverse listening conditions.

  6. The effect of talker and intonation variability on speech perception in noise in children with dyslexia

    PubMed Central

    Hazan, Valerie; Messaoud-Galusi, Souhila; Rosen, Stuart

    2013-01-01

    Purpose To determine whether children with dyslexia (DYS) are more affected than age-matched average readers (AR) by talker and intonation variability when perceiving speech in noise. Method Thirty-four DYS and 25 AR children were tested on their perception of consonants in naturally-produced consonant-vowel (CV) tokens in multi-talker babble. Twelve CVs were presented for identification in four conditions varying in the degree of talker and intonation variability. Consonant place (/bi/-/di/) and voicing (/bi/-/pi/) discrimination was investigated with the same conditions. Results DYS children made slightly more identification errors than AR children but only for conditions with variable intonation. Errors were more frequent for a subset of consonants, generally weakly-encoded for AR children, for tokens with intonation patterns (steady and rise-fall) that occur infrequently in connected discourse. In discrimination tasks, which have a greater memory and cognitive load, DYS children scored lower than AR children across all conditions. Conclusions Unusual intonation patterns had a disproportionate (but small) effect on consonant intelligibility in noise for DYS children but adding talker variability did not. DYS children do not appear to have a general problem in perceiving speech in degraded conditions, which makes it unlikely that they lack robust phonological representations. PMID:22761322

  7. The effect of talker and intonation variability on speech perception in noise in children with dyslexia.

    PubMed

    Hazan, Valerie; Messaoud-Galusi, Souhila; Rosen, Stuart

    2013-02-01

    In this study, the authors aimed to determine whether children with dyslexia (hereafter referred to as "DYS children") are more affected than children with average reading ability (hereafter referred to as "AR children") by talker and intonation variability when perceiving speech in noise. Thirty-four DYS and 25 AR children were tested on their perception of consonants in naturally produced CV tokens in multitalker babble. Twelve CVs were presented for identification in four conditions varying in the degree of talker and intonation variability. Consonant place (/bi/-/di/) and voicing (/bi/-/pi/) discrimination were investigated with the same conditions. DYS children made slightly more identification errors than AR children but only for conditions with variable intonation. Errors were more frequent for a subset of consonants, generally weakly encoded for AR children, for tokens with intonation patterns (steady and rise-fall) that occur infrequently in connected discourse. In discrimination tasks, which have a greater memory and cognitive load, DYS children scored lower than AR children across all conditions. Unusual intonation patterns had a disproportionate (but small) effect on consonant intelligibility in noise for DYS children, but adding talker variability did not. DYS children do not appear to have a general problem in perceiving speech in degraded conditions, which makes it unlikely that they lack robust phonological representations.

  8. An evaluation of the performance of two binaural beamformers in complex and dynamic multitalker environments.

    PubMed

    Best, Virginia; Mejia, Jorge; Freeston, Katrina; van Hoesel, Richard J; Dillon, Harvey

    2015-01-01

    Binaural beamformers are super-directional hearing aids created by combining microphone outputs from each side of the head. While they offer substantial improvements in SNR over conventional directional hearing aids, the benefits (and possible limitations) of these devices in realistic, complex listening situations have not yet been fully explored. In this study we evaluated the performance of two experimental binaural beamformers. Testing was carried out using a horizontal loudspeaker array. Background noise was created using recorded conversations. Performance measures included speech intelligibility, localization in noise, acceptable noise level, subjective ratings, and a novel dynamic speech intelligibility measure. Participants were 27 listeners with bilateral hearing loss, fitted with BTE prototypes that could be switched between conventional directional or binaural beamformer microphone modes. Relative to the conventional directional microphones, both binaural beamformer modes were generally superior for tasks involving fixed frontal targets, but not always for situations involving dynamic target locations. Binaural beamformers show promise for enhancing listening in complex situations when the location of the source of interest is predictable.

  9. An evaluation of the performance of two binaural beamformers in complex and dynamic multitalker environments

    PubMed Central

    Best, Virginia; Mejia, Jorge; Freeston, Katrina; van Hoesel, Richard J.; Dillon, Harvey

    2016-01-01

    Objective Binaural beamformers are super-directional hearing aids created by combining microphone outputs from each side of the head. While they offer substantial improvements in SNR over conventional directional hearing aids, the benefits (and possible limitations) of these devices in realistic, complex listening situations have not yet been fully explored. In this study we evaluated the performance of two experimental binaural beamformers. Design Testing was carried out using a horizontal loudspeaker array. Background noise was created using recorded conversations. Performance measures included speech intelligibility, localisation in noise, acceptable noise level, subjective ratings, and a novel dynamic speech intelligibility measure. Study sample Participants were 27 listeners with bilateral hearing loss, fitted with BTE prototypes that could be switched between conventional directional or binaural beamformer microphone modes. Results Relative to the conventional directional microphones, both binaural beamformer modes were generally superior for tasks involving fixed frontal targets, but not always for situations involving dynamic target locations. Conclusions Binaural beamformers show promise for enhancing listening in complex situations when the location of the source of interest is predictable. PMID:26140298

  10. Exploring Use of the Coordinate Response Measure in a Multitalker Babble Paradigm

    ERIC Educational Resources Information Center

    Humes, Larry E.; Kidd, Gary R.; Fogerty, Daniel

    2017-01-01

    Purpose: Three experiments examined the use of competing coordinate response measure (CRM) sentences as a multitalker babble. Method: In Experiment I, young adults with normal hearing listened to a CRM target sentence in the presence of 2, 4, or 6 competing CRM sentences with synchronous or asynchronous onsets. In Experiment II, the condition with…

  11. The cingulo-opercular network provides word-recognition benefit.

    PubMed

    Vaden, Kenneth I; Kuchinsky, Stefanie E; Cute, Stephanie L; Ahlstrom, Jayne B; Dubno, Judy R; Eckert, Mark A

    2013-11-27

    Recognizing speech in difficult listening conditions requires considerable focus of attention that is often demonstrated by elevated activity in putative attention systems, including the cingulo-opercular network. We tested the prediction that elevated cingulo-opercular activity provides word-recognition benefit on a subsequent trial. Eighteen healthy, normal-hearing adults (10 females; aged 20-38 years) performed word recognition (120 trials) in multi-talker babble at +3 and +10 dB signal-to-noise ratios during a sparse sampling functional magnetic resonance imaging (fMRI) experiment. Blood oxygen level-dependent (BOLD) contrast was elevated in the anterior cingulate cortex, anterior insula, and frontal operculum in response to poorer speech intelligibility and response errors. These brain regions exhibited significantly greater correlated activity during word recognition compared with rest, supporting the premise that word-recognition demands increased the coherence of cingulo-opercular network activity. Consistent with an adaptive control network explanation, general linear mixed model analyses demonstrated that increased magnitude and extent of cingulo-opercular network activity was significantly associated with correct word recognition on subsequent trials. These results indicate that elevated cingulo-opercular network activity is not simply a reflection of poor performance or error but also supports word recognition in difficult listening conditions.

  12. How Age, Linguistic Status, and the Nature of the Auditory Scene Alter the Manner in Which Listening Comprehension Is Achieved in Multitalker Conversations.

    PubMed

    Avivi-Reich, Meital; Jakubczyk, Agnes; Daneman, Meredyth; Schneider, Bruce A

    2015-10-01

    We investigated how age and linguistic status affected listeners' ability to follow and comprehend 3-talker conversations, and the extent to which individual differences in language proficiency predict speech comprehension under difficult listening conditions. Younger and older L1s as well as young L2s listened to 3-talker conversations, with or without spatial separation between talkers, in either quiet or against moderate or high 12-talker babble background, and were asked to answer questions regarding their contents. After compensating for individual differences in speech recognition, no significant differences in conversation comprehension were found among the groups. As expected, conversation comprehension decreased as babble level increased. Individual differences in reading comprehension skill contributed positively to performance in younger EL1s and in young EL2s to a lesser degree but not in older EL1s. Vocabulary knowledge was significantly and positively related to performance only at the intermediate babble level. The results indicate that the manner in which spoken language comprehension is achieved is modulated by the listeners' age and linguistic status.

  13. Exploring Use of the Coordinate Response Measure in a Multitalker Babble Paradigm

    PubMed Central

    Kidd, Gary R.; Fogerty, Daniel

    2017-01-01

    Purpose Three experiments examined the use of competing coordinate response measure (CRM) sentences as a multitalker babble. Method In Experiment I, young adults with normal hearing listened to a CRM target sentence in the presence of 2, 4, or 6 competing CRM sentences with synchronous or asynchronous onsets. In Experiment II, the condition with 6 competing sentences was explored further. Three stimulus conditions (6 talkers saying same sentence, 1 talker producing 6 different sentences, and 6 talkers each saying a different sentence) were evaluated with different methods of presentation. Experiment III examined the performance of older adults with hearing impairment in a subset of conditions from Experiment II. Results In Experiment I, performance declined with increasing numbers of talkers and improved with asynchronous sentence onsets. Experiment II identified conditions under which an increase in the number of talkers led to better performance. In Experiment III, the relative effects of the number of talkers, messages, and onset asynchrony were the same for young and older listeners. Conclusions Multitalker babble composed of CRM sentences has masking properties similar to other types of multitalker babble. However, when the number of different talkers and messages are varied independently, performance is best with more talkers and fewer messages. PMID:28249093

  14. Cortical Representations of Speech in a Multitalker Auditory Scene.

    PubMed

    Puvvada, Krishna C; Simon, Jonathan Z

    2017-09-20

    The ability to parse a complex auditory scene into perceptual objects is facilitated by a hierarchical auditory system. Successive stages in the hierarchy transform an auditory scene of multiple overlapping sources, from peripheral tonotopically based representations in the auditory nerve, into perceptually distinct auditory-object-based representations in the auditory cortex. Here, using magnetoencephalography recordings from men and women, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in distinct hierarchical stages of the auditory cortex. Using systems-theoretic methods of stimulus reconstruction, we show that the primary-like areas in the auditory cortex contain dominantly spectrotemporal-based representations of the entire auditory scene. Here, both attended and ignored speech streams are represented with almost equal fidelity, and a global representation of the full auditory scene with all its streams is a better candidate neural representation than that of individual streams being represented separately. We also show that higher-order auditory cortical areas, by contrast, represent the attended stream separately and with significantly higher fidelity than unattended streams. Furthermore, the unattended background streams are more faithfully represented as a single unsegregated background object rather than as separated objects. Together, these findings demonstrate the progression of the representations and processing of a complex acoustic scene up through the hierarchy of the human auditory cortex. SIGNIFICANCE STATEMENT Using magnetoencephalography recordings from human listeners in a simulated cocktail party environment, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in separate hierarchical stages of the auditory cortex. 
We show that the primary-like areas in the auditory cortex use a dominantly spectrotemporal-based representation of the entire auditory scene, with both attended and unattended speech streams represented with almost equal fidelity. We also show that higher-order auditory cortical areas, by contrast, represent an attended speech stream separately from, and with significantly higher fidelity than, unattended speech streams. Furthermore, the unattended background streams are represented as a single undivided background object rather than as distinct background objects. Copyright © 2017 the authors.

  15. Measuring effectiveness of semantic cues in degraded English sentences in non-native listeners.

    PubMed

    Shi, Lu-Feng

    2014-01-01

    This study employed Boothroyd and Nittrouer's k (1988) to directly quantify effectiveness in native versus non-native listeners' use of semantic cues. Listeners were presented speech-perception-in-noise sentences processed at three levels of concurrent multi-talker babble and reverberation. For each condition, 50 sentences with multiple semantic cues and 50 with minimum semantic cues were randomly presented. Listeners verbally reported and wrote down the target words. The metric, k, was derived from percent-correct scores for sentences with and without semantics. Ten native and 33 non-native listeners participated. The presence of semantics increased recognition benefit by over 250% for natives, but access to semantics remained limited for non-native listeners (90-135%). The k was comparable across conditions for native listeners, but level-dependent for non-natives. The k for non-natives was significantly different from 1 in all conditions, suggesting semantic cues, though reduced in importance in difficult conditions, were helpful for non-natives. Non-natives as a group were not as effective in using semantics to facilitate English sentence recognition as natives. Poor listening conditions were particularly adverse to the use of semantics in non-natives, who may rely on clear acoustic-phonetic cues before benefitting from semantic cues when recognizing connected speech.
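    Boothroyd and Nittrouer's k relates recognition probability with and without context via p_context = 1 - (1 - p_no_context)^k, so k = 1 indicates no contextual benefit and larger k indicates greater benefit. A minimal implementation of that relationship (my reading of the metric, not the study's own code):

```python
import math

def k_factor(p_with_context, p_without_context):
    """Boothroyd & Nittrouer's k, solving
    1 - p_with = (1 - p_without)**k for k.
    k = 1 means context gives no benefit; larger k means more benefit.
    Assumes both probabilities are strictly between 0 and 1."""
    return math.log(1.0 - p_with_context) / math.log(1.0 - p_without_context)
```

    For example, a listener scoring 50% without semantic cues and 75% with them has k = 2, while identical scores in both conditions give k = 1.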

  16. The influence of informational masking in reverberant, multi-talker environments.

    PubMed

    Westermann, Adam; Buchholz, Jörg M

    2015-08-01

    The relevance of informational masking (IM) in real-world listening is not well understood. In the literature, IM effects of up to 10 dB on measured speech reception thresholds (SRTs) have been reported. However, these experiments typically employed simplified spatial configurations and speech corpora that magnified confusions. In this study, SRTs were measured with normal hearing subjects in a simulated cafeteria environment. The environment was reproduced by a 41-channel 3D-loudspeaker array. The target talker was 2 m in front of the listener and masking talkers were either spread throughout the room or colocated with the target. Three types of maskers were realized: one with the same talker as the target (maximum IM), one with talkers different from the target, and one with unintelligible, noise-vocoded talkers (minimal IM). Overall, SRTs improved for the spatially distributed conditions compared to the colocated conditions. Within the spatially distributed conditions, there was no significant difference between thresholds with the different- and vocoded-talker maskers. Conditions with the same-talker masker were the only conditions with substantially higher thresholds, especially in the colocated conditions. These results suggest that IM related to target-masker confusions, at least for normal-hearing listeners, is of low relevance in real-life listening.

  17. How age and linguistic competence alter the interplay of perceptual and cognitive factors when listening to conversations in a noisy environment

    PubMed Central

    Avivi-Reich, Meital; Daneman, Meredyth; Schneider, Bruce A.

    2013-01-01

    Multi-talker conversations challenge the perceptual and cognitive capabilities of older adults and those listening in their second language (L2). In older adults these difficulties could reflect declines in the auditory, cognitive, or linguistic processes supporting speech comprehension. The tendency of L2 listeners to invoke some of the semantic and syntactic processes from their first language (L1) may interfere with speech comprehension in L2. These challenges might also force them to reorganize the ways in which they perceive and process speech, thereby altering the balance between the contributions of bottom-up vs. top-down processes to speech comprehension. Younger and older L1s as well as young L2s listened to conversations played against a babble background, with or without spatial separation between the talkers and masker, when the spatial positions of the stimuli were specified either by loudspeaker placements (real location), or through use of the precedence effect (virtual location). After listening to a conversation, the participants were asked to answer questions regarding its content. Individual hearing differences were compensated for by creating the same degree of difficulty in identifying individual words in babble. Once compensation was applied, the number of questions correctly answered increased when a real or virtual spatial separation was introduced between babble and talkers. There was no evidence that performance differed between real and virtual locations. The contribution of vocabulary knowledge to dialog comprehension was found to be larger in the virtual conditions than in the real whereas the contribution of reading comprehension skill did not depend on the listening environment but rather differed as a function of age and language proficiency. The results indicate that the acoustic scene and the cognitive and linguistic competencies of listeners modulate how and when top-down resources are engaged in aid of speech comprehension. 
PMID:24578684

  18. How age and linguistic competence alter the interplay of perceptual and cognitive factors when listening to conversations in a noisy environment.

    PubMed

    Avivi-Reich, Meital; Daneman, Meredyth; Schneider, Bruce A

    2014-01-01

    Multi-talker conversations challenge the perceptual and cognitive capabilities of older adults and those listening in their second language (L2). In older adults these difficulties could reflect declines in the auditory, cognitive, or linguistic processes supporting speech comprehension. The tendency of L2 listeners to invoke some of the semantic and syntactic processes from their first language (L1) may interfere with speech comprehension in L2. These challenges might also force them to reorganize the ways in which they perceive and process speech, thereby altering the balance between the contributions of bottom-up vs. top-down processes to speech comprehension. Younger and older L1s as well as young L2s listened to conversations played against a babble background, with or without spatial separation between the talkers and masker, when the spatial positions of the stimuli were specified either by loudspeaker placements (real location), or through use of the precedence effect (virtual location). After listening to a conversation, the participants were asked to answer questions regarding its content. Individual hearing differences were compensated for by creating the same degree of difficulty in identifying individual words in babble. Once compensation was applied, the number of questions correctly answered increased when a real or virtual spatial separation was introduced between babble and talkers. There was no evidence that performance differed between real and virtual locations. The contribution of vocabulary knowledge to dialog comprehension was found to be larger in the virtual conditions than in the real whereas the contribution of reading comprehension skill did not depend on the listening environment but rather differed as a function of age and language proficiency. The results indicate that the acoustic scene and the cognitive and linguistic competencies of listeners modulate how and when top-down resources are engaged in aid of speech comprehension.

  19. Comparing Binaural Pre-processing Strategies III

    PubMed Central

    Warzybok, Anna; Ernst, Stephan M. A.

    2015-01-01

A comprehensive evaluation of eight signal pre-processing strategies, including directional microphones, coherence filters, single-channel noise reduction, binaural beamformers, and their combinations, was undertaken with normal-hearing (NH) and hearing-impaired (HI) listeners. Speech reception thresholds (SRTs) were measured in three noise scenarios (multitalker babble, cafeteria noise, and single competing talker). Predictions of three common instrumental measures were compared with the general perceptual benefit caused by the algorithms. The individual SRTs measured without pre-processing and the individual benefits were objectively estimated using the binaural speech intelligibility model. Ten listeners with NH and 12 HI listeners participated. The participants varied in age and pure-tone threshold levels. Although HI listeners required a better signal-to-noise ratio to obtain 50% intelligibility than listeners with NH, no differences in SRT benefit from the different algorithms were found between the two groups. With the exception of single-channel noise reduction, all algorithms showed an improvement in SRT of between 2.1 dB (in cafeteria noise) and 4.8 dB (in single competing talker condition). Model predictions with the binaural speech intelligibility model explained 83% of the measured variance of the individual SRTs in the no pre-processing condition. Regarding the benefit from the algorithms, the instrumental measures were not able to predict the perceptual data in all tested noise conditions. The comparable benefit observed for both groups suggests a possible application of noise reduction schemes for listeners with different hearing status. Although the model can predict the individual SRTs without pre-processing, further development is necessary to predict the benefits obtained from the algorithms at an individual level. PMID:26721922

  20. Effect of motion on speech recognition.

    PubMed

    Davis, Timothy J; Grantham, D Wesley; Gifford, René H

    2016-07-01

The benefit of spatial separation for talkers in a multi-talker environment is well documented. However, few studies have examined the effect of talker motion on speech recognition. In the current study, we evaluated the effects of (1) motion of the target or distracters, (2) a priori information about the target and distracter spatial configurations, and (3) target and distracter location. In total, seventeen young adults with normal hearing were tested in a large anechoic chamber in two experiments. In Experiment 1, seven stimulus conditions were tested using the Coordinate Response Measure (Bolia et al., 2000) speech corpus, in which subjects were required to report the key words in a target sentence presented simultaneously with two distracter sentences. As in previous studies, there was a significant improvement in key word identification for conditions in which the target and distracters were spatially separated as compared to the co-located conditions. In addition, (1) motion of either talker or distracter resulted in improved performance compared to stationary presentation (talker motion yielded significantly better performance than distracter motion); (2) a priori information regarding stimulus configuration was not beneficial; and (3) performance was significantly better with key words at 0° azimuth as compared to -60° (on the listener's left). Experiment 2 included two additional conditions designed to assess whether the benefit of motion observed in Experiment 1 was due to the motion itself or to the fact that the motion conditions introduced small spatial separations in the target and distracter key words. Results showed that small spatial separations (on the order of 5-8°) resulted in improved performance (relative to co-located key words) whether the sentences were moving or stationary.
These results suggest that in the presence of distracting messages, motion of either target or distracters and/or small spatial separation of the key words may be beneficial for sound source segregation and thus for improved speech recognition. Copyright © 2016 Elsevier B.V. All rights reserved.

  1. Clinical experience with the words-in-noise test on 3430 veterans: comparisons with pure-tone thresholds and word recognition in quiet.

    PubMed

    Wilson, Richard H

    2011-01-01

    Since the 1940s, measures of pure-tone sensitivity and speech recognition in quiet have been vital components of the audiologic evaluation. Although early investigators urged that speech recognition in noise also should be a component of the audiologic evaluation, only recently has this suggestion started to become a reality. This report focuses on the Words-in-Noise (WIN) Test, which evaluates word recognition in multitalker babble at seven signal-to-noise ratios and uses the 50% correct point (in dB SNR) calculated with the Spearman-Kärber equation as the primary metric. The WIN was developed and validated in a series of 12 laboratory studies. The current study examined the effectiveness of the WIN materials for measuring the word-recognition performance of patients in a typical clinical setting. To examine the relations among three audiometric measures including pure-tone thresholds, word-recognition performances in quiet, and word-recognition performances in multitalker babble for veterans seeking remediation for their hearing loss. Retrospective, descriptive. The participants were 3430 veterans who for the most part were evaluated consecutively in the Audiology Clinic at the VA Medical Center, Mountain Home, Tennessee. The mean age was 62.3 yr (SD = 12.8 yr). The data were collected in the course of a 60 min routine audiologic evaluation. A history, otoscopy, and aural-acoustic immittance measures also were included in the clinic protocol but were not evaluated in this report. Overall, the 1000-8000 Hz thresholds were significantly lower (better) in the right ear (RE) than in the left ear (LE). There was a direct relation between age and the pure-tone thresholds, with greater change across age in the high frequencies than in the low frequencies. Notched audiograms at 4000 Hz were observed in at least one ear in 41% of the participants with more unilateral than bilateral notches. 
Normal pure-tone thresholds (≤20 dB HL) were obtained from 6% of the participants. Maximum performance on the Northwestern University Auditory Test No. 6 (NU-6) in quiet was ≥90% correct by 50% of the participants, with an additional 20% performing at ≥80% correct; the RE performed 1-3% better than the LE. Of the 3291 who completed the WIN on both ears, only 7% exhibited normal performance (50% correct point of ≤6 dB SNR). Overall, WIN performance was significantly better in the RE (mean = 13.3 dB SNR) than in the LE (mean = 13.8 dB SNR). Recognition performance on both the NU-6 and the WIN decreased as a function of both pure-tone hearing loss and age. There was a stronger relation between the high-frequency pure-tone average (1000, 2000, and 4000 Hz) and the WIN than between the pure-tone average (500, 1000, and 2000 Hz) and the WIN. The results on the WIN from both the previous laboratory studies and the current clinical study indicate that the WIN is an appropriate clinic instrument to assess word-recognition performance in background noise. Recognition performance on a speech-in-quiet task does not predict performance on a speech-in-noise task, as the two tasks reflect different domains of auditory function. Experience with the WIN indicates that word-in-noise tasks should be considered the "stress test" for auditory function. American Academy of Audiology.
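The Spearman-Kärber scoring described above reduces to a one-line calculation. The sketch below assumes the published single-list WIN layout (seven SNRs descending from 24 to 0 dB in 4-dB steps, five words per SNR); the function name and defaults are illustrative, not taken from the test materials themselves.

```python
def spearman_karber_snr50(correct_per_snr, snr_max=24.0, step=4.0, words_per_snr=5):
    """50%-correct point (in dB SNR) from the number of words repeated
    correctly at each of several equally spaced SNRs (Spearman-Karber
    estimate). Assumes performance is at floor below the lowest SNR
    presented and at ceiling above the highest."""
    total_correct = sum(correct_per_snr)
    return snr_max + step / 2.0 - step * total_correct / words_per_snr

# A listener sliding from perfect to chance across the seven levels:
snr50 = spearman_karber_snr50([5, 5, 4, 3, 2, 1, 0])  # 10.0 dB SNR
```

A lower value indicates better word recognition in babble; under these assumed list parameters, the 6 dB SNR cutoff for normal performance cited in the abstract corresponds to about 25 of 35 words correct.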

  2. Raspberry, not a car: context predictability and a phonological advantage in early and late learners’ processing of speech in noise

    PubMed Central

    Gor, Kira

    2014-01-01

    Second language learners perform worse than native speakers under adverse listening conditions, such as speech in noise (SPIN). No data are available on heritage language speakers’ (early naturalistic interrupted learners’) ability to perceive SPIN. The current study fills this gap and investigates the perception of Russian speech in multi-talker babble noise by the matched groups of high- and low-proficiency heritage speakers (HSs) and late second language learners of Russian who were native speakers of English. The study includes a control group of Russian native speakers. It manipulates the noise level (high and low), and context cloze probability (high and low). The results of the SPIN task are compared to the tasks testing the control of phonology, AXB discrimination and picture-word discrimination, and lexical knowledge, a word translation task, in the same participants. The increased phonological sensitivity of HSs interacted with their ability to rely on top–down processing in sentence integration, use contextual cues, and build expectancies in the high-noise/high-context condition in a bootstrapping fashion. HSs outperformed oral proficiency-matched late second language learners on SPIN task and two tests of phonological sensitivity. The outcomes of the SPIN experiment support both the early naturalistic advantage and the role of proficiency in HSs. HSs’ ability to take advantage of the high-predictability context in the high-noise condition was mitigated by their level of proficiency. Only high-proficiency HSs, but not any other non-native group, took advantage of the high-predictability context that became available with better phonological processing skills in high-noise. The study thus confirms high-proficiency (but not low-proficiency) HSs’ nativelike ability to combine bottom–up and top–down cues in processing SPIN. PMID:25566130

  3. Simultaneous Communication Supports Learning in Noise by Cochlear Implant Users

    PubMed Central

    Blom, Helen C.; Marschark, Marc; Machmer, Elizabeth

    2017-01-01

Objectives: This study sought to evaluate the potential of using spoken language and signing together (simultaneous communication, SimCom, sign-supported speech) as a means of improving speech recognition, comprehension, and learning by cochlear implant users in noisy contexts. Methods: Forty-eight college students who were active cochlear implant users watched videos of three short presentations, the text versions of which were standardized at the 8th-grade reading level. One passage was presented in spoken language only, one was presented in spoken language with multi-talker babble background noise, and one was presented via simultaneous communication with the same background noise. Following each passage, participants responded to 10 (standardized) open-ended questions designed to assess comprehension. Indicators of participants’ spoken language and sign language skills were obtained via self-reports and objective assessments. Results: When spoken materials were accompanied by signs, scores were significantly higher than when materials were spoken in noise without signs. Participants’ receptive spoken language skills significantly predicted scores in all three conditions; neither their receptive sign skills nor age of implantation predicted performance. Discussion: Students who are cochlear implant users typically rely solely on spoken language in the classroom. The present results, however, suggest that there are potential benefits of simultaneous communication for such learners in noisy settings. For those cochlear implant users who know sign language, the redundancy of speech and signs potentially can offset the reduced fidelity of spoken language in noise. Conclusion: Accompanying spoken language with signs can benefit learners who are cochlear implant users in noisy situations such as classroom settings. Factors associated with such benefits, such as receptive skills in signed and spoken modalities, classroom acoustics, and material difficulty need to be empirically examined. PMID:28010675

  4. EEG activity evoked in preparation for multi-talker listening by adults and children.

    PubMed

    Holmes, Emma; Kitterick, Padraig T; Summerfield, A Quentin

    2016-06-01

    Selective attention is critical for successful speech perception because speech is often encountered in the presence of other sounds, including the voices of competing talkers. Faced with the need to attend selectively, listeners perceive speech more accurately when they know characteristics of upcoming talkers before they begin to speak. However, the neural processes that underlie the preparation of selective attention for voices are not fully understood. The current experiments used electroencephalography (EEG) to investigate the time course of brain activity during preparation for an upcoming talker in young adults aged 18-27 years with normal hearing (Experiments 1 and 2) and in typically-developing children aged 7-13 years (Experiment 3). Participants reported key words spoken by a target talker when an opposite-gender distractor talker spoke simultaneously. The two talkers were presented from different spatial locations (±30° azimuth). Before the talkers began to speak, a visual cue indicated either the location (left/right) or the gender (male/female) of the target talker. Adults evoked preparatory EEG activity that started shortly after (<50 ms) the visual cue was presented and was sustained until the talkers began to speak. The location cue evoked similar preparatory activity in Experiments 1 and 2 with different samples of participants. The gender cue did not evoke preparatory activity when it predicted gender only (Experiment 1) but did evoke preparatory activity when it predicted the identity of a specific talker with greater certainty (Experiment 2). Location cues evoked significant preparatory EEG activity in children but gender cues did not. The results provide converging evidence that listeners evoke consistent preparatory brain activity for selecting a talker by their location (regardless of their gender or identity), but not by their gender alone. Copyright © 2016 Elsevier B.V. All rights reserved.

  5. Speech Perception in Noise in Normally Hearing Children: Does Binaural Frequency Modulated Fitting Provide More Benefit than Monaural Frequency Modulated Fitting?

    PubMed

    Mukari, Siti Zamratol-Mai Sarah; Umat, Cila; Razak, Ummu Athiyah Abdul

    2011-07-01

The aim of the present study was to compare the benefit of monaural versus binaural ear-level frequency modulated (FM) fitting on speech perception in noise in children with normal hearing. Reception threshold for sentences (RTS) was measured in no-FM, monaural FM, and binaural FM conditions in 22 normally developing children with bilateral normal hearing, aged 8 to 9 years. Data were gathered using the Pediatric Malay Hearing in Noise Test (P-MyHINT) with speech presented from front and multi-talker babble presented from 90°, 180°, 270° azimuths in a sound treated booth. The results revealed that the use of either monaural or binaural ear level FM receivers provided significantly better mean RTSs than the no-FM condition (P<0.001). However, binaural FM did not produce a significantly greater benefit in mean RTS than monaural fitting. The benefit of binaural over monaural FM varies across individuals; while binaural fitting provided better RTSs in about 50% of study subjects, there were those in whom binaural fitting resulted in either deterioration or no additional improvement compared to monaural FM fitting. The present study suggests that the use of monaural ear-level FM receivers in children with normal hearing might provide benefit similar to binaural use. Individual variation in binaural FM benefit over monaural FM suggests that the decision to employ monaural or binaural fitting should be individualized. It should be noted, however, that the current study recruited typically developing children with normal hearing. Future studies involving normal-hearing children at high risk of difficulty listening in noise are indicated to see whether similar findings are obtained.
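An RTS like the one above is typically obtained with a simple adaptive (one-up/one-down) track of the SNR toward 50% sentence intelligibility. The sketch below uses a deterministic simulated listener; the step size, trial count, and scoring rule are illustrative assumptions, not the P-MyHINT procedure.

```python
def measure_rts(respond, start_snr=0.0, step=2.0, n_trials=20):
    """One-up/one-down adaptive track for the reception threshold for
    sentences (RTS): lower the SNR after each correct response, raise it
    after each incorrect one, and average the levels visited after the
    first reversal. `respond(snr)` returns True if the listener repeats
    the sentence correctly at that SNR."""
    snr = start_snr
    levels = []
    last_correct = None
    reversed_yet = False
    for _ in range(n_trials):
        correct = respond(snr)
        if last_correct is not None and correct != last_correct:
            reversed_yet = True
        if reversed_yet:
            levels.append(snr)
        snr += -step if correct else step
        last_correct = correct
    return sum(levels) / len(levels)

# Deterministic simulated listener whose true threshold is -3 dB SNR:
rts = measure_rts(lambda snr: snr >= -3.0)  # converges to -3.0
```

With a deterministic listener the track oscillates symmetrically around the true threshold, so the averaged levels recover it; real adaptive tests (HINT-style) add larger initial steps and probabilistic responses.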

  6. Simultaneous communication supports learning in noise by cochlear implant users.

    PubMed

    Blom, Helen; Marschark, Marc; Machmer, Elizabeth

    2017-01-01

    This study sought to evaluate the potential of using spoken language and signing together (simultaneous communication, SimCom, sign-supported speech) as a means of improving speech recognition, comprehension, and learning by cochlear implant (CI) users in noisy contexts. Forty eight college students who were active CI users, watched videos of three short presentations, the text versions of which were standardized at the 8 th -grade reading level. One passage was presented in spoken language only, one was presented in spoken language with multi-talker babble background noise, and one was presented via simultaneous communication with the same background noise. Following each passage, participants responded to 10 (standardized) open-ended questions designed to assess comprehension. Indicators of participants' spoken language and sign language skills were obtained via self-reports and objective assessments. When spoken materials were accompanied by signs, scores were significantly higher than when materials were spoken in noise without signs. Participants' receptive spoken language skills significantly predicted scores in all three conditions; neither their receptive sign skills nor age of implantation predicted performance. Students who are CI users typically rely solely on spoken language in the classroom. The present results, however, suggest that there are potential benefits of simultaneous communication for such learners in noisy settings. For those CI users who know sign language, the redundancy of speech and signs potentially can offset the reduced fidelity of spoken language in noise. Accompanying spoken language with signs can benefit learners who are CI users in noisy situations such as classroom settings. Factors associated with such benefits, such as receptive skills in signed and spoken modalities, classroom acoustics, and material difficulty need to be empirically examined.

  7. Raspberry, not a car: context predictability and a phonological advantage in early and late learners' processing of speech in noise.

    PubMed

    Gor, Kira

    2014-01-01

    Second language learners perform worse than native speakers under adverse listening conditions, such as speech in noise (SPIN). No data are available on heritage language speakers' (early naturalistic interrupted learners') ability to perceive SPIN. The current study fills this gap and investigates the perception of Russian speech in multi-talker babble noise by the matched groups of high- and low-proficiency heritage speakers (HSs) and late second language learners of Russian who were native speakers of English. The study includes a control group of Russian native speakers. It manipulates the noise level (high and low), and context cloze probability (high and low). The results of the SPIN task are compared to the tasks testing the control of phonology, AXB discrimination and picture-word discrimination, and lexical knowledge, a word translation task, in the same participants. The increased phonological sensitivity of HSs interacted with their ability to rely on top-down processing in sentence integration, use contextual cues, and build expectancies in the high-noise/high-context condition in a bootstrapping fashion. HSs outperformed oral proficiency-matched late second language learners on SPIN task and two tests of phonological sensitivity. The outcomes of the SPIN experiment support both the early naturalistic advantage and the role of proficiency in HSs. HSs' ability to take advantage of the high-predictability context in the high-noise condition was mitigated by their level of proficiency. Only high-proficiency HSs, but not any other non-native group, took advantage of the high-predictability context that became available with better phonological processing skills in high-noise. The study thus confirms high-proficiency (but not low-proficiency) HSs' nativelike ability to combine bottom-up and top-down cues in processing SPIN.

  8. Did You Listen to the Beat? Auditory Steady-State Responses in the Human Electroencephalogram at 4 and 7 Hz Modulation Rates Reflect Selective Attention.

    PubMed

    Jaeger, Manuela; Bleichner, Martin G; Bauer, Anna-Katharina R; Mirkovic, Bojana; Debener, Stefan

    2018-02-27

The acoustic envelope of human speech correlates with the syllabic rate (4-8 Hz) and carries important information for intelligibility, which is typically compromised in multi-talker, noisy environments. In order to better understand the dynamics of selective auditory attention to low frequency modulated sound sources, we conducted a two-stream auditory steady-state response (ASSR) selective attention electroencephalogram (EEG) study. The two streams consisted of 4 and 7 Hz amplitude and frequency modulated sounds presented from the left and right side. One of two streams had to be attended while the other had to be ignored. The attended stream always contained a target, allowing for the behavioral confirmation of the attention manipulation. EEG ASSR power analysis revealed a significant increase in 7 Hz power for the attend compared to the ignore conditions. There was no significant difference in 4 Hz power when the 4 Hz stream had to be attended compared to when it had to be ignored. This lack of 4 Hz attention modulation could be explained by a distracting effect of a third frequency at 3 Hz (beat frequency) perceivable when the 4 and 7 Hz streams are presented simultaneously. Taken together, our results show that low frequency modulations at syllabic rate are modulated by selective spatial attention. Whether attention effects act as enhancement of the attended stream or suppression of the to-be-ignored stream may depend on how well auditory streams can be segregated.
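The 3 Hz beat mentioned above is the difference frequency of the two modulation rates (7 - 4 = 3 Hz). It is absent from the linear sum of the two envelopes and only emerges after a nonlinearity; the sketch below demonstrates this with a square-law nonlinearity (a crude stand-in for cochlear rectification/compression) and a naive single-bin DFT, using only the standard library.

```python
import cmath
import math

FS, DUR = 1000, 1.0          # sampling rate (Hz) and duration (s)
N = int(FS * DUR)

def amp_at(signal, freq):
    """Amplitude of one DFT component; with a whole-second window,
    integer frequencies fall exactly on DFT bins."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * n / N)
              for n, x in enumerate(signal))
    return 2.0 * abs(acc) / N

# Two modulation "streams" at 4 and 7 Hz, summed linearly:
x = [math.cos(2 * math.pi * 4 * n / FS) + math.cos(2 * math.pi * 7 * n / FS)
     for n in range(N)]

# Square-law nonlinearity: x^2 contains cos(2*pi*3*t) with amplitude 1,
# from the cross term 2*cos(4t)*cos(7t) = cos(3t) + cos(11t).
y = [v * v for v in x]

print(round(amp_at(x, 3), 3))  # ~0.0 : no 3 Hz component in the linear sum
print(round(amp_at(y, 3), 3))  # ~1.0 : 3 Hz beat appears after the nonlinearity
```

The same algebra explains why the beat is only "perceivable" and not physically present in the stimuli: it is created inside the auditory system.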

  9. Contribution of low-frequency harmonics to Mandarin Chinese tone identification in quiet and six-talker babble background.

    PubMed

    Liu, Chang; Azimi, Behnam; Bhandary, Moulesh; Hu, Yi

    2014-01-01

    The goal of this study was to investigate Mandarin Chinese tone identification in quiet and multi-talker babble conditions for normal-hearing listeners. Tone identification was measured with speech stimuli and stimuli with low and/or high harmonics that were embedded in three Mandarin vowels with two fundamental frequencies. There were six types of stimuli: all harmonics (All), low harmonics (Low), high harmonics (High), and the first (H1), second (H2), and third (H3) harmonic. Results showed that, for quiet conditions, individual harmonics carried frequency contour information well enough for tone identification with high accuracy; however, in noisy conditions, tone identification with individual low harmonics (e.g., H1, H2, and H3) was significantly lower than that with the Low, High, and All harmonics. Moreover, tone identification with individual harmonics in noise was lower for a low F0 than for a high F0, and was also dependent on vowel category. Tone identification with individual low-frequency harmonics was accounted for by local signal-to-noise ratios, indicating that audibility of harmonics in noise may play a primary role in tone identification.

  10. Biologically inspired binaural hearing aid algorithms: Design principles and effectiveness

    NASA Astrophysics Data System (ADS)

    Feng, Albert

    2002-05-01

Despite rapid advances in the sophistication of hearing aid technology and microelectronics, listening in noise remains problematic for people with hearing impairment. To solve this problem two algorithms were designed for use in binaural hearing aid systems. The signal processing strategies are based on principles in auditory physiology and psychophysics: (a) the location/extraction (L/E) binaural computational scheme determines the directions of source locations and cancels noise by applying a simple subtraction method over every frequency band; and (b) the frequency-domain minimum-variance (FMV) scheme extracts a target sound from a known direction amidst multiple interfering sound sources. Both algorithms were evaluated using standard metrics such as signal-to-noise-ratio gain and articulation index. Results were compared with those from conventional adaptive beam-forming algorithms. In free-field tests with multiple interfering sound sources our algorithms performed better than conventional algorithms. Preliminary intelligibility and speech reception results in multitalker environments showed gains for every listener with normal or impaired hearing when the signals were processed in real time with the FMV binaural hearing aid algorithm. [Work supported by NIH-NIDCD Grant No. R21DC04840 and the Beckman Institute.]
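The frequency-domain minimum-variance (FMV) scheme is closely related to the classic MVDR beamformer, which per frequency bin computes weights w = R⁻¹d / (dᴴR⁻¹d) for noise covariance R and steering vector d toward the target. The two-microphone sketch below is a generic MVDR illustration under assumed numbers, not the published FMV implementation.

```python
def mvdr_weights_2mic(R, d):
    """Minimum-variance distortionless-response weights for one frequency
    bin: w = R^{-1} d / (d^H R^{-1} d), with R a 2x2 Hermitian noise
    covariance and d the steering vector toward the target direction."""
    (a, b), (c, e) = R
    det = a * e - b * c
    Rinv = ((e / det, -b / det), (-c / det, a / det))
    Rd = (Rinv[0][0] * d[0] + Rinv[0][1] * d[1],
          Rinv[1][0] * d[0] + Rinv[1][1] * d[1])
    denom = d[0].conjugate() * Rd[0] + d[1].conjugate() * Rd[1]
    return (Rd[0] / denom, Rd[1] / denom)

# Target straight ahead: equal phase at both microphones for this bin.
d = (1 + 0j, 1 + 0j)
# Assumed (illustrative) Hermitian noise covariance with inter-mic correlation:
R = ((1.0 + 0j, 0.3 + 0.1j), (0.3 - 0.1j, 1.0 + 0j))
w = mvdr_weights_2mic(R, d)
# Distortionless constraint: the target direction passes with unit gain.
gain = w[0].conjugate() * d[0] + w[1].conjugate() * d[1]
```

Minimizing output noise power subject to that unit-gain constraint is what lets such a beamformer "extract a target sound from a known direction" while attenuating interferers.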

  11. Probing the limits of alpha power lateralisation as a neural marker of selective attention in middle-aged and older listeners.

    PubMed

    Tune, Sarah; Wöstmann, Malte; Obleser, Jonas

    2018-02-11

In recent years, hemispheric lateralisation of alpha power has emerged as a neural mechanism thought to underpin spatial attention across sensory modalities. Yet, how healthy ageing, beginning in middle adulthood, impacts the modulation of lateralised alpha power supporting auditory attention remains poorly understood. In the current electroencephalography study, middle-aged and older adults (N = 29; ~40-70 years) performed a dichotic listening task that simulates a challenging, multitalker scenario. We examined the extent to which the modulation of 8-12 Hz alpha power would serve as neural marker of listening success across age. To account for the increase in interindividual variability with age, we examined an extensive battery of behavioural, perceptual and neural measures. Similar to findings on younger adults, middle-aged and older listeners' auditory spatial attention induced robust lateralisation of alpha power, which synchronised with the speech rate. Notably, the observed relationship between this alpha lateralisation and task performance did not co-vary with age. Instead, task performance was strongly related to an individual's attentional and working memory capacity. Multivariate analyses revealed a separation of neural and behavioural variables independent of age. Our results suggest that in age-varying samples such as the present one, the lateralisation of alpha power is neither a sufficient nor necessary neural strategy for an individual's auditory spatial attention, as higher age might come with increased use of alternative, compensatory mechanisms. Our findings emphasise that explaining interindividual variability will be key to understanding the role of alpha oscillations in auditory attention in the ageing listener. © 2018 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
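Alpha-power lateralisation of the kind discussed above is commonly summarized per listener as a normalized power contrast between hemispheres. The electrode clusters and conditions entering the contrast vary across studies, so the index below is a generic sketch of the convention, not this study's exact definition.

```python
def alpha_lateralisation_index(p_ipsi, p_contra):
    """Normalized contrast of 8-12 Hz alpha power, bounded in [-1, 1].
    Positive values indicate relatively higher alpha power ipsilateral to
    the attended side, consistent with suppression of the ignored
    hemifield's processing."""
    return (p_ipsi - p_contra) / (p_ipsi + p_contra)

# Example: 3 uV^2 ipsilateral vs. 1 uV^2 contralateral alpha power.
ali = alpha_lateralisation_index(3.0, 1.0)  # 0.5
```

The normalization makes the index robust to overall power differences between individuals, which matters in age-varying samples where absolute alpha power itself changes with age.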

  12. Informational Masking Effects on Neural Encoding of Stimulus Onset and Acoustic Change.

    PubMed

    Niemczak, Christopher E; Vander Werff, Kathy R

    2018-05-18

Recent investigations using cortical auditory evoked potentials have shown masker-dependent effects on sensory cortical processing of speech information. Background noise maskers consisting of other people talking are particularly difficult for speech recognition. Behavioral studies have related this to perceptual masking, or informational masking, beyond just the overlap of the masker and target at the auditory periphery. The aim of the present study was to use cortical auditory evoked potentials to examine how maskers (i.e., continuous speech-shaped noise [SSN] and multi-talker babble) affect the cortical sensory encoding of speech information at an obligatory level of processing. Specifically, cortical responses to vowel onset and formant change were recorded under different background noise conditions presumed to represent varying amounts of energetic or informational masking. The hypothesis was that, even at this obligatory cortical level of sensory processing, we would observe larger effects on the amplitude and latency of the onset and change components as the amount of informational masking increased across background noise conditions. Onset and change responses were recorded to a vowel change from /u-i/ in young adults under four conditions: quiet, continuous SSN, eight-talker (8T) babble, and two-talker (2T) babble. Repeated measures analyses by noise condition were conducted on amplitude, latency, and response area measurements to determine the differential effects of these noise conditions, designed to represent increasing and varying levels of informational and energetic masking, on cortical neural representation of a vowel onset and acoustic change response waveforms. All noise conditions significantly reduced onset N1 and P2 amplitudes, onset N1-P2 peak to peak amplitudes, as well as both onset and change response area compared with quiet conditions.
Further, all amplitude and area measures were significantly reduced for the two babble conditions compared with continuous SSN. However, there were no significant differences in peak amplitude or area for either onset or change responses between the two different babble conditions (eight versus two talkers). Mean latencies for all onset peaks were delayed for noise conditions compared with quiet. However, in contrast to the amplitude and area results, differences in peak latency between SSN and the babble conditions did not reach statistical significance. These results support the idea that while background noise maskers generally reduce amplitude and increase latency of speech-sound evoked cortical responses, the type of masking has a significant influence. Speech babble maskers (eight talkers and two talkers) have a larger effect on the obligatory cortical response to speech sound onset and change compared with purely energetic continuous SSN maskers, which may be attributed to informational masking effects. Neither the neural responses to the onset nor the vowel change, however, were sensitive to the hypothesized increase in the amount of informational masking between speech babble maskers with two talkers compared with eight talkers.
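The N1-P2 measures analysed above boil down to picking extrema of the averaged waveform inside fixed latency windows. The sketch below uses typical adult window boundaries (assumed here, not the ones reported in the study) on a synthetic waveform.

```python
def peak_in_window(times_ms, wave_uv, t0, t1, polarity):
    """Return (latency_ms, amplitude_uV) of the most negative (polarity=-1)
    or most positive (polarity=+1) sample within [t0, t1] ms."""
    window = [(t, v) for t, v in zip(times_ms, wave_uv) if t0 <= t <= t1]
    t, v = max(window, key=lambda tv: polarity * tv[1])
    return t, v

def n1_p2_measures(times_ms, wave_uv):
    """Pick N1 (negative peak, ~70-150 ms) and P2 (positive peak,
    ~150-250 ms) and return their latencies plus the N1-P2 peak-to-peak
    amplitude. Window boundaries are typical values, assumed for
    illustration."""
    n1_t, n1_v = peak_in_window(times_ms, wave_uv, 70, 150, -1)
    p2_t, p2_v = peak_in_window(times_ms, wave_uv, 150, 250, +1)
    return {"N1_ms": n1_t, "P2_ms": p2_t, "N1P2_uV": p2_v - n1_v}

# Synthetic averaged response sampled at 1 kHz (1 sample per ms):
times = list(range(0, 301))
wave = [-5.0 if t == 100 else 4.0 if t == 200 else 0.0 for t in times]
m = n1_p2_measures(times, wave)
```

Masking effects of the kind reported above would then appear as smaller `N1P2_uV` values and longer `N1_ms`/`P2_ms` latencies in the noise conditions relative to quiet.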

  13. Difficulty understanding speech in noise by the hearing impaired: underlying causes and technological solutions.

    PubMed

    Healy, Eric W; Yoho, Sarah E

    2016-08-01

    A primary complaint of hearing-impaired individuals involves poor speech understanding when background noise is present. Hearing aids and cochlear implants often allow good speech understanding in quiet backgrounds. But hearing-impaired individuals are highly noise intolerant, and existing devices are not very effective at combating background noise. As a result, speech understanding in noise is often quite poor. In accord with the significance of the problem, considerable effort has been expended toward understanding and remedying this issue. Fortunately, our understanding of the underlying issues is reasonably good. In sharp contrast, effective solutions have remained elusive. One solution that seems promising involves a single-microphone machine-learning algorithm to extract speech from background noise. Data from our group indicate that the algorithm is capable of producing vast increases in speech understanding by hearing-impaired individuals. This paper will first provide an overview of the speech-in-noise problem and outline why hearing-impaired individuals are so noise intolerant. An overview of our approach to solving this problem will follow.

  14. Auditory and cognitive factors underlying individual differences in aided speech-understanding among older adults

    PubMed Central

    Humes, Larry E.; Kidd, Gary R.; Lentz, Jennifer J.

    2013-01-01

    This study was designed to address individual differences in aided speech understanding among a relatively large group of older adults. The group of older adults consisted of 98 adults (50 female and 48 male) ranging in age from 60 to 86 (mean = 69.2). Hearing loss was typical for this age group and about 90% had not worn hearing aids. All subjects completed a battery of tests, including cognitive (6 measures), psychophysical (17 measures), and speech-understanding (9 measures), as well as the Speech, Spatial, and Qualities of Hearing (SSQ) self-report scale. Most of the speech-understanding measures made use of competing speech and the non-speech psychophysical measures were designed to tap phenomena thought to be relevant for the perception of speech in competing speech (e.g., stream segregation, modulation-detection interference). All measures of speech understanding were administered with spectral shaping applied to the speech stimuli to fully restore audibility through at least 4000 Hz. The measures used were demonstrated to be reliable in older adults and, when compared to a reference group of 28 young normal-hearing adults, age-group differences were observed on many of the measures. Principal-components factor analysis was applied successfully to reduce the number of independent and dependent (speech understanding) measures for a multiple-regression analysis. Doing so yielded one global cognitive-processing factor and five non-speech psychoacoustic factors (hearing loss, dichotic signal detection, multi-burst masking, stream segregation, and modulation detection) as potential predictors. To this set of six potential predictor variables were added subject age, Environmental Sound Identification (ESI), and performance on the text-recognition-threshold (TRT) task (a visual analog of interrupted speech recognition). These variables were used to successfully predict one global aided speech-understanding factor, accounting for about 60% of the variance. 
PMID:24098273

  15. Understanding the abstract role of speech in communication at 12 months.

    PubMed

    Martin, Alia; Onishi, Kristine H; Vouloumanos, Athena

    2012-04-01

    Adult humans recognize that even unfamiliar speech can communicate information between third parties, demonstrating an ability to separate communicative function from linguistic content. We examined whether 12-month-old infants understand that speech can communicate before they understand the meanings of specific words. Specifically, we tested the understanding that speech permits the transfer of information about a Communicator's target object to a Recipient. Initially, the Communicator selectively grasped one of two objects. In test, the Communicator could no longer reach the objects. She then turned to the Recipient and produced speech (a nonsense word) or non-speech (coughing). Infants looked longer when the Recipient selected the non-target than the target object when the Communicator had produced speech but not coughing (Experiment 1). Looking time patterns differed from the speech condition when the Recipient rather than the Communicator produced the speech (Experiment 2), and when the Communicator produced a positive emotional vocalization (Experiment 3), but did not differ when the Recipient had previously received information about the target by watching the Communicator's selective grasping (Experiment 4). Thus infants understand the information-transferring properties of speech and recognize some of the conditions under which others' information states can be updated. These results suggest that infants possess an abstract understanding of the communicative function of speech, providing an important potential mechanism for language and knowledge acquisition.

  16. Asymmetric Dynamic Attunement of Speech and Gestures in the Construction of Children's Understanding.

    PubMed

    De Jonge-Hoekstra, Lisette; Van der Steen, Steffie; Van Geert, Paul; Cox, Ralf F A

    2016-01-01

    As children learn, they use speech to express words and their hands to gesture. This study investigates the interplay between real-time gestures and speech as children construct cognitive understanding during a hands-on science task. Twelve children (6 male, 6 female) from kindergarten (n = 5) and first grade (n = 7) participated in this study. Each verbal utterance and gesture during the task was coded on a complexity scale derived from dynamic skill theory. To explore the interplay between speech and gestures, we applied a cross recurrence quantification analysis (CRQA) to the two coupled time series of the skill levels of verbalizations and gestures. The analysis focused on (1) the temporal relation between gestures and speech, (2) the relative strength and direction of the interaction between gestures and speech, (3) the relative strength and direction between gestures and speech for different levels of understanding, and (4) relations between CRQA measures and other child characteristics. The results show that older and younger children differ in the (temporal) asymmetry of the gestures-speech interaction. For younger children, the balance leans more toward gestures leading speech in time, while for older children it leans more toward speech leading gestures. Secondly, at the group level, speech attracts gestures in a more dynamically stable fashion than vice versa, and this asymmetry between gestures and speech extends to lower and higher understanding levels. Yet, for older children, the mutual coupling between gestures and speech is more dynamically stable at the higher understanding levels. Gestures and speech are more synchronized in time as children get older. A higher score on schools' language tests is related to speech attracting gestures more rigidly and to more asymmetry between gestures and speech, only for the less difficult understanding levels.
A higher score on math or past science tasks is related to less asymmetry between gestures and speech. The picture that emerges from our analyses suggests that the relation between gestures, speech and cognition is more complex than previously thought. We suggest that temporal differences and asymmetry in influence between gestures and speech arise from simultaneous coordination of synergies.
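Cross recurrence quantification analysis, as used in this record, marks every pair of time points at which two coded series (here, gesture and speech complexity levels) visit matching states; the balance of recurrent points above versus below the main diagonal is one way to quantify which series tends to lead. A toy sketch of that idea for two ordinal series (the matching radius and this particular asymmetry measure are illustrative simplifications of full CRQA, not the study's exact computation):

```python
import numpy as np

def cross_recurrence(x, y, radius=0):
    """Cross-recurrence matrix: R[i, j] = 1 when series x at time i
    matches series y at time j within `radius`."""
    x, y = np.asarray(x), np.asarray(y)
    return (np.abs(x[:, None] - y[None, :]) <= radius).astype(int)

def recurrence_rate(R):
    """Fraction of recurrent points: overall coupling strength."""
    return R.mean()

def diagonal_asymmetry(R):
    """Balance of recurrence above vs. below the main diagonal.
    More mass above the diagonal (x at time i matching y at a later
    time j) means x tends to lead y in time, and vice versa."""
    upper = np.triu(R, k=1).sum()
    lower = np.tril(R, k=-1).sum()
    return (upper - lower) / max(upper + lower, 1)
```

For example, `cross_recurrence([1, 2, 3, 3], [1, 1, 2, 3])` places all off-diagonal recurrent points above the diagonal (asymmetry 1.0), consistent with the first series reaching each level before the second.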

  17. Contributions of speech science to the technology of man-machine voice interactions

    NASA Technical Reports Server (NTRS)

    Lea, Wayne A.

    1977-01-01

    Research in speech understanding was reviewed. Plans which include prosodics research, phonological rules for speech understanding systems, and continued interdisciplinary phonetics research are discussed. Improved acoustic phonetic analysis capabilities in speech recognizers are suggested.

  18. Intelligibility of emotional speech in younger and older adults.

    PubMed

    Dupuis, Kate; Pichora-Fuller, M Kathleen

    2014-01-01

    Little is known about the influence of vocal emotions on speech understanding. Word recognition accuracy for stimuli spoken to portray seven emotions (anger, disgust, fear, sadness, neutral, happiness, and pleasant surprise) was tested in younger and older listeners. Emotions were presented in either mixed (heterogeneous emotions mixed in a list) or blocked (a homogeneous emotion blocked in a list) conditions. Three main hypotheses were tested. First, vocal emotion affects word recognition accuracy; specifically, portrayals of fear enhance word recognition accuracy because listeners orient to threatening information and/or distinctive acoustical cues such as high pitch mean and variation. Second, older listeners recognize words less accurately than younger listeners, but the effects of different emotions on intelligibility are similar across age groups. Third, blocking emotions in a list results in better word recognition accuracy, especially for older listeners, and reduces the effect of emotion on intelligibility, because as listeners develop expectations about vocal emotion, the allocation of processing resources can shift from emotional to lexical processing. Emotion was the within-subjects variable: all participants heard speech stimuli consisting of a carrier phrase followed by a target word spoken by either a younger or an older talker, with an equal number of stimuli portraying each of the seven vocal emotions. The speech was presented in multi-talker babble at signal-to-noise ratios adjusted for each talker and each listener age group. Listener age (younger, older), condition (mixed, blocked), and talker (younger, older) were the main between-subjects variables. Fifty-six students (Mage = 18.3 years) were recruited from an undergraduate psychology course; 56 older adults (Mage = 72.3 years) were recruited from a volunteer pool. All participants had clinically normal pure-tone audiometric thresholds at frequencies ≤3000 Hz.
There were significant main effects of emotion, listener age group, and condition on the accuracy of word recognition in noise. Stimuli spoken in a fearful voice were the most intelligible, while those spoken in a sad voice were the least intelligible. Overall, word recognition accuracy was poorer for older than younger adults, but there was no main effect of talker, and the pattern of the effects of different emotions on intelligibility did not differ significantly across age groups. Acoustical analyses helped elucidate the effect of emotion and some intertalker differences. Finally, all participants performed better when emotions were blocked. For both groups, performance improved over repeated presentations of each emotion in both blocked and mixed conditions. These results are the first to demonstrate a relationship between vocal emotion and word recognition accuracy in noise for younger and older listeners. In particular, the enhancement of intelligibility by emotion is greatest for words spoken to portray fear and presented heterogeneously with other emotions. Fear may have a specialized role in orienting attention to words heard in noise. This finding may be an auditory counterpart to the enhanced detection of threat information in visual displays. The effect of vocal emotion on word recognition accuracy is preserved in older listeners with good audiograms and both age groups benefit from blocking and the repetition of emotions.

  19. Speech-to-Speech Relay Service

    MedlinePlus

    ... are specifically trained in understanding a variety of speech disorders, which enables them to repeat what the caller says in a manner that makes the caller’s words clear and understandable to the ... people with speech disabilities cannot communicate by telephone because the parties ...

  20. Speech Understanding with a New Implant Technology: A Comparative Study with a New Nonskin Penetrating Baha System

    PubMed Central

    Caversaccio, Marco

    2014-01-01

    Objective. To compare hearing and speech understanding between a new, nonskin-penetrating Baha system (Baha Attract) and the current Baha system using a skin-penetrating abutment. Methods. Hearing and speech understanding were measured in 16 experienced Baha users. The transmission path via the abutment was compared to a simulated Baha Attract transmission path by attaching the implantable magnet to the abutment and then adding a sample of artificial skin and the external parts of the Baha Attract system. Four different measurements were performed: bone conduction thresholds directly through the sound processor (BC Direct), aided sound field thresholds, aided speech understanding in quiet, and aided speech understanding in noise. Results. The simulated Baha Attract transmission path introduced an attenuation starting from approximately 5 dB at 1000 Hz and increasing to 20–25 dB above 6000 Hz. However, aided sound field thresholds show smaller differences, and aided speech understanding in quiet and in noise does not differ significantly between the two transmission paths. Conclusion. The Baha Attract system transmission path introduces predominantly high-frequency attenuation. This attenuation can be partially compensated by adequate fitting of the speech processor. No significant decrease in speech understanding in either quiet or noise was found. PMID:25140314

  1. Motor Speech Disorders Associated with Primary Progressive Aphasia

    PubMed Central

    Duffy, Joseph R.; Strand, Edythe A.; Josephs, Keith A.

    2014-01-01

    Background Primary progressive aphasia (PPA) and conditions that overlap with it can be accompanied by motor speech disorders. Recognition and understanding of motor speech disorders can contribute to a fuller clinical understanding of PPA and its management as well as its localization and underlying pathology. Aims To review the types of motor speech disorders that may occur with PPA, its primary variants, and its overlap syndromes (progressive supranuclear palsy syndrome, corticobasal syndrome, motor neuron disease), as well as with primary progressive apraxia of speech. Main Contribution The review should assist clinicians' and researchers' understanding of the relationship between motor speech disorders and PPA and its major variants. It also highlights the importance of recognizing neurodegenerative apraxia of speech as a condition that can occur with little or no evidence of aphasia. Conclusion Motor speech disorders can occur with PPA. Their recognition can contribute to clinical diagnosis and management of PPA and to understanding and predicting the localization and pathology associated with PPA variants and conditions that can overlap with them. PMID:25309017

  2. The Relationship Between Spectral Modulation Detection and Speech Recognition: Adult Versus Pediatric Cochlear Implant Recipients

    PubMed Central

    Noble, Jack H.; Camarata, Stephen M.; Sunderhaus, Linsey W.; Dwyer, Robert T.; Dawant, Benoit M.; Dietrich, Mary S.; Labadie, Robert F.

    2018-01-01

    Adult cochlear implant (CI) recipients demonstrate a reliable relationship between spectral modulation detection and speech understanding. Prior studies documenting this relationship have focused on postlingually deafened adult CI recipients—leaving an open question regarding the relationship between spectral resolution and speech understanding for adults and children with prelingual onset of deafness. Here, we report CI performance on the measures of speech recognition and spectral modulation detection for 578 CI recipients including 477 postlingual adults, 65 prelingual adults, and 36 prelingual pediatric CI users. The results demonstrated a significant correlation between spectral modulation detection and various measures of speech understanding for 542 adult CI recipients. For 36 pediatric CI recipients, however, there was no significant correlation between spectral modulation detection and speech understanding in quiet or in noise nor was spectral modulation detection significantly correlated with listener age or age at implantation. These findings suggest that pediatric CI recipients might not depend upon spectral resolution for speech understanding in the same manner as adult CI recipients. It is possible that pediatric CI users are making use of different cues, such as those contained within the temporal envelope, to achieve high levels of speech understanding. Further investigation is warranted to investigate the relationship between spectral and temporal resolution and speech recognition to describe the underlying mechanisms driving peripheral auditory processing in pediatric CI users. PMID:29716437

  3. Cortical activation patterns correlate with speech understanding after cochlear implantation

    PubMed Central

    Olds, Cristen; Pollonini, Luca; Abaya, Homer; Larky, Jannine; Loy, Megan; Bortfeld, Heather; Beauchamp, Michael S.; Oghalai, John S.

    2015-01-01

    Objectives Cochlear implants are a standard therapy for deafness, yet the ability of implanted patients to understand speech varies widely. To better understand this variability in outcomes, we used functional near-infrared spectroscopy (fNIRS) to image activity within regions of the auditory cortex and compare the results to behavioral measures of speech perception. Design We studied 32 deaf adults hearing through cochlear implants and 35 normal-hearing controls. We used fNIRS to measure responses within the lateral temporal lobe and the superior temporal gyrus to speech stimuli of varying intelligibility. The speech stimuli included normal speech, channelized speech (vocoded into 20 frequency bands), and scrambled speech (the 20 frequency bands were shuffled in random order). We also used environmental sounds as a control stimulus. Behavioral measures consisted of the Speech Reception Threshold, CNC words, and AzBio Sentence tests measured in quiet. Results Both control and implanted participants with good speech perception exhibited greater cortical activations to natural speech than to unintelligible speech. In contrast, implanted participants with poor speech perception had large, indistinguishable cortical activations to all stimuli. The ratio of cortical activation to normal speech to that of scrambled speech directly correlated with the CNC Words and AzBio Sentences scores. This pattern of cortical activation was not correlated with auditory threshold, age, side of implantation, or time after implantation. Turning off the implant reduced cortical activations in all implanted participants. Conclusions Together, these data indicate that the responses we measured within the lateral temporal lobe and the superior temporal gyrus correlate with behavioral measures of speech perception, demonstrating a neural basis for the variability in speech understanding outcomes after cochlear implantation. PMID:26709749
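The "channelized" and "scrambled" stimuli in this record follow the logic of a noise vocoder: the signal's amplitude envelope is extracted in each of 20 frequency bands and used to modulate noise in either the same band (channelized) or a randomly reassigned band (scrambled). A simplified FFT-based sketch of that general idea (the band edges, envelope extraction, and parameters here are illustrative assumptions, not the study's actual stimulus processing):

```python
import numpy as np

def envelope(x):
    """Amplitude envelope via the analytic signal (FFT-based Hilbert)."""
    n = len(x)
    h = np.zeros(n)
    h[0] = 1.0
    h[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        h[n // 2] = 1.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * h))

def vocode(signal, n_bands=20, scramble=False, seed=0):
    """Noise-vocode `signal`: each band's envelope modulates noise
    filtered into the same band ('channelized'), or into a randomly
    reassigned band when scramble=True ('scrambled')."""
    rng = np.random.default_rng(seed)
    n = len(signal)
    edges = np.linspace(0, n // 2 + 1, n_bands + 1).astype(int)

    def band(x, lo, hi):
        # Zero out everything outside bins [lo, hi) in the rFFT domain.
        s = np.fft.rfft(x)
        out = np.zeros_like(s)
        out[lo:hi] = s[lo:hi]
        return np.fft.irfft(out, n=n)

    envs = [envelope(band(signal, lo, hi))
            for lo, hi in zip(edges[:-1], edges[1:])]
    order = rng.permutation(n_bands) if scramble else np.arange(n_bands)
    noise = rng.standard_normal(n)
    out = np.zeros(n)
    for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        out += envs[order[i]] * band(noise, lo, hi)
    return out
```

Shuffling which envelope drives which noise band preserves the stimulus's overall spectro-temporal energy while destroying intelligibility, which is what makes the scrambled condition a matched unintelligible control.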

  4. Inferring Speaker Affect in Spoken Natural Language Communication

    ERIC Educational Resources Information Center

    Pon-Barry, Heather Roberta

    2013-01-01

    The field of spoken language processing is concerned with creating computer programs that can understand human speech and produce human-like speech. Regarding the problem of understanding human speech, there is currently growing interest in moving beyond speech recognition (the task of transcribing the words in an audio stream) and towards…

  5. Do 6-Month-Olds Understand That Speech Can Communicate?

    ERIC Educational Resources Information Center

    Vouloumanos, Athena; Martin, Alia; Onishi, Kristine H.

    2014-01-01

    Adults and 12-month-old infants recognize that even unfamiliar speech can communicate information between third parties, suggesting that they can separate the communicative function of speech from its lexical content. But do infants recognize that speech can communicate due to their experience understanding and producing language, or do they…

  6. Audibility-based predictions of speech recognition for children and adults with normal hearing.

    PubMed

    McCreery, Ryan W; Stelmachowicz, Patricia G

    2011-12-01

    This study investigated the relationship between audibility and predictions of speech recognition for children and adults with normal hearing. The Speech Intelligibility Index (SII) is used to quantify the audibility of speech signals and can be applied to transfer functions to predict speech recognition scores. Although the SII is used clinically with children, relatively few studies have evaluated SII predictions of children's speech recognition directly. Children have required more audibility than adults to reach maximum levels of speech understanding in previous studies. Furthermore, children may require greater bandwidth than adults for optimal speech understanding, which could influence frequency-importance functions used to calculate the SII. Speech recognition was measured for 116 children and 19 adults with normal hearing. Stimulus bandwidth and background noise level were varied systematically in order to evaluate speech recognition as predicted by the SII and derive frequency-importance functions for children and adults. Results suggested that children required greater audibility to reach the same level of speech understanding as adults. However, differences in performance between adults and children did not vary across frequency bands. © 2011 Acoustical Society of America
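The SII referred to in this record is, at its core, a band-importance-weighted sum of audibility. A stripped-down sketch of that core computation (the 30-dB dynamic range and 15-dB peak convention follow the general shape of ANSI S3.5, but the numbers and simplifications here are illustrative, not the standard's full procedure):

```python
def speech_intelligibility_index(speech_dB, noise_dB, importance):
    """Simplified SII: per-band audibility (0-1) weighted by
    band-importance weights, summed over bands.

    speech_dB, noise_dB: per-band speech and noise levels (dB SPL).
    importance: per-band weights summing to 1.
    """
    assert abs(sum(importance) - 1) < 1e-9
    sii = 0.0
    for s, n, w in zip(speech_dB, noise_dB, importance):
        # Fraction of an assumed 30-dB speech dynamic range
        # (peaks ~15 dB above the mean level) exceeding the noise.
        audibility = (s + 15 - n) / 30
        sii += w * min(1.0, max(0.0, audibility))
    return sii
```

The full standard additionally models spread of masking and level distortion, and its band-importance weights differ by speech material; material- or population-specific frequency-importance functions, like those derived for children in this study, would enter the prediction through the `importance` term.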

  7. Understanding the Abstract Role of Speech in Communication at 12 Months

    ERIC Educational Resources Information Center

    Martin, Alia; Onishi, Kristine H.; Vouloumanos, Athena

    2012-01-01

    Adult humans recognize that even unfamiliar speech can communicate information between third parties, demonstrating an ability to separate communicative function from linguistic content. We examined whether 12-month-old infants understand that speech can communicate before they understand the meanings of specific words. Specifically, we test the…

  8. Variations in Articulatory Movement with Changes in Speech Task.

    ERIC Educational Resources Information Center

    Tasko, Stephen M.; McClean, Michael D.

    2004-01-01

    Studies of normal and disordered articulatory movement often rely on the use of short, simple speech tasks. However, the severity of speech disorders can be observed to vary markedly with task. Understanding task-related variations in articulatory kinematic behavior may allow for an improved understanding of normal and disordered speech motor…

  9. Influence of musical training on understanding voiced and whispered speech in noise.

    PubMed

    Ruggles, Dorea R; Freyman, Richard L; Oxenham, Andrew J

    2014-01-01

    This study tested the hypothesis that the previously reported advantage of musicians over non-musicians in understanding speech in noise arises from more efficient or robust coding of periodic voiced speech, particularly in fluctuating backgrounds. Speech intelligibility was measured in listeners with extensive musical training, and in those with very little musical training or experience, using normal (voiced) or whispered (unvoiced) grammatically correct nonsense sentences in noise that was spectrally shaped to match the long-term spectrum of the speech, and was either continuous or gated with a 16-Hz square wave. Performance was also measured in clinical speech-in-noise tests and in pitch discrimination. Musicians exhibited enhanced pitch discrimination, as expected. However, no systematic or statistically significant advantage for musicians over non-musicians was found in understanding either voiced or whispered sentences in either continuous or gated noise. Musicians also showed no statistically significant advantage in the clinical speech-in-noise tests. Overall, the results provide no evidence for a significant difference between young adult musicians and non-musicians in their ability to understand speech in noise.

  10. Identifying Residual Speech Sound Disorders in Bilingual Children: A Japanese-English Case Study

    PubMed Central

    Preston, Jonathan L.; Seki, Ayumi

    2012-01-01

    Purpose The purposes are to (1) describe the assessment of residual speech sound disorders (SSD) in bilinguals by distinguishing speech patterns associated with second language acquisition from patterns associated with misarticulations, and (2) describe how assessment of domains such as speech motor control and phonological awareness can provide a more complete understanding of SSDs in bilinguals. Method A review of Japanese phonology is provided to offer a context for understanding the transfer of Japanese to English productions. A case study of an 11-year-old is presented, demonstrating parallel speech assessments in English and Japanese. Speech motor and phonological awareness tasks were conducted in both languages. Results Several patterns were observed in the participant’s English that could be plausibly explained by the influence of Japanese phonology. However, errors indicating a residual SSD were observed in both Japanese and English. A speech motor assessment suggested possible speech motor control problems, and phonological awareness was judged to be within the typical range of performance in both languages. Conclusion Understanding the phonological characteristics of L1 can help clinicians recognize speech patterns in L2 associated with transfer. Once these differences are understood, patterns associated with a residual SSD can be identified. Supplementing a relational speech analysis with measures of speech motor control and phonological awareness can provide a more comprehensive understanding of a client’s strengths and needs. PMID:21386046

  11. Development of Attentional Control of Verbal Auditory Perception from Middle to Late Childhood: Comparisons to Healthy Aging

    ERIC Educational Resources Information Center

    Passow, Susanne; Müller, Maike; Westerhausen, René; Hugdahl, Kenneth; Wartenburger, Isabell; Heekeren, Hauke R.; Lindenberger, Ulman; Li, Shu-Chen

    2013-01-01

    Multitalker situations confront listeners with a plethora of competing auditory inputs, and hence require selective attention to relevant information, especially when the perceptual saliency of distracting inputs is high. This study augmented the classical forced-attention dichotic listening paradigm by adding an interaural intensity manipulation…

  12. Executive Function, Visual Attention and the Cocktail Party Problem in Musicians and Non-Musicians.

    PubMed

    Clayton, Kameron K; Swaminathan, Jayaganesh; Yazdanbakhsh, Arash; Zuk, Jennifer; Patel, Aniruddh D; Kidd, Gerald

    2016-01-01

    The goal of this study was to investigate how cognitive factors influence performance in a multi-talker, "cocktail-party" like environment in musicians and non-musicians. This was achieved by relating performance in a spatial hearing task to cognitive processing abilities assessed using measures of executive function (EF) and visual attention in musicians and non-musicians. For the spatial hearing task, a speech target was presented simultaneously with two intelligible speech maskers that were either colocated with the target (0° azimuth) or were symmetrically separated from the target in azimuth (at ±15°). EF assessment included measures of cognitive flexibility, inhibition control and auditory working memory. Selective attention was assessed in the visual domain using a multiple object tracking task (MOT). For the MOT task, the observers were required to track target dots (n = 1,2,3,4,5) in the presence of interfering distractor dots. Musicians performed significantly better than non-musicians in the spatial hearing task. For the EF measures, musicians showed better performance on measures of auditory working memory compared to non-musicians. Furthermore, across all individuals, a significant correlation was observed between performance on the spatial hearing task and measures of auditory working memory. This result suggests that individual differences in performance in a cocktail party-like environment may depend in part on cognitive factors such as auditory working memory. Performance in the MOT task did not differ between groups. However, across all individuals, a significant correlation was found between performance in the MOT and spatial hearing tasks. A stepwise multiple regression analysis revealed that musicianship and performance on the MOT task significantly predicted performance on the spatial hearing task. 
Overall, these findings confirm the relationship between musicianship and cognitive factors including domain-general selective attention and working memory in solving the "cocktail party problem".

  14. Speech planning happens before speech execution: online reaction time methods in the study of apraxia of speech.

    PubMed

    Maas, Edwin; Mailend, Marja-Liisa

    2012-10-01

    The purpose of this article is to present an argument for the use of online reaction time (RT) methods in the study of apraxia of speech (AOS), and to review the small existing literature in this area and the contributions it has made to our fundamental understanding of speech planning (and its deficits) in AOS. Following a brief description of the limitations of offline perceptual methods, we provide a narrative review of various types of RT paradigms from the (speech) motor programming and psycholinguistic literatures and their (thus far limited) application to AOS. On the basis of this review, we conclude that, with careful consideration of potential challenges and caveats, RT approaches hold great promise for advancing our understanding of AOS, in particular with respect to the speech planning processes that generate the speech signal before initiation. A deeper understanding of the nature and time course of speech planning and its disruptions in AOS may enhance diagnosis and treatment of AOS. Only a handful of published studies on apraxia of speech have used reaction time methods. However, these studies have provided deeper insight into speech planning impairments in AOS based on a variety of experimental paradigms.

  15. Role of contextual cues on the perception of spectrally reduced interrupted speech.

    PubMed

    Patro, Chhayakanta; Mendel, Lisa Lucks

    2016-08-01

    Understanding speech within an auditory scene is constantly challenged by interfering noise in suboptimal listening environments when noise hinders the continuity of the speech stream. In such instances, a typical auditory-cognitive system perceptually integrates available speech information and "fills in" missing information in the light of semantic context. However, individuals with cochlear implants (CIs) find it difficult and effortful to understand interrupted speech compared to their normal hearing counterparts. This inefficiency in perceptual integration of speech could be attributed to further degradations in the spectral-temporal domain imposed by CIs making it difficult to utilize the contextual evidence effectively. To address these issues, 20 normal hearing adults listened to speech that was spectrally reduced and spectrally reduced interrupted in a manner similar to CI processing. The Revised Speech Perception in Noise test, which includes contextually rich and contextually poor sentences, was used to evaluate the influence of semantic context on speech perception. Results indicated that listeners benefited more from semantic context when they listened to spectrally reduced speech alone. For the spectrally reduced interrupted speech, contextual information was not as helpful under significant spectral reductions, but became beneficial as the spectral resolution improved. These results suggest top-down processing facilitates speech perception up to a point, and it fails to facilitate speech understanding when the speech signals are significantly degraded.
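
    The interruption manipulation described here amounts to periodically gating the speech waveform on and off. A minimal sketch in Python follows; the 2 Hz rate and 50% duty cycle are illustrative assumptions, not the parameters used in the study, and a pure tone stands in for a speech token.

```python
import numpy as np

def interrupt_speech(signal, fs, rate_hz=2.0, duty=0.5):
    """Periodically gate a waveform on and off (square-wave interruption).

    rate_hz: number of on/off cycles per second (assumed value)
    duty:    fraction of each cycle during which the signal is audible
    """
    t = np.arange(len(signal)) / fs
    # Gate is 1 during the first `duty` fraction of each cycle, 0 otherwise
    gate = ((t * rate_hz) % 1.0 < duty).astype(signal.dtype)
    return signal * gate

fs = 16000
tone = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)  # 1 s stand-in for speech
gated = interrupt_speech(tone, fs, rate_hz=2.0, duty=0.5)  # half the samples silenced
```

    Spectral reduction (noise vocoding) would then be applied on top of this gating to approximate CI processing.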

  16. A Method for Assessing Auditory Spatial Analysis in Reverberant Multitalker Environments.

    PubMed

    Weller, Tobias; Best, Virginia; Buchholz, Jörg M; Young, Taegan

    2016-07-01

    Deficits in spatial hearing can have a negative impact on listeners' ability to orient in their environment and follow conversations in noisy backgrounds and may exacerbate the experience of hearing loss as a handicap. However, there are no good tools available for reliably capturing the spatial hearing abilities of listeners in complex acoustic environments containing multiple sounds of interest. The purpose of this study was to explore a new method to measure auditory spatial analysis in a reverberant multitalker scenario. This was a descriptive case-control study. Ten listeners with normal hearing (NH) aged 20-31 yr and 16 listeners with hearing impairment (HI) aged 52-85 yr participated in the study. The latter group had symmetrical sensorineural hearing losses with a four-frequency average hearing loss of 29.7 dB HL. A large reverberant room was simulated using a loudspeaker array in an anechoic chamber. In this simulated room, 96 scenes comprising between one and six concurrent talkers at different locations were generated. Listeners were presented with 45-sec samples of each scene, and were required to count, locate, and identify the gender of all talkers, using a graphical user interface on an iPad. Performance was evaluated in terms of correctly counting the sources and accuracy in localizing their direction. Listeners with NH were able to reliably analyze scenes with up to four simultaneous talkers, while most listeners with hearing loss demonstrated errors even with two talkers at a time. Localization performance decreased in both groups with increasing number of talkers and was significantly poorer in listeners with HI. Overall performance was significantly correlated with hearing loss. This new method appears to be useful for estimating spatial abilities in realistic multitalker scenes. The method is sensitive to the number of sources in the scene, and to effects of sensorineural hearing loss.
Further work will be needed to compare this method to more traditional single-source localization tests. American Academy of Audiology.

  17. The Contribution of Cognitive Factors to Individual Differences in Understanding Noise-Vocoded Speech in Young and Older Adults

    PubMed Central

    Rosemann, Stephanie; Gießing, Carsten; Özyurt, Jale; Carroll, Rebecca; Puschmann, Sebastian; Thiel, Christiane M.

    2017-01-01

    Noise-vocoded speech is commonly used to simulate the sensation after cochlear implantation as it consists of spectrally degraded speech. High individual variability exists in learning to understand both noise-vocoded speech and speech perceived through a cochlear implant (CI). This variability is partly ascribed to differing cognitive abilities like working memory, verbal skills or attention. Although clinically highly relevant, up to now, no consensus has been achieved about which cognitive factors exactly predict the intelligibility of speech in noise-vocoded situations in healthy subjects or in patients after cochlear implantation. We aimed to establish a test battery that can be used to predict speech understanding in patients prior to receiving a CI. Young and old healthy listeners completed a noise-vocoded speech test in addition to cognitive tests tapping verbal memory, working memory, lexicon and retrieval skills as well as cognitive flexibility and attention. Partial-least-squares analysis revealed that six variables were important for significantly predicting vocoded-speech performance. These were the ability to perceive visually degraded speech tested by the Text Reception Threshold, vocabulary size assessed with the Multiple Choice Word Test, working memory gauged with the Operation Span Test, verbal learning and recall measured with the Verbal Learning and Retention Test, and task switching abilities tested by the Comprehensive Trail-Making Test. Thus, these cognitive abilities explain individual differences in noise-vocoded speech understanding and should be considered when aiming to predict hearing-aid outcome. PMID:28638329
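
    Noise vocoding itself can be sketched as a small channel vocoder: split the signal into a few frequency bands, extract each band's amplitude envelope, and use the envelopes to modulate band-limited noise. The band count, frequency range and filter order below are illustrative assumptions, not the parameters of this study.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(signal, fs, n_channels=4, f_lo=100.0, f_hi=6000.0):
    """Minimal noise vocoder: more channels -> finer spectral resolution."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)  # log-spaced band edges
    noise = np.random.default_rng(0).standard_normal(len(signal))
    out = np.zeros_like(signal)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(3, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, signal)
        envelope = np.abs(hilbert(band))   # amplitude envelope of this band
        carrier = sosfiltfilt(sos, noise)  # noise limited to the same band
        out += envelope * carrier
    # Match the overall RMS level of the input
    return out * np.sqrt(np.mean(signal**2) / np.mean(out**2))

fs = 16000
t = np.arange(fs) / fs
speechlike = np.sin(2 * np.pi * 300 * t) * (1 + 0.5 * np.sin(2 * np.pi * 4 * t))
vocoded = noise_vocode(speechlike, fs, n_channels=4)
```

    With only a few channels the output preserves the slow envelope fluctuations of the input while discarding spectral fine structure, which is what makes vocoded speech hard, but learnable, to understand.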

  18. Speech Understanding in Noise by Patients with Cochlear Implants Using a Monaural Adaptive Beamformer

    ERIC Educational Resources Information Center

    Dorman, Michael F.; Natale, Sarah; Spahr, Anthony; Castioni, Erin

    2017-01-01

    Purpose: The aim of this experiment was to compare, for patients with cochlear implants (CIs), the improvement for speech understanding in noise provided by a monaural adaptive beamformer and for two interventions that produced bilateral input (i.e., bilateral CIs and hearing preservation [HP] surgery). Method: Speech understanding scores for…

  19. Children's Auditory Working Memory Performance in Degraded Listening Conditions

    ERIC Educational Resources Information Center

    Osman, Homira; Sullivan, Jessica R.

    2014-01-01

    Purpose: The objectives of this study were to determine (a) whether school-age children with typical hearing demonstrate poorer auditory working memory performance in multitalker babble at degraded signal-to-noise ratios than in quiet; and (b) whether the amount of cognitive demand of the task contributed to differences in performance in noise. It…

  20. Evaluating the Effort Expended to Understand Speech in Noise Using a Dual-Task Paradigm: The Effects of Providing Visual Speech Cues

    ERIC Educational Resources Information Center

    Fraser, Sarah; Gagne, Jean-Pierre; Alepins, Majolaine; Dubois, Pascale

    2010-01-01

    Purpose: Using a dual-task paradigm, 2 experiments (Experiments 1 and 2) were conducted to assess differences in the amount of listening effort expended to understand speech in noise in audiovisual (AV) and audio-only (A-only) modalities. Experiment 1 had equivalent noise levels in both modalities, and Experiment 2 equated speech recognition…

  1. A Near-Infrared Spectroscopy Study on Cortical Hemodynamic Responses to Normal and Whispered Speech in 3- to 7-Year-Old Children

    ERIC Educational Resources Information Center

    Remijn, Gerard B.; Kikuchi, Mitsuru; Yoshimura, Yuko; Shitamichi, Kiyomi; Ueno, Sanae; Tsubokawa, Tsunehisa; Kojima, Haruyuki; Higashida, Haruhiro; Minabe, Yoshio

    2017-01-01

    Purpose: The purpose of this study was to assess cortical hemodynamic response patterns in 3- to 7-year-old children listening to two speech modes: normally vocalized and whispered speech. Understanding whispered speech requires processing of the relatively weak, noisy signal, as well as the cognitive ability to understand the speaker's reason for…

  2. Done Wrong or Said Wrong? Young Children Understand the Normative Directions of Fit of Different Speech Acts

    ERIC Educational Resources Information Center

    Rakoczy, Hannes; Tomasello, Michael

    2009-01-01

    Young children use and comprehend different kinds of speech acts from the beginning of their communicative development. But it is not clear how they understand the conventional and normative structure of such speech acts. In particular, imperative speech acts have a world-to-word direction of fit, such that their fulfillment means that the world…

  3. Associations between speech understanding and auditory and visual tests of verbal working memory: effects of linguistic complexity, task, age, and hearing loss

    PubMed Central

    Smith, Sherri L.; Pichora-Fuller, M. Kathleen

    2015-01-01

    Listeners with hearing loss commonly report having difficulty understanding speech, particularly in noisy environments. Their difficulties could be due to auditory and cognitive processing problems. Performance on speech-in-noise tests has been correlated with reading working memory span (RWMS), a measure often chosen to avoid the effects of hearing loss. If the goal is to assess the cognitive consequences of listeners’ auditory processing abilities, however, then listening working memory span (LWMS) could be a more informative measure. Some studies have examined the effects of different degrees and types of masking on working memory, but less is known about the demands placed on working memory depending on the linguistic complexity of the target speech or the task used to measure speech understanding in listeners with hearing loss. Compared to RWMS, LWMS measures using different speech targets and maskers may provide a more ecologically valid approach. To examine the contributions of RWMS and LWMS to speech understanding, we administered two working memory measures (a traditional RWMS measure and a new LWMS measure), and a battery of tests varying in the linguistic complexity of the speech materials, the presence of babble masking, and the task. Participants were a group of younger listeners with normal hearing and two groups of older listeners with hearing loss (n = 24 per group). There was a significant group difference and a wider range in performance on LWMS than on RWMS. There was a significant correlation between both working memory measures only for the oldest listeners with hearing loss. Notably, there were only a few significant correlations among the working memory and speech understanding measures. These findings suggest that working memory measures reflect individual differences that are distinct from those tapped by these measures of speech understanding. PMID:26441769

  4. Seven- to Nine-Year-Olds' Understandings of Speech Marks: Some Issues and Problems

    ERIC Educational Resources Information Center

    Hall, Nigel; Sing, Sue

    2011-01-01

    At first sight the speech mark would seem to be one of the easiest to use of all punctuation marks. After all, all one has to do is take the piece of speech or written language and surround it with the appropriately shaped marks. But, are speech marks as easy to understand and use as suggested above, especially for young children beginning their…

  5. The effect of presentation level and stimulation rate on speech perception and modulation detection for cochlear implant users.

    PubMed

    Brochier, Tim; McDermott, Hugh J; McKay, Colette M

    2017-06-01

    In order to improve speech understanding for cochlear implant users, it is important to maximize the transmission of temporal information. The combined effects of stimulation rate and presentation level on temporal information transfer and speech understanding remain unclear. The present study systematically varied presentation level (60, 50, and 40 dBA) and stimulation rate [500 and 2400 pulses per second per electrode (pps)] in order to observe how the effect of rate on speech understanding changes for different presentation levels. Speech recognition in quiet and noise, and acoustic amplitude modulation detection thresholds (AMDTs) were measured with acoustic stimuli presented to speech processors via direct audio input (DAI). With the 500 pps processor, results showed significantly better performance for consonant-vowel nucleus-consonant words in quiet, and a reduced effect of noise on sentence recognition. However, no rate or level effect was found for AMDTs, perhaps partly because of amplitude compression in the sound processor. AMDTs were found to be strongly correlated with the effect of noise on sentence perception at low levels. These results indicate that AMDTs, at least when measured with the CP910 Freedom speech processor via DAI, explain between-subject variance of speech understanding, but do not explain within-subject variance for different rates and levels.
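
    An AMDT stimulus is simply a carrier whose amplitude is modulated sinusoidally; the threshold is the smallest modulation depth a listener can distinguish from the unmodulated carrier. A sketch of the stimulus generation, with an assumed 8 Hz modulation rate and a noise carrier standing in for the experimental stimuli:

```python
import numpy as np

def apply_am(carrier, fs, mod_rate_hz, depth):
    """Sinusoidal amplitude modulation: y(t) = x(t) * (1 + m*sin(2*pi*fm*t)).

    depth is the modulation index m in [0, 1]; an AMDT is the smallest m
    the listener can reliably detect."""
    t = np.arange(len(carrier)) / fs
    return carrier * (1.0 + depth * np.sin(2 * np.pi * mod_rate_hz * t))

fs = 16000
noise_carrier = np.random.default_rng(0).standard_normal(fs)  # 1 s noise carrier
modulated = apply_am(noise_carrier, fs, mod_rate_hz=8.0, depth=0.3)
```

    An adaptive procedure would then vary `depth` across trials to converge on the detection threshold.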

  6. The Role of Visual Speech Information in Supporting Perceptual Learning of Degraded Speech

    ERIC Educational Resources Information Center

    Wayne, Rachel V.; Johnsrude, Ingrid S.

    2012-01-01

    Following cochlear implantation, hearing-impaired listeners must adapt to speech as heard through their prosthesis. Visual speech information (VSI; the lip and facial movements of speech) is typically available in everyday conversation. Here, we investigate whether learning to understand a popular auditory simulation of speech as transduced by a…

  7. TEACHER'S GUIDE TO HIGH SCHOOL SPEECH.

    ERIC Educational Resources Information Center

    Jenkinson, Edward B., Ed.

    This guide to high school speech focuses on speech as oral composition, stressing the importance of clear thinking and communication. The proposed 1-semester basic course in speech attempts to improve the student's ability to compose and deliver speeches, to think and listen critically, and to understand the social function of speech. In addition…

  8. Evidence of degraded representation of speech in noise, in the aging midbrain and cortex

    PubMed Central

    Simon, Jonathan Z.; Anderson, Samira

    2016-01-01

    Humans have a remarkable ability to track and understand speech in unfavorable conditions, such as in background noise, but speech understanding in noise does deteriorate with age. Results from several studies have shown that in younger adults, low-frequency auditory cortical activity reliably synchronizes to the speech envelope, even when the background noise is considerably louder than the speech signal. However, cortical speech processing may be limited by age-related decreases in the precision of neural synchronization in the midbrain. To understand better the neural mechanisms contributing to impaired speech perception in older adults, we investigated how aging affects midbrain and cortical encoding of speech when presented in quiet and in the presence of a single competing talker. Our results suggest that central auditory temporal processing deficits in older adults manifest in both the midbrain and the cortex. Specifically, midbrain frequency following responses to a speech syllable are more degraded in noise in older adults than in younger adults. This suggests a failure of the midbrain auditory mechanisms needed to compensate for the presence of a competing talker. Similarly, in cortical responses, older adults show larger reductions than younger adults in their ability to encode the speech envelope when a competing talker is added. Interestingly, older adults showed an exaggerated cortical representation of speech in both quiet and noise conditions, suggesting a possible imbalance between inhibitory and excitatory processes, or diminished network connectivity that may impair their ability to encode speech efficiently. PMID:27535374

  9. Speech Motor Control in Fluent and Dysfluent Speech Production of an Individual with Apraxia of Speech and Broca's Aphasia

    ERIC Educational Resources Information Center

    van Lieshout, Pascal H. H. M.; Bose, Arpita; Square, Paula A.; Steele, Catriona M.

    2007-01-01

    Apraxia of speech (AOS) is typically described as a motor-speech disorder with clinically well-defined symptoms, but without a clear understanding of the underlying problems in motor control. A number of studies have compared the speech of subjects with AOS to the fluent speech of controls, but only a few have included speech movement data and if…

  10. Comprehension of synthetic speech and digitized natural speech by adults with aphasia.

    PubMed

    Hux, Karen; Knollman-Porter, Kelly; Brown, Jessica; Wallace, Sarah E

    2017-09-01

    Using text-to-speech technology to provide simultaneous written and auditory content presentation may help compensate for chronic reading challenges if people with aphasia can understand synthetic speech output; however, inherent auditory comprehension challenges experienced by people with aphasia may make understanding synthetic speech difficult. This study's purpose was to compare the preferences and auditory comprehension accuracy of people with aphasia when listening to sentences generated with digitized natural speech, Alex synthetic speech (i.e., Macintosh platform), or David synthetic speech (i.e., Windows platform). The methodology required each of 20 participants with aphasia to select one of four images corresponding in meaning to each of 60 sentences comprising three stimulus sets. Results revealed significantly better accuracy given digitized natural speech than either synthetic speech option; however, individual participant performance analyses revealed three patterns: (a) comparable accuracy regardless of speech condition for 30% of participants, (b) comparable accuracy between digitized natural speech and one, but not both, synthetic speech option for 45% of participants, and (c) greater accuracy with digitized natural speech than with either synthetic speech option for remaining participants. Ranking and Likert-scale rating data revealed a preference for digitized natural speech and David synthetic speech over Alex synthetic speech. Results suggest many individuals with aphasia can comprehend synthetic speech options available on popular operating systems. Further examination of synthetic speech use to support reading comprehension through text-to-speech technology is thus warranted. Copyright © 2017 Elsevier Inc. All rights reserved.

  11. Speech after Mao: Literature and Belonging

    ERIC Educational Resources Information Center

    Hsieh, Victoria Linda

    2012-01-01

    This dissertation aims to understand the apparent failure of speech in post-Mao literature to fulfill its conventional functions of representation and communication. In order to understand this pattern, I begin by looking back on the utility of speech for nation-building in modern China. In addition to literary analysis of key authors and works,…

  12. Five Lectures on Artificial Intelligence

    DTIC Science & Technology

    1974-09-01

    large systems The current projects on speech understanding (which I will describe later) are an exception to this, dealing explicitly with the problem...learns that "Fred lives in Sydney", we must find some new fact to resolve the tension — perhaps he lives in a zoo. It is...possible Speech Understanding Systems Most of the problems described above might be characterized as relating to the chunking of knowledge. Such ideas are

  13. Development of a test battery for evaluating speech perception in complex listening environments.

    PubMed

    Brungart, Douglas S; Sheffield, Benjamin M; Kubli, Lina R

    2014-08-01

    In the real world, spoken communication occurs in complex environments that involve audiovisual speech cues, spatially separated sound sources, reverberant listening spaces, and other complicating factors that influence speech understanding. However, most clinical tools for assessing speech perception are based on simplified listening environments that do not reflect the complexities of real-world listening. In this study, speech materials from the QuickSIN speech-in-noise test by Killion, Niquette, Gudmundsen, Revit, and Banerjee [J. Acoust. Soc. Am. 116, 2395-2405 (2004)] were modified to simulate eight listening conditions spanning the range of auditory environments listeners encounter in everyday life. The standard QuickSIN test method was used to estimate 50% speech reception thresholds (SRT50) in each condition. A method of adjustment procedure was also used to obtain subjective estimates of the lowest signal-to-noise ratio (SNR) where the listeners were able to understand 100% of the speech (SRT100) and the highest SNR where they could detect the speech but could not understand any of the words (SRT0). The results show that the modified materials maintained most of the efficiency of the QuickSIN test procedure while capturing performance differences across listening conditions comparable to those reported in previous studies that have examined the effects of audiovisual cues, binaural cues, room reverberation, and time compression on the intelligibility of speech.
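
    Generating speech-in-noise test items of this kind requires mixing target speech and masker at a specified SNR. A minimal sketch of that scaling arithmetic follows; random noise stands in for the speech and babble recordings, and the function name is illustrative.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so that 10*log10(P_speech / P_noise) == snr_db,
    then add it to the speech. Noise is tiled/truncated to speech length."""
    noise = np.resize(noise, speech.shape)
    p_speech = np.mean(speech**2)
    p_noise = np.mean(noise**2)
    gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + gain * noise

rng = np.random.default_rng(1)
speech = rng.standard_normal(16000)  # stand-in for a speech token
babble = rng.standard_normal(16000)  # stand-in for multitalker babble
mixture = mix_at_snr(speech, babble, snr_db=5.0)  # speech 5 dB above babble
```

    An adaptive SRT procedure then steps `snr_db` down (or up) across trials until the listener scores 50% correct.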

  14. Speech Patterns and Racial Wage Inequality

    ERIC Educational Resources Information Center

    Grogger, Jeffrey

    2011-01-01

    Speech patterns differ substantially between whites and many African Americans. I collect and analyze speech data to understand the role that speech may play in explaining racial wage differences. Among blacks, speech patterns are highly correlated with measures of skill such as schooling and AFQT scores. They are also highly correlated with the…

  15. The Interpersonal Metafunction Analysis of Barack Obama's Victory Speech

    ERIC Educational Resources Information Center

    Ye, Ruijuan

    2010-01-01

    This paper presents a tentative interpersonal metafunction analysis of Barack Obama's victory speech, which aims to help readers understand and evaluate the speech regarding its suitability and thus to provide some guidance for readers to make better speeches. This study has promising implications for speeches as…

  16. The Effectiveness of Clear Speech as a Masker

    ERIC Educational Resources Information Center

    Calandruccio, Lauren; Van Engen, Kristin; Dhar, Sumitrajit; Bradlow, Ann R.

    2010-01-01

    Purpose: It is established that speaking clearly is an effective means of enhancing intelligibility. Because any signal-processing scheme modeled after known acoustic-phonetic features of clear speech will likely affect both target and competing speech, it is important to understand how speech recognition is affected when a competing speech signal…

  17. Investigation of potential cognitive tests for use with older adults in audiology clinics.

    PubMed

    Vaughan, Nancy; Storzbach, Daniel; Furukawa, Izumi

    2008-01-01

    Cognitive declines in working memory and processing speed are hallmarks of aging. Deficits in speech understanding are also seen in aging individuals. A clinical test to determine whether these cognitive aging changes contribute to age-related speech understanding difficulties would be helpful for determining rehabilitation strategies in audiology clinics. The aim was to identify a clinical neurocognitive test or battery of tests that could be used in audiology clinics to help explain deficits in speech recognition in some older listeners. A correlational study was conducted examining the association between certain cognitive test scores and speech recognition performance. Speeded (time-compressed) speech was used to increase the cognitive processing load. Two hundred twenty-five adults aged 50 through 75 years participated in this study. A selected battery of neurocognitive tests and a time-compressed speech recognition test battery using various rates of speech were administered to all participants in two separate sessions. Principal component analysis was used to extract the important component factors from each set of tests, and regression models were constructed to examine the associations between tests and to identify the neurocognitive test most strongly associated with speech recognition performance. A sequencing working memory test (Letter-Number Sequencing [LNS]) was most strongly associated with rapid speech understanding. The association between the LNS test results and the compressed sentence recognition scores (CSRS) was strong even when age and hearing loss were controlled. The LNS is a sequencing test that provides information about temporal processing at the cognitive level and may prove useful in the diagnosis of speech understanding problems and in the development of aural rehabilitation and training strategies.

  18. Relative Difficulty of Understanding Foreign Accents as a Marker of Proficiency

    ERIC Educational Resources Information Center

    Lev-Ari, Shiri; van Heugten, Marieke; Peperkamp, Sharon

    2017-01-01

    Foreign-accented speech is generally harder to understand than native-accented speech. This difficulty is reduced for non-native listeners who share their first language with the non-native speaker. It is currently unclear, however, how non-native listeners deal with foreign-accented speech produced by speakers of a different language. We show…

  19. The Atlanta Motor Speech Disorders Corpus: Motivation, Development, and Utility.

    PubMed

    Laures-Gore, Jacqueline; Russell, Scott; Patel, Rupal; Frankel, Michael

    2016-01-01

    This paper describes the design and collection of a comprehensive spoken language dataset from speakers with motor speech disorders in Atlanta, Ga., USA. This collaborative project aimed to gather a spoken database consisting of nonmainstream American English speakers residing in the Southeastern US in order to provide a more diverse perspective of motor speech disorders. Ninety-nine adults with an acquired neurogenic disorder resulting in a motor speech disorder were recruited. Stimuli include isolated vowels, single words, sentences with contrastive focus, sentences with emotional content and prosody, sentences with acoustic and perceptual sensitivity to motor speech disorders, as well as 'The Caterpillar' and 'The Grandfather' passages. Utility of this data in understanding the potential interplay of dialect and dysarthria was demonstrated with a subset of the speech samples existing in the database. The Atlanta Motor Speech Disorders Corpus will enrich our understanding of motor speech disorders through the examination of speech from a diverse group of speakers. © 2016 S. Karger AG, Basel.

  20. Neural Signatures of Phonetic Learning in Adulthood: A Magnetoencephalography Study

    PubMed Central

    Zhang, Yang; Kuhl, Patricia K.; Imada, Toshiaki; Iverson, Paul; Pruitt, John; Stevens, Erica B.; Kawakatsu, Masaki; Tohkura, Yoh'ichi; Nemoto, Iku

    2010-01-01

    The present study used magnetoencephalography (MEG) to examine perceptual learning of American English /r/ and /l/ categories by Japanese adults who had limited English exposure. A training software program was developed based on the principles of infant phonetic learning, featuring systematic acoustic exaggeration, multi-talker variability, visible articulation, and adaptive listening. The program was designed to help Japanese listeners utilize an acoustic dimension relevant for phonemic categorization of /r-l/ in English. Although training did not produce a native-like phonetic boundary along the /r-l/ synthetic continuum in the second-language learners, success was seen in highly significant identification improvement over twelve training sessions and transfer of learning to novel stimuli. Consistent with behavioral results, pre-post MEG measures showed not only enhanced neural sensitivity to the /r-l/ distinction in the left-hemisphere mismatch field (MMF) response but also bilateral decreases in equivalent current dipole (ECD) cluster and duration measures for stimulus coding in the inferior parietal region. The learning-induced increases in neural sensitivity and efficiency were also found in distributed source analysis using Minimum Current Estimates (MCE). Furthermore, the pre-post changes exhibited significant brain-behavior correlations between speech discrimination scores and MMF amplitudes as well as between the behavioral scores and ECD measures of neural efficiency. Together, the data provide corroborating evidence that substantial neural plasticity for second-language learning in adulthood can be induced with adaptive and enriched linguistic exposure. Like the MMF, the ECD cluster and duration measures are sensitive neural markers of phonetic learning. PMID:19457395

  1. Contemporary Reflections on Speech-Based Language Learning

    ERIC Educational Resources Information Center

    Gustafson, Marianne

    2009-01-01

    In "The Relation of Language to Mental Development and of Speech to Language Teaching," S.G. Davidson displayed several timeless insights into the role of speech in developing language and reasons for using speech as the basis for instruction for children who are deaf and hard of hearing. His understanding that speech includes more than merely…

  2. Voice technology and BBN

    NASA Technical Reports Server (NTRS)

    Wolf, Jared J.

    1977-01-01

    The following research was discussed: (1) speech signal processing; (2) automatic speech recognition; (3) continuous speech understanding; (4) speaker recognition; (5) speech compression; (6) subjective and objective evaluation of speech communication systems; (7) measurement of the intelligibility and quality of speech when degraded by noise or other masking stimuli; (8) speech synthesis; (9) instructional aids for second-language learning and for training of the deaf; and (10) investigation of speech correlates of psychological stress. Experimental psychology, control systems, and human factors engineering, which are often relevant to the proper design and operation of speech systems, are described.

  3. Speech and nonspeech: What are we talking about?

    PubMed

    Maas, Edwin

    2017-08-01

    Understanding of the behavioural, cognitive and neural underpinnings of speech production is of interest theoretically, and is important for understanding disorders of speech production and how to assess and treat such disorders in the clinic. This paper addresses two claims about the neuromotor control of speech production: (1) speech is subserved by a distinct, specialised motor control system and (2) speech is holistic and cannot be decomposed into smaller primitives. Both claims have gained traction in recent literature, and are central to a task-dependent model of speech motor control. The purpose of this paper is to stimulate thinking about speech production, its disorders and the clinical implications of these claims. The paper poses several conceptual and empirical challenges for these claims, including the critical importance of defining speech. The emerging conclusion is that a task-dependent model is called into question as its two central claims are founded on ill-defined and inconsistently applied concepts. The paper concludes with discussion of methodological and clinical implications, including the potential utility of diadochokinetic (DDK) tasks in assessment of motor speech disorders and the contraindication of nonspeech oral motor exercises to improve speech function.

  4. The effect of noise-induced hearing loss on the intelligibility of speech in noise

    NASA Astrophysics Data System (ADS)

    Smoorenburg, G. F.; Delaat, J. A. P. M.; Plomp, R.

    1981-06-01

    Speech reception thresholds, both in quiet and in noise, and tone audiograms were measured for 14 normal ears (7 subjects) and 44 ears (22 subjects) with noise-induced hearing loss. Maximum hearing loss in the 4-6 kHz region equalled 40 to 90 dB (losses exceeded by 90% and 10% of ears, respectively). Hearing loss for speech in quiet, measured with respect to the median speech reception threshold for normal ears, ranged from 1.8 dB to 13.4 dB. For speech in noise, the corresponding values are 1.2 dB to 7.0 dB, which means that the subjects with noise-induced hearing loss need a 1.2 to 7.0 dB higher signal-to-noise ratio than normal to understand sentences equally well. A hearing loss for speech of 1 dB corresponds to a decrease in sentence intelligibility of 15 to 20%. The relation between hearing handicap, conceived as a reduced ability to understand speech, and the tone audiogram is discussed. The higher signal-to-noise ratio needed by people with noise-induced hearing loss to understand speech in noisy environments is shown to be due partly to the decreased bandwidth of their hearing caused by the noise dip.
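    The dB bookkeeping in this record is easy to mis-read. A minimal sketch of the two quantities involved: the signal-to-noise ratio itself, and the rough conversion from an elevated speech reception threshold to lost intelligibility (the 17.5 %/dB slope below is simply the midpoint of the 15-20% range quoted in the abstract, not a measured constant):

```python
import math

def snr_db(signal_rms: float, noise_rms: float) -> float:
    """Signal-to-noise ratio in dB from RMS amplitudes."""
    return 20.0 * math.log10(signal_rms / noise_rms)

def intelligibility_drop(srt_shift_db: float,
                         slope_pct_per_db: float = 17.5) -> float:
    """Approximate drop in sentence intelligibility (percentage points)
    for a given elevation of the speech reception threshold, assuming a
    locally linear psychometric function with the stated slope."""
    return srt_shift_db * slope_pct_per_db

# A 2 dB SRT elevation at ~17.5 %/dB costs roughly 35 percentage points.
print(intelligibility_drop(2.0))  # 35.0
```

The linear slope only holds near the threshold; real psychometric functions flatten at both ends.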

  5. Communicating by Language: The Speech Process.

    ERIC Educational Resources Information Center

    House, Arthur S., Ed.

    This document reports on a conference focused on speech problems. The main objective of these discussions was to facilitate a deeper understanding of human communication through interaction of conference participants with colleagues in other disciplines. Topics discussed included speech production, feedback, speech perception, and development of…

  6. Speech and Communication Disorders

    MedlinePlus

    ... to being completely unable to speak or understand speech. Causes include Hearing disorders and deafness Voice problems, ... or those caused by cleft lip or palate Speech problems like stuttering Developmental disabilities Learning disorders Autism ...

  7. Development of Trivia Game for speech understanding in background noise.

    PubMed

    Schwartz, Kathryn; Ringleb, Stacie I; Sandberg, Hilary; Raymer, Anastasia; Watson, Ginger S

    2015-01-01

    Listening in noise is an everyday activity and poses a challenge for many people. To improve the ability to understand speech in noise, a computerized auditory rehabilitation game was developed. In Trivia Game players are challenged to answer trivia questions spoken aloud. As players progress through the game, the level of background noise increases. A study using Trivia Game was conducted as a proof-of-concept investigation in healthy participants. College students with normal hearing were randomly assigned to a control (n = 13) or a treatment (n = 14) group. Treatment participants played Trivia Game 12 times over a 4-week period. All participants completed objective (auditory-only and audiovisual formats) and subjective listening in noise measures at baseline and 4 weeks later. There were no statistical differences between the groups at baseline. At post-test, the treatment group significantly improved their overall speech understanding in noise in the audiovisual condition and reported significant benefits in their functional listening abilities. Playing Trivia Game improved speech understanding in noise in healthy listeners. Significant findings for the audiovisual condition suggest that participants improved face-reading abilities. Trivia Game may be a platform for investigating changes in speech understanding in individuals with sensory, linguistic and cognitive impairments.

  8. Is complex signal processing for bone conduction hearing aids useful?

    PubMed

    Kompis, Martin; Kurz, Anja; Pfiffner, Flurin; Senn, Pascal; Arnold, Andreas; Caversaccio, Marco

    2014-05-01

    To establish whether complex signal processing is beneficial for users of bone anchored hearing aids. Review and analysis of two studies from our own group, each comparing a speech processor with basic digital signal processing (either Baha Divino or Baha Intenso) and a processor with complex digital signal processing (either Baha BP100 or Baha BP110 power). The main differences between basic and complex signal processing are the number of audiologist accessible frequency channels and the availability and complexity of the directional multi-microphone noise reduction and loudness compression systems. Both studies show a small, statistically non-significant improvement of speech understanding in quiet with the complex digital signal processing. The average improvement for speech in noise is +0.9 dB, if speech and noise are emitted both from the front of the listener. If noise is emitted from the rear and speech from the front of the listener, the advantage of the devices with complex digital signal processing as opposed to those with basic signal processing increases, on average, to +3.2 dB (range +2.3 to +5.1 dB, p ≤ 0.0032). Complex digital signal processing does indeed improve speech understanding, especially in noise coming from the rear. This finding has been supported by another study, which has been published recently by a different research group. When compared to basic digital signal processing, complex digital signal processing can increase speech understanding of users of bone anchored hearing aids. The benefit is most significant for speech understanding in noise.

  9. Normal Aspects of Speech, Hearing, and Language.

    ERIC Educational Resources Information Center

    Minifie, Fred. D., Ed.; And Others

    This book is written as a guide to the understanding of the processes involved in human speech communication. Ten authorities contributed material to provide an introduction to the physiological aspects of speech production and reception, the acoustical aspects of speech production and transmission, the psychophysics of sound reception, the nature…

  10. Speech Planning Happens before Speech Execution: Online Reaction Time Methods in the Study of Apraxia of Speech

    ERIC Educational Resources Information Center

    Maas, Edwin; Mailend, Marja-Liisa

    2012-01-01

    Purpose: The purpose of this article is to present an argument for the use of online reaction time (RT) methods to the study of apraxia of speech (AOS) and to review the existing small literature in this area and the contributions it has made to our fundamental understanding of speech planning (deficits) in AOS. Method: Following a brief…

  11. The benefits of remote microphone technology for adults with cochlear implants.

    PubMed

    Fitzpatrick, Elizabeth M; Séguin, Christiane; Schramm, David R; Armstrong, Shelly; Chénier, Josée

    2009-10-01

    Cochlear implantation has become a standard practice for adults with severe to profound hearing loss who demonstrate limited benefit from hearing aids. Despite the substantial auditory benefits provided by cochlear implants, many adults experience difficulty understanding speech in noisy environments and in other challenging listening conditions such as television. Remote microphone technology may provide some benefit in these situations; however, little is known about whether these systems are effective in improving speech understanding in difficult acoustic environments for this population. This study was undertaken with adult cochlear implant recipients to assess the potential benefits of remote microphone technology. The objectives were to examine the measurable and perceived benefit of remote microphone devices during television viewing and to assess the benefits of a frequency-modulated system for speech understanding in noise. Fifteen adult unilateral cochlear implant users were fit with remote microphone devices in a clinical environment. The study used a combination of direct measurements and patient perceptions to assess speech understanding with and without remote microphone technology. The direct measures involved a within-subject repeated-measures design. Direct measures of patients' speech understanding during television viewing were collected using their cochlear implant alone and with their implant device coupled to an assistive listening device. Questionnaires were administered to document patients' perceptions of benefits during the television-listening tasks. Speech recognition tests of open-set sentences in noise with and without remote microphone technology were also administered. Participants showed improved speech understanding for television listening when using remote microphone devices coupled to their cochlear implant compared with a cochlear implant alone. 
This benefit was documented when listening to both news and talk-show recordings. Questionnaire results also showed statistically significant differences between listening with a cochlear implant alone and listening with a remote microphone device. Participants judged that remote microphone technology provided them with better comprehension, more confidence, and greater ease of listening. Use of a frequency-modulated system coupled to a cochlear implant also showed significant improvement over a cochlear implant alone for open-set sentence recognition at +10 and +5 dB signal-to-noise ratios. Benefits were measured during remote microphone use in focused-listening situations in a clinical setting, for both television viewing and speech understanding in noise in the audiometric sound suite. The results suggest that adult cochlear implant users should be counseled regarding the potential for enhanced speech understanding in difficult listening environments through the use of remote microphone technology.
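    Test conditions such as the +10 and +5 dB SNRs above are produced by scaling a noise track against a speech track before mixing. A minimal sketch of that step (the array names and the RMS-based SNR definition are illustrative, not taken from the study):

```python
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale `noise` so the speech-to-noise RMS ratio equals `snr_db`,
    then mix. Assumes equal-length, nonzero arrays."""
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20.0))
    return speech + gain * noise

# Synthetic one-second stand-ins for a sentence and multi-talker babble.
rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)
babble = rng.standard_normal(16000)
mixed = mix_at_snr(speech, babble, 10.0)  # +10 dB condition
```

In practice the mixed signal is then presented at a fixed overall level, so only the ratio, not the absolute loudness, varies between conditions.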

  12. Understanding the Oral Mind: Implications for Speech Education.

    ERIC Educational Resources Information Center

    Cocetti, Robert A.

    The primary goal of the basic course in speech should be to investigate oral communication rather than public speaking. Fundamental to understanding oral communication is some understanding of the oral mind, which operates when orality is the primary means of expression. Since narrative invites action rather than leisurely analysis, the oral mind…

  13. Visually Impaired Persons' Comprehension of Text Presented with Speech Synthesis.

    ERIC Educational Resources Information Center

    Hjelmquist, E.; And Others

    1992-01-01

    This study of 48 individuals with visual impairments (16 middle-aged with experience in synthetic speech, 16 middle-aged inexperienced, and 16 older inexperienced) found that speech synthesis, compared to natural speech, generally yielded lower results with respect to memory and understanding of texts. Experience had no effect on performance.…

  14. Five-year speech and language outcomes in children with cleft lip-palate.

    PubMed

    Prathanee, Benjamas; Pumnum, Tawitree; Seepuaham, Cholada; Jaiyong, Pechcharat

    2016-10-01

    To investigate 5-year speech and language outcomes in children with cleft lip/palate (CLP). Thirty-eight children aged 4 years to 7 years, 8 months were recruited for this study. Speech abilities including articulation, resonance, voice, and intelligibility were assessed based on the Thai Universal Parameters of Speech Outcomes. Language ability was assessed by the Language Screening Test. The findings revealed that children with clefts had speech and language delay, abnormal understandability, resonance abnormality, voice disturbance, and articulation defects, at rates of 8.33 (1.75, 22.47), 50.00 (32.92, 67.08), 36.11 (20.82, 53.78), 30.56 (16.35, 48.11), and 94.44 (81.34, 99.32) percent, respectively. Articulation errors were the most common speech and language defects in children with clefts, followed by abnormal understandability, resonance abnormality, and voice disturbance. These results should be of critical concern. Protocol review and early intervention programs are needed for improved speech outcomes.
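    The paired values in parentheses read like exact binomial confidence intervals around each percentage. A generic sketch of the Clopper-Pearson computation by bisection on the binomial tails; that these are 95% intervals on roughly 36 assessable children is an inference from the numbers, not stated in the record:

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def clopper_pearson(k: int, n: int, alpha: float = 0.05):
    """Exact (Clopper-Pearson) CI for a binomial proportion k/n,
    found by bisection on the binomial tail probabilities."""
    def solve(too_low):
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if too_low(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    lower = 0.0 if k == 0 else solve(lambda p: 1 - binom_cdf(k - 1, n, p) < alpha / 2)
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p) > alpha / 2)
    return lower, upper

# e.g. 18 of 36 children: compare with the reported 50.00 (32.92, 67.08).
lower, upper = clopper_pearson(18, 36)
```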

  15. Objective support for subjective reports of successful inner speech in two people with aphasia.

    PubMed

    Hayward, William; Snider, Sarah F; Luta, George; Friedman, Rhonda B; Turkeltaub, Peter E

    2016-01-01

    People with aphasia frequently report being able to say a word correctly in their heads, even if they are unable to say that word aloud. It is difficult to know what is meant by these reports of "successful inner speech". We probe the experience of successful inner speech in two people with aphasia. We show that these reports are associated with correct overt speech and phonologically related nonword errors, that they relate to word characteristics associated with ease of lexical access but not ease of production, and that they predict whether or not individual words are relearned during anomia treatment. These findings suggest that reports of successful inner speech are meaningful and may be useful to study self-monitoring in aphasia, to better understand anomia, and to predict treatment outcomes. Ultimately, the study of inner speech in people with aphasia could provide critical insights that inform our understanding of normal language.

  16. Maps and streams in the auditory cortex: nonhuman primates illuminate human speech processing

    PubMed Central

    Rauschecker, Josef P; Scott, Sophie K

    2010-01-01

    Speech and language are considered uniquely human abilities: animals have communication systems, but they do not match human linguistic skills in terms of recursive structure and combinatorial power. Yet, in evolution, spoken language must have emerged from neural mechanisms at least partially available in animals. In this paper, we will demonstrate how our understanding of speech perception, one important facet of language, has profited from findings and theory in nonhuman primate studies. Chief among these are physiological and anatomical studies showing that primate auditory cortex, across species, shows patterns of hierarchical structure, topographic mapping and streams of functional processing. We will identify roles for different cortical areas in the perceptual processing of speech and review functional imaging work in humans that bears on our understanding of how the brain decodes and monitors speech. A new model connects structures in the temporal, frontal and parietal lobes linking speech perception and production. PMID:19471271

  17. My speech problem, your listening problem, and my frustration: the experience of living with childhood speech impairment.

    PubMed

    McCormack, Jane; McLeod, Sharynne; McAllister, Lindy; Harrison, Linda J

    2010-10-01

    The purpose of this article was to understand the experience of speech impairment (speech sound disorders) in everyday life as described by children with speech impairment and their communication partners. Interviews were undertaken with 13 preschool children with speech impairment (mild to severe) and 21 significant others (family members and teachers). A phenomenological analysis of the interview transcripts revealed 2 global themes regarding the experience of living with speech impairment for these children and their families. The first theme encompassed the problems experienced by participants, namely (a) the child's inability to "speak properly," (b) the communication partner's failure to "listen properly," and (c) frustration caused by the speaking and listening problems. The second theme described the solutions participants used to overcome the problems. Solutions included (a) strategies to improve the child's speech accuracy (e.g., home practice, speech-language pathology) and (b) strategies to improve the listener's understanding (e.g., using gestures, repetition). Both short- and long-term solutions were identified. Successful communication is dependent on the skills of speakers and listeners. Intervention with children who experience speech impairment needs to reflect this reciprocity by supporting both the speaker and the listener and by addressing the frustration they experience.

  18. Vowel reduction across tasks for male speakers of American English.

    PubMed

    Kuo, Christina; Weismer, Gary

    2016-07-01

    This study examined acoustic variation of vowels within speakers across speech tasks. The overarching goal of the study was to understand within-speaker variation as one index of the range of normal speech motor behavior for American English vowels. Ten male speakers of American English performed four speech tasks including citation form sentence reading with a clear-speech style (clear-speech), citation form sentence reading (citation), passage reading (reading), and conversational speech (conversation). Eight monophthong vowels in a variety of consonant contexts were studied. Clear-speech was operationally defined as the reference point for describing variation. Acoustic measures associated with the conventions of vowel targets were obtained and examined. These included temporal midpoint formant frequencies for the first three formants (F1, F2, and F3) and the derived Euclidean distances in the F1-F2 and F2-F3 planes. Results indicated that reduction toward the center of the F1-F2 and F2-F3 planes increased in magnitude across the tasks in the order of clear-speech, citation, reading, and conversation. The cross-task variation was comparable for all speakers despite fine-grained individual differences. The characteristics of systematic within-speaker acoustic variation across tasks have potential implications for the understanding of the mechanisms of speech motor control and motor speech disorders.
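    The "Euclidean distances in the F1-F2 plane" used here are plain two-dimensional distances between formant targets, so centralization shows up as a shorter distance from the talker's vowel-space center. A toy illustration; all formant values below are invented, not measured data from the study:

```python
import math

def euclidean(f_a, f_b) -> float:
    """Euclidean distance between two formant points, e.g. (F1, F2) in Hz."""
    return math.dist(f_a, f_b)

# Hypothetical temporal-midpoint formant targets (Hz) for /i/ in two tasks,
# and an assumed center of the talker's F1-F2 working space.
clear = (280.0, 2300.0)
conversation = (340.0, 2050.0)
center = (500.0, 1500.0)

# Reduction toward the center: conversational /i/ lies closer to it.
print(euclidean(clear, center) > euclidean(conversation, center))  # True
```

The same comparison in the F2-F3 plane just swaps in the (F2, F3) coordinates.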

  19. Redistribution of neural phase coherence reflects establishment of feedforward map in speech motor adaptation

    PubMed Central

    Sengupta, Ranit

    2015-01-01

    Despite recent progress in our understanding of sensorimotor integration in speech learning, a comprehensive framework to investigate its neural basis is lacking at behaviorally relevant timescales. Structural and functional imaging studies in humans have helped us identify brain networks that support speech but fail to capture the precise spatiotemporal coordination within the networks that takes place during speech learning. Here we use neuronal oscillations to investigate interactions within speech motor networks in a paradigm of speech motor adaptation under altered feedback with continuous recording of EEG, in which subjects adapted to the real-time auditory perturbation of a target vowel sound. As subjects adapted to the task, concurrent changes were observed in the theta-gamma phase coherence during speech planning across several distinct scalp regions, consistent with the establishment of a feedforward map. In particular, there was an increase in coherence over the central region and a decrease over the fronto-temporal regions, revealing a redistribution of coherence over an interacting network of brain regions that may be a general feature of error-based motor learning. Our findings have implications for understanding the neural basis of speech motor learning and could elucidate how transient breakdown of neuronal communication within speech networks relates to speech disorders. PMID:25632078
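    Cross-frequency phase coherence of the kind described is commonly quantified with a phase-locking value over the phase-difference series. A generic sketch, not the authors' pipeline; the phase series below are synthetic:

```python
import numpy as np

def phase_locking_value(phase_a: np.ndarray, phase_b: np.ndarray) -> float:
    """Phase-locking value |mean exp(i*(phase_a - phase_b))|, in [0, 1].
    1 = perfectly constant phase difference; near 0 = no phase relation."""
    return float(np.abs(np.mean(np.exp(1j * (phase_a - phase_b)))))

t = np.arange(0, 2.0, 1 / 500.0)       # 2 s at 500 Hz
theta = 2 * np.pi * 6 * t              # 6 Hz phase ramp
locked = theta + 0.8                   # constant phase lag: fully coherent
print(round(phase_locking_value(theta, locked), 3))  # 1.0

rng = np.random.default_rng(1)
random_phase = rng.uniform(0, 2 * np.pi, t.size)
print(phase_locking_value(theta, random_phase) < 0.2)  # True
```

In EEG work the phase series would come from band-pass filtering followed by a Hilbert transform, computed per electrode and per trial epoch.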

  20. Poorer verbal working memory for a second language selectively impacts academic achievement in university medical students

    PubMed Central

    Mann, Collette; Canny, Benedict J.; Reser, David H.; Rajan, Ramesh

    2013-01-01

    Working memory (WM) is often poorer for a second language (L2). In low noise conditions, people listening to a language other than their first language (L1) may have similar auditory perception skills for that L2 as native listeners, but do worse in high noise conditions, and this has been attributed to the poorer WM for L2. Given that WM is critical for academic success in children and young adults, these speech in noise effects have implications for academic performance where the language of instruction is L2 for a student. We used a well-established Speech-in-Noise task as a verbal WM (vWM) test, and developed a model correlating vWM and measures of English proficiency and/or usage to scholastic outcomes in a multi-faceted assessment medical education program. Significant differences in Speech-Noise Ratio (SNR50) values were observed between medical undergraduates who had learned English before or after five years of age, with the latter group doing worse in the ability to extract whole connected speech in the presence of background multi-talker babble (Student-t tests, p < 0.001). Significant negative correlations were observed between the SNR50 and seven of the nine variables of English usage, learning styles, stress, and musical abilities in a questionnaire administered to the students previously. The remaining two variables, the Perceived Stress Scale (PSS) and the Age of Acquisition of English (AoAoE), were significantly positively correlated with the SNR50, showing that those with a poorer capacity to discriminate simple English sentences from noise had learnt English later in life and had higher levels of stress, all characteristics of the international students. Local students exhibited significantly lower SNR50 scores and were significantly younger when they first learnt English. No significant correlation was detected between the SNR50 and the students' Visual/Verbal Learning Style (r = -0.023). Standard multiple regression was carried out to assess the relationship between language proficiency and verbal working memory (SNR50) using 5 variables of L2 proficiency, with the results showing that the variance in SNR50 was significantly predicted by this model (r2 = 0.335). Hierarchical multiple regression was then used to test the ability of three independent variable measures (SNR50, age of acquisition of English, and English proficiency) to predict academic performance as the dependent variable in a factor analysis model, which predicted significant performance differences in an assessment requiring communication skills (p = 0.008), but not on a companion assessment requiring knowledge of procedural skills, or other assessments requiring factual knowledge. Thus, impaired vWM for an L2 appears to affect specific communications-based assessments in university medical students. PMID:23638357
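    The "standard multiple regression" step (five predictors of SNR50, r2 = 0.335) is ordinary least squares with an intercept. A sketch of the r2 computation; the data below are synthetic stand-ins, not the study's:

```python
import numpy as np

def r_squared(X: np.ndarray, y: np.ndarray) -> float:
    """Fraction of variance in y explained by a least-squares fit on X
    (an intercept column is added internally)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid.var() / y.var()

# Synthetic stand-in: five "proficiency" predictors and an SNR50 outcome.
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 5))
y = X @ np.array([0.6, -0.4, 0.2, 0.0, 0.3]) + rng.standard_normal(60)
print(0.0 <= r_squared(X, y) <= 1.0)  # True
```

With the intercept included, the residual mean is zero, so the variance ratio equals 1 - SS_res/SS_tot, the textbook r2.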

  2. Auditory Brainstem Response to Complex Sounds Predicts Self-Reported Speech-in-Noise Performance

    ERIC Educational Resources Information Center

    Anderson, Samira; Parbery-Clark, Alexandra; White-Schwoch, Travis; Kraus, Nina

    2013-01-01

    Purpose: To compare the ability of the auditory brainstem response to complex sounds (cABR) to predict subjective ratings of speech understanding in noise on the Speech, Spatial, and Qualities of Hearing Scale (SSQ; Gatehouse & Noble, 2004) relative to the predictive ability of the Quick Speech-in-Noise test (QuickSIN; Killion, Niquette,…

  3. Family-Centered Services for Children with ASD and Limited Speech: The Experiences of Parents and Speech-Language Pathologists

    ERIC Educational Resources Information Center

    Mandak, Kelsey; Light, Janice

    2018-01-01

    Although family-centered services have long been discussed as essential in providing successful services to families of children with autism spectrum disorder (ASD), ideal implementation is often lacking. This study aimed to increase understanding of how families with children with ASD and limited speech receive services from speech-language…

  4. Association of Orofacial Muscle Activity and Movement during Changes in Speech Rate and Intensity

    ERIC Educational Resources Information Center

    McClean, Michael D.; Tasko, Stephen M.

    2003-01-01

    Understanding how orofacial muscle activity and movement covary across changes in speech rate and intensity has implications for the neural control of speech production and the use of clinical procedures that manipulate speech prosody. The present study involved a correlation analysis relating average lower-lip and jaw-muscle activity to lip and…

  5. Refinement of Speech Breathing in Healthy 4- to 6-Year-Old Children

    ERIC Educational Resources Information Center

    Boliek, Carol A.; Hixon, Thomas J.; Watson, Peter J.; Jones, Patricia B.

    2009-01-01

    Purpose: The purpose of this study was to offer a better understanding of the development of neuromotor control for speech breathing and provide a normative data set that can serve as a useful standard for clinical evaluation and management of young children with speech disorders involving the breathing subsystem. Method: Speech breathing was…

  6. Speech understanding in noise with an eyeglass hearing aid: asymmetric fitting and the head shadow benefit of anterior microphones.

    PubMed

    Mens, Lucas H M

    2011-01-01

    To test speech understanding in noise using array microphones integrated in an eyeglass device, and to test whether microphones placed anteriorly at the temple provide better directivity than microphones above the pinna. Sentences were presented from the front and uncorrelated noise from 45, 135, 225 and 315°. Fifteen hearing impaired participants with a significant speech discrimination loss were included, as well as 5 normal hearing listeners. The device (Varibel) improved speech understanding in noise compared to most conventional directional devices, with a directional benefit of 5.3 dB in the asymmetric fit mode, which was not significantly different from the bilateral fully directional mode (6.3 dB). Anterior microphones outperformed microphones at a conventional position above the pinna by 2.6 dB. By integrating microphones in an eyeglass frame, a long array can be used, resulting in a higher directivity index and improved speech understanding in noise. An asymmetric fit did not significantly reduce performance and can be considered to increase acceptance and environmental awareness. Directional microphones at the temple seemed to profit more from the head shadow than those above the pinna, better suppressing noise from behind the listener.
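    The claim that a longer array yields a higher directivity index can be illustrated with an idealized delay-and-sum line array steered to endfire (free field, no head shadow; the mic count, spacing, and frequency below are invented for illustration, not the Varibel's geometry):

```python
import numpy as np

def endfire_response(n_mics: int, spacing_m: float, freq_hz: float,
                     angle_deg: float, c: float = 343.0) -> float:
    """Magnitude response of an n-mic delay-and-sum line array steered to
    endfire (0 deg) for a plane wave arriving from angle_deg, normalized
    so the on-axis response is 1."""
    k = 2 * np.pi * freq_hz / c               # wavenumber
    d = spacing_m * np.arange(n_mics)         # mic positions along the array
    # Steering delays cancel the 0-deg arrival; off-axis waves sum out of phase.
    phases = k * d * (np.cos(np.radians(angle_deg)) - 1.0)
    return float(np.abs(np.mean(np.exp(1j * phases))))

# A longer (6-mic) array attenuates sound from behind more than a 2-mic one.
front = endfire_response(6, 0.02, 2000.0, 0.0)
rear6 = endfire_response(6, 0.02, 2000.0, 180.0)
rear2 = endfire_response(2, 0.02, 2000.0, 180.0)
print(front)          # 1.0
print(rear6 < rear2)  # True
```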

  7. How may the basal ganglia contribute to auditory categorization and speech perception?

    PubMed Central

    Lim, Sung-Joo; Fiez, Julie A.; Holt, Lori L.

    2014-01-01

    Listeners must accomplish two complementary perceptual feats in extracting a message from speech. They must discriminate linguistically-relevant acoustic variability and generalize across irrelevant variability. Said another way, they must categorize speech. Since the mapping of acoustic variability is language-specific, these categories must be learned from experience. Thus, understanding how, in general, the auditory system acquires and represents categories can inform us about the toolbox of mechanisms available to speech perception. This perspective invites consideration of findings from cognitive neuroscience literatures outside of the speech domain as a means of constraining models of speech perception. Although neurobiological models of speech perception have mainly focused on cerebral cortex, research outside the speech domain is consistent with the possibility of significant subcortical contributions in category learning. Here, we review the functional role of one such structure, the basal ganglia. We examine research from animal electrophysiology, human neuroimaging, and behavior to consider characteristics of basal ganglia processing that may be advantageous for speech category learning. We also present emerging evidence for a direct role for basal ganglia in learning auditory categories in a complex, naturalistic task intended to model the incidental manner in which speech categories are acquired. To conclude, we highlight new research questions that arise in incorporating the broader neuroscience research literature in modeling speech perception, and suggest how understanding contributions of the basal ganglia can inform attempts to optimize training protocols for learning non-native speech categories in adulthood. PMID:25136291

  8. Clear Speech Modifications in Children Aged 6-10

    NASA Astrophysics Data System (ADS)

    Taylor, Griffin Lijding

    Modifications to speech production made by adult talkers in response to instructions to speak clearly have been well documented in the literature. Targeting adult populations has been motivated by efforts to improve speech production for the benefit of their communication partners; however, many adults also have communication partners who are children. Surprisingly, there is limited literature on whether children can change their speech production when cued to speak clearly. Pettinato, Tuomainen, Granlund, and Hazan (2016) showed that by age 12, children exhibited enlarged vowel space areas and reduced articulation rate when prompted to speak clearly, but did not produce any other adult-like clear speech modifications in connected speech. Moreover, Syrett and Kawahara (2013) suggested that preschoolers produced longer and more intense vowels when prompted to speak clearly at the word level. These findings contrast with adult talkers, who show significant temporal and spectral differences between speech produced in control and clear speech conditions. Therefore, it was the purpose of this study to analyze changes in temporal and spectral characteristics of speech production that children aged 6-10 made in these experimental conditions. It is important to elucidate the clear speech profile of this population to better understand which adult-like clear speech modifications they make spontaneously and which modifications are still developing. Understanding these baselines will advance future studies that measure the impact of more explicit instructions and children's abilities to better accommodate their interlocutors, which is a critical component of children's pragmatic and speech-motor development.

  9. Sound-direction identification, interaural time delay discrimination, and speech intelligibility advantages in noise for a bilateral cochlear implant user.

    PubMed

    Van Hoesel, Richard; Ramsden, Richard; Odriscoll, Martin

    2002-04-01

To characterize some of the benefits available from using two cochlear implants compared with just one, sound-direction identification (ID) abilities, sensitivity to interaural time delays (ITDs), and speech intelligibility in noise were measured for a bilateral multi-channel cochlear implant user. Sound-direction ID in the horizontal plane was tested both unilaterally and bilaterally using two independent behind-the-ear ESPRIT (Cochlear Ltd.) processors, as well as bilaterally using custom research processors. Pink noise bursts were presented using an 11-loudspeaker array spanning the subject's frontal 180-degree arc in an anechoic room. After each burst, the subject was asked to identify which loudspeaker had produced the sound. No explicit training and no feedback were given. Presentation levels were nominally 70 dB SPL, except for a repeat experiment using the clinical devices in which the presentation level was reduced to 60 dB SPL to avoid activation of the devices' automatic gain control (AGC) circuits. Overall presentation levels were randomly varied by +/- 3 dB. For the research processors, a "low-update-rate" and a "high-update-rate" strategy were tested. Direct measurements of ITD just noticeable differences (JNDs) were made using a three-alternative forced-choice (3AFC) paradigm targeting 70% correct performance on the psychometric function. Stimuli included simple, low-rate electrical pulse trains as well as high-rate pulse trains modulated at 100 Hz. Speech data comparing monaural and binaural performance in noise were also collected with both low- and high-update-rate strategies on the research processors. Open-set sentences were presented from directly in front of the subject, and competing multi-talker babble noise was presented from the same loudspeaker or from a loudspeaker placed 90 degrees to the left or right of the subject.
For the sound-direction ID task, monaural performance using the clinical devices showed large mean absolute errors of 81 degrees and 73 degrees, with standard deviations (averaged across all 11 loudspeakers) of 10 degrees and 17 degrees, for left and right ears, respectively. For bilateral device use at a presentation level of 70 dB SPL, the mean error improved to about 16 degrees with an average standard deviation of 18 degrees. When the presentation level was decreased to 60 dB SPL to avoid activation of the automatic gain control (AGC) circuits in the clinical processors, the mean response error improved further to 8 degrees with a standard deviation of 13 degrees. Further tests with the custom research processors, which had a higher stimulation rate and did not include AGCs, showed comparable response errors: around 8 or 9 degrees, with a standard deviation of about 11 degrees, for both update rates. The best ITD JNDs measured for this subject were between 350 and 400 microsec for simple low-rate pulse trains. Speech results showed a substantial headshadow advantage for bilateral device use when speech and noise were spatially separated, but little evidence of binaural unmasking. For spatially coincident speech and noise, listening with both ears showed similar results to listening with either side alone when loudness summation was compensated for. No significant differences were observed between binaural results for high and low update rates in any test configuration. Only for monaural listening in one test configuration did the high rate show a small but significant improvement over the low rate. Results show that even if interaural time delay cues are not well coded or perceived, bilateral implants can offer important advantages, both for speech in noise and for sound-direction identification.
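The localization summary statistics reported above (mean absolute error, with response standard deviations averaged across loudspeakers) are straightforward to compute from paired target/response azimuths. A minimal sketch: the 11-speaker frontal-arc geometry follows the abstract, but the trial count and simulated response errors are illustrative assumptions, not the study's data.

```python
import numpy as np

speaker_az = np.linspace(-90, 90, 11)          # 11 loudspeakers, frontal 180 deg
targets = np.repeat(speaker_az, 5)             # 5 hypothetical trials per speaker
rng = np.random.default_rng(0)
responses = targets + rng.normal(0, 12, size=targets.size)  # made-up responses

# Mean absolute error across all trials: the headline statistic in the abstract.
mae = np.mean(np.abs(responses - targets))

# Response standard deviation per loudspeaker, then averaged, matching the
# "averaged across all 11 loudspeakers" description.
per_speaker_sd = [responses[targets == az].std(ddof=1) for az in speaker_az]
mean_sd = np.mean(per_speaker_sd)
print(f"MAE = {mae:.1f} deg, mean per-speaker SD = {mean_sd:.1f} deg")
```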

  10. Status Report on Speech Research: A Report on the Status and Progress of Studies on the Nature of Speech, Instrumentation for Its Investigation, and Practical Applications, April 1-September 30, 1986.

    ERIC Educational Resources Information Center

    O'Brien, Nancy, Ed.

    Focusing on the status, progress, instrumentation, and applications of studies on the nature of speech, this report contains the following research studies: "The Role of Psychophysics in Understanding Speech Perception" (B. H. Repp); "Specialized Perceiving Systems for Speech and Other Biologically Significant Sounds" (I. G. Mattingly; A. M.…

  11. Sound Source Localization and Speech Understanding in Complex Listening Environments by Single-sided Deaf Listeners After Cochlear Implantation.

    PubMed

    Zeitler, Daniel M; Dorman, Michael F; Natale, Sarah J; Loiselle, Louise; Yost, William A; Gifford, Rene H

    2015-09-01

To assess improvements in sound source localization and speech understanding in complex listening environments after unilateral cochlear implantation for single-sided deafness (SSD). Nonrandomized, open, prospective case series. Tertiary referral center. Nine subjects with a unilateral cochlear implant (CI) for SSD (SSD-CI) were tested. Reference groups for the task of sound source localization included young (n = 45) and older (n = 12) normal-hearing (NH) subjects and 27 bilateral CI (BCI) subjects. Unilateral cochlear implantation. Sound source localization was tested with 13 loudspeakers in a 180-degree arc in front of the subject. Speech understanding was tested with the subject seated in an 8-loudspeaker sound system arrayed in a 360-degree pattern. Directionally appropriate noise, originally recorded in a restaurant, was played from each loudspeaker. Speech understanding in noise was tested using the AzBio sentence test, and sound source localization was quantified using root-mean-square error. All CI subjects showed poorer-than-normal sound source localization. SSD-CI subjects showed a bimodal distribution of scores: six subjects had scores near the mean of those obtained by BCI subjects, whereas three had scores just outside the 95th percentile of NH listeners. Speech understanding improved significantly in the restaurant environment when the signal was presented to the side of the CI. Cochlear implantation for SSD can offer improved speech understanding in complex listening environments and improved sound source localization in both children and adults. On tasks of sound source localization, SSD-CI patients typically perform as well as BCI patients and, in some cases, achieve scores at the upper boundary of normal performance.
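Root-mean-square error, the localization metric named above, can be computed directly from paired target and response azimuths. A minimal sketch with hypothetical trial data over a 13-loudspeaker frontal arc (the function name and the constant-bias example are illustrative, not from the study):

```python
import numpy as np

def localization_rms_error(target_deg, response_deg):
    """Root-mean-square error between target and response azimuths, in degrees."""
    t = np.asarray(target_deg, dtype=float)
    r = np.asarray(response_deg, dtype=float)
    return np.sqrt(np.mean((r - t) ** 2))

# Hypothetical trial block over the 13-loudspeaker frontal arc: a listener
# whose every response is shifted 10 degrees to one side.
targets = np.linspace(-90, 90, 13)
responses = targets + 10.0
print(localization_rms_error(targets, responses))  # prints 10.0
```

A constant 10-degree response bias yields an RMS error of exactly 10 degrees, which makes the metric easy to sanity-check.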

  12. Spectro-temporal cues enhance modulation sensitivity in cochlear implant users

    PubMed Central

    Zheng, Yi; Escabí, Monty; Litovsky, Ruth Y.

    2018-01-01

Although speech understanding is highly variable amongst cochlear implant (CI) subjects, the remarkably high speech recognition performance of many CI users is unexpected and not well understood. Numerous factors, including neural health and degradation of the spectral information in the speech signal of CIs, likely contribute to speech understanding. We studied the ability to use spectro-temporal modulations, which may be critical for speech understanding and discrimination, and hypothesized that CI users adopt a different perceptual strategy than normal-hearing (NH) individuals, whereby they rely more heavily on joint spectro-temporal cues to enhance detection of auditory cues. Modulation detection sensitivity was studied in CI users and NH subjects using broadband “ripple” stimuli that were modulated spectrally, temporally, or jointly, i.e., spectro-temporally. The spectro-temporal modulation transfer functions of CI users and NH subjects were decomposed into spectral and temporal dimensions and compared to those subjects’ spectral-only and temporal-only modulation transfer functions. In CI users, the joint spectro-temporal sensitivity was better than that predicted by spectral-only and temporal-only sensitivity, indicating a heightened spectro-temporal sensitivity. Such an enhancement through the combined integration of spectral and temporal cues was not observed in NH subjects. This distinctive use of spectro-temporal cues by CI patients may support their use of cues that are important for speech understanding. This finding has implications for developing sound processing strategies that rely on joint spectro-temporal modulations to improve speech comprehension of CI users, and the findings of this study may be valuable for developing clinical assessment tools to optimize CI processor performance. PMID:28601530
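A spectro-temporal ripple of the kind described can be sketched as a bank of tone carriers whose amplitudes follow a sinusoid drifting jointly across time and log-frequency. All parameters below (sample rate, carrier count, rate, density, depth) are illustrative assumptions, not the stimulus values used in the study:

```python
import numpy as np

fs = 16000                     # sample rate (Hz); all values here are illustrative
dur = 0.5
t = np.arange(int(fs * dur)) / fs

f0, n_comp = 250, 30           # 30 tone carriers, 6 per octave: 250 Hz to ~7.1 kHz
x = np.arange(n_comp) / 6      # spectral position of each carrier, in octaves
freqs = f0 * 2 ** x

rate = 4.0                     # temporal modulation rate (Hz)
density = 1.0                  # spectral ripple density (cycles/octave)
depth = 0.9                    # modulation depth

rng = np.random.default_rng(1)
phases = rng.uniform(0, 2 * np.pi, n_comp)   # random carrier starting phases

# Each carrier's amplitude follows a sinusoid that drifts jointly across time
# and log-frequency; rate=0 gives a spectral-only ripple, density=0 a
# temporal-only one, and nonzero both gives the joint condition.
stim = np.zeros_like(t)
for xi, f, ph in zip(x, freqs, phases):
    env = 1 + depth * np.sin(2 * np.pi * (rate * t + density * xi))
    stim += env * np.sin(2 * np.pi * f * t + ph)
stim /= np.max(np.abs(stim))   # normalize to unit peak
```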

  13. Understanding Freedom of Speech in America: The Origin & Evolution of the 1st Amendment.

    ERIC Educational Resources Information Center

    Barnes, Judy

    In this booklet the content and implications of the First Amendment are analyzed. Historical origins of free speech from ancient Greece to England before the discovery of America, free speech in colonial America, and the Bill of Rights and its meaning for free speech are outlined. The evolution of the First Amendment is described, and the…

  14. A Hypertext-Based Computer Architecture for Management of the Joint Command, Control and Communications Curriculum

    DTIC Science & Technology

    1992-06-01

Boards) Security, Privacy, and Freedom of Speech Issues 4.1.2 Understand the relationships between information processing and collection and...to-many (Mailing and discussion Lists) ... Many-to-Many (Bulletin Boards) Security, Privacy, and Freedom of Speech Issues 4.1.3 Understand the...Communication one-to-one (e-mail) ... one-to-many (Mailing and discussion Lists) ... Many-to-Many (Bulletin Boards) ... Security, Privacy, and Freedom of Speech Issues

  15. Cleft Palate

    PubMed Central

    Kosowski, Tomasz R.; Weathers, William M.; Wolfswinkel, Erik M.; Ridgway, Emily B.

    2012-01-01

    Our understanding of cleft palates has come a long way over the last few decades. A better understanding of the long-term consequences of a cleft palate and its effect on speech development challenges surgeons to not only effectively repair the cleft, but to also restore function of the palate for adequate speech. Coordination with speech pathologists is integral for effective management of cleft palate patients, particularly as children begin to develop language. In this article, the authors review and summarize the various challenges and goals of cleft palate management. PMID:24179449

  16. Asymmetries in the Processing of Vowel Height

    ERIC Educational Resources Information Center

    Scharinger, Mathias; Monahan, Philip J.; Idsardi, William J.

    2012-01-01

    Purpose: Speech perception can be described as the transformation of continuous acoustic information into discrete memory representations. Therefore, research on neural representations of speech sounds is particularly important for a better understanding of this transformation. Speech perception models make specific assumptions regarding the…

  17. An analysis of the masking of speech by competing speech using self-report data.

    PubMed

    Agus, Trevor R; Akeroyd, Michael A; Noble, William; Bhullar, Navjot

    2009-01-01

    Many of the items in the "Speech, Spatial, and Qualities of Hearing" scale questionnaire [S. Gatehouse and W. Noble, Int. J. Audiol. 43, 85-99 (2004)] are concerned with speech understanding in a variety of backgrounds, both speech and nonspeech. To study if this self-report data reflected informational masking, previously collected data on 414 people were analyzed. The lowest scores (greatest difficulties) were found for the two items in which there were two speech targets, with successively higher scores for competing speech (six items), energetic masking (one item), and no masking (three items). The results suggest significant masking by competing speech in everyday listening situations.

  18. [Modeling developmental aspects of sensorimotor control of speech production].

    PubMed

    Kröger, B J; Birkholz, P; Neuschaefer-Rube, C

    2007-05-01

Detailed knowledge of the neurophysiology of speech acquisition is important for understanding the developmental aspects of speech perception and production and for understanding developmental disorders of speech perception and production. A computer-implemented neural model of sensorimotor control of speech production was developed. The model is capable of demonstrating in detail the neural functions of different cortical areas during speech production. (i) Two sensory and two motor maps or neural representations, together with the appertaining neural mappings or projections, establish the sensorimotor feedback control system. These maps and mappings are already formed and trained during the prelinguistic phase of speech acquisition. (ii) The feedforward sensorimotor control system comprises the lexical map (representations of sounds, syllables, and words of the first language) and the mappings from the lexical to the sensory and motor maps. The training of the appertaining mappings forms the linguistic phase of speech acquisition. (iii) Three prelinguistic learning phases (silent mouthing, quasi-stationary vocalic articulation, and realisation of articulatory protogestures) can be defined on the basis of our simulation studies using the computational neural model. These learning phases can be associated with temporal phases of prelinguistic speech acquisition obtained from natural data. The neural model illuminates the detailed function of specific cortical areas during speech production. In particular, it can be shown that developmental disorders of speech production may result from a delayed or incorrect process within one of the prelinguistic learning phases defined by the neural model.

  19. Speech perception in noise with a harmonic complex excited vocoder.

    PubMed

    Churchill, Tyler H; Kan, Alan; Goupell, Matthew J; Ihlefeld, Antje; Litovsky, Ruth Y

    2014-04-01

    A cochlear implant (CI) presents band-pass-filtered acoustic envelope information by modulating current pulse train levels. Similarly, a vocoder presents envelope information by modulating an acoustic carrier. By studying how normal hearing (NH) listeners are able to understand degraded speech signals with a vocoder, the parameters that best simulate electric hearing and factors that might contribute to the NH-CI performance difference may be better understood. A vocoder with harmonic complex carriers (fundamental frequency, f0 = 100 Hz) was used to study the effect of carrier phase dispersion on speech envelopes and intelligibility. The starting phases of the harmonic components were randomly dispersed to varying degrees prior to carrier filtering and modulation. NH listeners were tested on recognition of a closed set of vocoded words in background noise. Two sets of synthesis filters simulated different amounts of current spread in CIs. Results showed that the speech vocoded with carriers whose starting phases were maximally dispersed was the most intelligible. Superior speech understanding may have been a result of the flattening of the dispersed-phase carrier's intrinsic temporal envelopes produced by the large number of interacting components in the high-frequency channels. Cross-correlogram analyses of auditory nerve model simulations confirmed that randomly dispersing the carrier's component starting phases resulted in better neural envelope representation. However, neural metrics extracted from these analyses were not found to accurately predict speech recognition scores for all vocoded speech conditions. It is possible that central speech understanding mechanisms are insensitive to the envelope-fine structure dichotomy exploited by vocoders.
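The phase-dispersion manipulation described above can be sketched by summing harmonics of a 100-Hz fundamental with starting phases drawn from an interval of adjustable width, then comparing envelope peakiness via the crest factor (peak over RMS). The sample rate, harmonic count, and seed below are illustrative assumptions, not the study's parameters:

```python
import numpy as np

fs = 16000
t = np.arange(int(fs * 0.2)) / fs   # 0.2 s = 20 full periods of f0
f0, n_harm = 100.0, 50              # harmonics stay below the Nyquist frequency

def harmonic_carrier(dispersion, seed=0):
    """Sum of n_harm harmonics of f0 with starting phases drawn from [0, dispersion]."""
    rng = np.random.default_rng(seed)
    phases = rng.uniform(0, dispersion, n_harm)
    c = sum(np.sin(2 * np.pi * (k + 1) * f0 * t + p)
            for k, p in enumerate(phases))
    return c / np.max(np.abs(c))

coherent = harmonic_carrier(0.0)          # all phases equal: peaky waveform
dispersed = harmonic_carrier(2 * np.pi)   # maximally dispersed starting phases

# Crest factor (peak / RMS) as a crude index of intrinsic envelope peakiness;
# dispersing the phases flattens the envelope and lowers the crest factor.
crest = {name: np.max(np.abs(c)) / np.sqrt(np.mean(c ** 2))
         for name, c in (("coherent", coherent), ("dispersed", dispersed))}
print(crest)
```

The crest-factor comparison mirrors the abstract's point that dispersed-phase carriers have flatter intrinsic temporal envelopes than phase-coherent ones.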

  20. Helping Metaphors Take Root in the EFL Classroom

    ERIC Educational Resources Information Center

    Lowery, Denise

    2013-01-01

    Learners of English as a foreign language often find it difficult to understand figurative speech, which relies heavily on metaphor. This article explores why metaphors challenge learners and presents ways to incorporate metaphors into EFL instruction to help learners understand figurative speech. Topics discussed include cognitive metaphor,…

  1. Speech Understanding Research. Annual Technical Report.

    ERIC Educational Resources Information Center

    Walker, Donald E.; And Others

    This report is the third in a series of annual reports describing the research performed by Stanford Research Institute to provide the technology that will allow speech understanding systems to be designed and implemented for a variety of different task domains and environmental constraints. The current work is being carried out cooperatively with…

  2. Cochlear Implantation in Older Adults

    PubMed Central

    Lin, Frank R.; Chien, Wade W.; Li, Lingsheng; Niparko, John K.; Francis, Howard W.

    2012-01-01

Cochlear implants allow individuals with severe-to-profound hearing loss access to sound and spoken language. The number of older adults in the United States who are potential candidates for cochlear implantation is approximately 150,000 and will continue to increase with the aging of the population. Should cochlear implantation (CI) be routinely recommended for these older adults, and do these individuals benefit from CI? We reviewed our 12-year experience with cochlear implantation in adults ≥60 years (n = 445) at Johns Hopkins to investigate the impact of CI on speech understanding and to identify factors associated with speech performance. Complete data on speech outcomes at baseline and 1 year post-CI were available for 83 individuals. Our results demonstrate that cochlear implantation in adults ≥60 years consistently improved speech understanding scores, with a mean increase of 60.0% (SD 24.1) on HINT sentences in quiet. The magnitude of the gain in speech scores was negatively associated with age at implantation, such that for every increasing year of age at CI the gain in speech scores was 1.3 percentage points less (95% CI: 0.6–1.9) after adjusting for age at hearing loss onset. Conversely, individuals with higher pre-CI speech scores (HINT scores between 40–60%) had significantly greater post-CI speech scores, by a mean of 10.0 percentage points (95% CI: 0.4–19.6), than those with lower pre-CI speech scores (HINT <40%) after adjusting for age at CI and age at hearing loss onset. These results suggest that older adult CI candidates who are younger at implantation and have higher preoperative speech scores obtain the highest speech understanding scores after cochlear implantation, with possible implications for current Medicare policy. Finally, we provide an extended discussion of the epidemiology and impact of hearing loss in older adults.
Future research on CI in older adults should expand beyond simple speech outcomes to take into account the broad cognitive, social, and physical functioning outcomes that are likely detrimentally impacted by hearing loss and may be mitigated by cochlear implantation. PMID:22932787

  3. Computer-assisted CI fitting: Is the learning capacity of the intelligent agent FOX beneficial for speech understanding?

    PubMed

    Meeuws, Matthias; Pascoal, David; Bermejo, Iñigo; Artaso, Miguel; De Ceulaer, Geert; Govaerts, Paul J

    2017-07-01

The software application FOX ('Fitting to Outcome eXpert') is an intelligent agent to assist in the programing of cochlear implant (CI) processors. The current version utilizes a mixture of deterministic and probabilistic logic that is able to improve over time through a learning effect. This study aimed to assess whether this learning capacity yields measurable improvements in speech understanding. A retrospective study was performed on 25 consecutive CI recipients with a median CI use experience of 10 years who came for their annual CI follow-up fitting session. All subjects were assessed by means of speech audiometry with open-set monosyllables at 40, 55, 70, and 85 dB SPL in quiet with their home MAP. Other psychoacoustic tests were executed depending on the audiologist's clinical judgment. The home MAP and the corresponding test results were entered into FOX. If FOX suggested MAP changes, they were implemented and another speech audiometry was performed with the new MAP. FOX suggested MAP changes in 21 subjects (84%). The within-subject comparison showed a significant median improvement of 10, 3, 1, and 7% at 40, 55, 70, and 85 dB SPL, respectively. All but two subjects showed an instantaneous improvement in their mean speech audiometric score. Persons with long-term CI use who received a FOX-assisted CI fitting at least 6 months earlier displayed improved speech understanding after MAP modifications recommended by the current version of FOX. This can be explained only by intrinsic improvements in FOX's algorithms, as they have resulted from learning. This learning is an inherent feature of artificial intelligence, and it may yield measurable benefit in speech understanding even in long-term CI recipients.

  4. Spectro-temporal cues enhance modulation sensitivity in cochlear implant users.

    PubMed

    Zheng, Yi; Escabí, Monty; Litovsky, Ruth Y

    2017-08-01

Although speech understanding is highly variable amongst cochlear implant (CI) subjects, the remarkably high speech recognition performance of many CI users is unexpected and not well understood. Numerous factors, including neural health and degradation of the spectral information in the speech signal of CIs, likely contribute to speech understanding. We studied the ability to use spectro-temporal modulations, which may be critical for speech understanding and discrimination, and hypothesized that CI users adopt a different perceptual strategy than normal-hearing (NH) individuals, whereby they rely more heavily on joint spectro-temporal cues to enhance detection of auditory cues. Modulation detection sensitivity was studied in CI users and NH subjects using broadband "ripple" stimuli that were modulated spectrally, temporally, or jointly, i.e., spectro-temporally. The spectro-temporal modulation transfer functions of CI users and NH subjects were decomposed into spectral and temporal dimensions and compared to those subjects' spectral-only and temporal-only modulation transfer functions. In CI users, the joint spectro-temporal sensitivity was better than that predicted by spectral-only and temporal-only sensitivity, indicating a heightened spectro-temporal sensitivity. Such an enhancement through the combined integration of spectral and temporal cues was not observed in NH subjects. This distinctive use of spectro-temporal cues by CI patients may support their use of cues that are important for speech understanding. This finding has implications for developing sound processing strategies that rely on joint spectro-temporal modulations to improve speech comprehension of CI users, and the findings of this study may be valuable for developing clinical assessment tools to optimize CI processor performance. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. Speech Restoration: An Interactive Process

    ERIC Educational Resources Information Center

    Grataloup, Claire; Hoen, Michael; Veuillet, Evelyne; Collet, Lionel; Pellegrino, Francois; Meunier, Fanny

    2009-01-01

    Purpose: This study investigates the ability to understand degraded speech signals and explores the correlation between this capacity and the functional characteristics of the peripheral auditory system. Method: The authors evaluated the capability of 50 normal-hearing native French speakers to restore time-reversed speech. The task required them…

  6. Anatomy and Physiology of the Speech Mechanism.

    ERIC Educational Resources Information Center

    Sheets, Boyd V.

    This monograph on the anatomical and physiological aspects of the speech mechanism stresses the importance of a general understanding of the process of verbal communication. Contents include "Positions of the Body,""Basic Concepts Linked with the Speech Mechanism,""The Nervous System,""The Respiratory System--Sound-Power Source,""The…

  7. Mirror neuron system as the joint from action to language.

    PubMed

    Chen, Wei; Yuan, Ti-Fei

    2008-08-01

The mirror neuron system (MNS) represents one of the most important discoveries of cognitive neuroscience in the past decade; it has been found to be involved in multiple aspects of brain function, including action understanding, imitation, language understanding, empathy, action prediction, and speech evolution. This manuscript reviews the function of the MNS in action understanding and language evolution, and specifically assesses its role as the bridge from body language to fluent speech. We then discuss the speech deficits of autism patients attributable to disruption of the MNS. Finally, given that the MNS is plastic in the adult brain, we propose that MNS-targeted therapy provides an efficient rehabilitation approach for brain-damage conditions as well as for autism patients.

  8. New developments in the management of speech and language disorders.

    PubMed

    Harding, Celia; Gourlay, Sara

    2008-05-01

    Speech and language disorders, which include swallowing difficulties, are usually managed by speech and language therapists. Such a diverse, complex and challenging clinical group of symptoms requires practitioners with detailed knowledge and understanding of research within those areas, as well as the ability to implement appropriate therapy strategies within many environments. These environments range from neonatal units, acute paediatric wards and health centres through to nurseries, schools and children's homes. This paper summarises the key issues that are fundamental to our understanding of this client group.

  9. A Model for Speech Processing in Second Language Listening Activities

    ERIC Educational Resources Information Center

    Zoghbor, Wafa Shahada

    2016-01-01

    Teachers' understanding of the process of speech perception could inform practice in listening classrooms. Catford (1950) developed a model for speech perception taking into account the influence of the acoustic features of the linguistic forms used by the speaker, whereby the listener "identifies" and "interprets" these…

  10. Speech-Language Therapists' Process of Including Significant Others in Aphasia Rehabilitation

    ERIC Educational Resources Information Center

    Hallé, Marie-Christine; Le Dorze, Guylaine; Mingant, Anne

    2014-01-01

    Background: Although aphasia rehabilitation should include significant others, it is currently unknown how this recommendation is adopted in speech-language therapy practice. Speech-language therapists' (SLTs) experience of including significant others in aphasia rehabilitation is also understudied, yet a better understanding of clinical…

  11. Hierarchical Spatiotemporal Dynamics of Speech Rhythm and Articulation

    ERIC Educational Resources Information Center

    Tilsen, Samuel Edward

    2009-01-01

    Hierarchy is one of the most important concepts in the scientific study of language. This dissertation aims to understand why we observe hierarchical structures in speech by investigating the cognitive processes from which they emerge. To that end, the dissertation explores how articulatory, rhythmic, and prosodic patterns of speech interact.…

  12. High-Resolution, Non-Invasive Imaging of Upper Vocal Tract Articulators Compatible with Human Brain Recordings

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bouchard, Kristofer E.; Conant, David F.; Anumanchipalli, Gopala K.

A complete neurobiological understanding of speech motor control requires determination of the relationship between simultaneously recorded neural activity and the kinematics of the lips, jaw, tongue, and larynx. Many speech articulators are internal to the vocal tract, and therefore simultaneously tracking the kinematics of all articulators is nontrivial, especially in the context of human electrophysiology recordings. Here, we describe a noninvasive, multi-modal imaging system to monitor vocal tract kinematics, demonstrate this system in six speakers during production of nine American English vowels, and provide new analysis of such data. Classification and regression analysis revealed considerable variability in the articulator-to-acoustic relationship across speakers. Non-negative matrix factorization extracted basis sets capturing vocal tract shapes allowing for higher vowel classification accuracy than traditional methods. Statistical speech synthesis generated speech from vocal tract measurements, and we demonstrate perceptual identification. We demonstrate the capacity to predict lip kinematics from ventral sensorimotor cortical activity. These results demonstrate a multi-modal system to non-invasively monitor articulator kinematics during speech production, describe novel analytic methods for relating kinematic data to speech acoustics, and provide the first decoding of speech kinematics from electrocorticography. These advances will be critical for understanding the cortical basis of speech production and the creation of vocal prosthetics.

  13. High-Resolution, Non-Invasive Imaging of Upper Vocal Tract Articulators Compatible with Human Brain Recordings

    PubMed Central

    Anumanchipalli, Gopala K.; Dichter, Benjamin; Chaisanguanthum, Kris S.; Johnson, Keith; Chang, Edward F.

    2016-01-01

    A complete neurobiological understanding of speech motor control requires determination of the relationship between simultaneously recorded neural activity and the kinematics of the lips, jaw, tongue, and larynx. Many speech articulators are internal to the vocal tract, and therefore simultaneously tracking the kinematics of all articulators is nontrivial—especially in the context of human electrophysiology recordings. Here, we describe a noninvasive, multi-modal imaging system to monitor vocal tract kinematics, demonstrate this system in six speakers during production of nine American English vowels, and provide new analysis of such data. Classification and regression analysis revealed considerable variability in the articulator-to-acoustic relationship across speakers. Non-negative matrix factorization extracted basis sets capturing vocal tract shapes allowing for higher vowel classification accuracy than traditional methods. Statistical speech synthesis generated speech from vocal tract measurements, and we demonstrate perceptual identification. We demonstrate the capacity to predict lip kinematics from ventral sensorimotor cortical activity. These results demonstrate a multi-modal system to non-invasively monitor articulator kinematics during speech production, describe novel analytic methods for relating kinematic data to speech acoustics, and provide the first decoding of speech kinematics from electrocorticography. These advances will be critical for understanding the cortical basis of speech production and the creation of vocal prosthetics. PMID:27019106
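Non-negative matrix factorization, used above to extract basis sets capturing vocal tract shapes, can be illustrated with the classic multiplicative-update algorithm. This is a generic textbook sketch on synthetic data, not the authors' pipeline; the frame and coordinate counts are invented for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0):
    """Multiplicative-update NMF (Lee & Seung): V (m x n, nonnegative) ~ W @ H."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + 1e-6
    H = rng.random((rank, n)) + 1e-6
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-12)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-12)
    return W, H

# Synthetic stand-in for vocal-tract measurements: 50 frames x 20 coordinates,
# built from 3 underlying nonnegative "shape" components so a rank-3
# factorization can recover a low-dimensional basis.
rng = np.random.default_rng(1)
V = rng.random((50, 3)) @ rng.random((3, 20))
W, H = nmf(V, rank=3)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.3f}")
```

The rows of H play the role of the basis shapes; each frame's row of W gives its nonnegative weights on those shapes.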

  14. High-Resolution, Non-Invasive Imaging of Upper Vocal Tract Articulators Compatible with Human Brain Recordings

    DOE PAGES

    Bouchard, Kristofer E.; Conant, David F.; Anumanchipalli, Gopala K.; ...

    2016-03-28

A complete neurobiological understanding of speech motor control requires determination of the relationship between simultaneously recorded neural activity and the kinematics of the lips, jaw, tongue, and larynx. Many speech articulators are internal to the vocal tract, and therefore simultaneously tracking the kinematics of all articulators is nontrivial, especially in the context of human electrophysiology recordings. Here, we describe a noninvasive, multi-modal imaging system to monitor vocal tract kinematics, demonstrate this system in six speakers during production of nine American English vowels, and provide new analysis of such data. Classification and regression analysis revealed considerable variability in the articulator-to-acoustic relationship across speakers. Non-negative matrix factorization extracted basis sets capturing vocal tract shapes allowing for higher vowel classification accuracy than traditional methods. Statistical speech synthesis generated speech from vocal tract measurements, and we demonstrate perceptual identification. We demonstrate the capacity to predict lip kinematics from ventral sensorimotor cortical activity. These results demonstrate a multi-modal system to non-invasively monitor articulator kinematics during speech production, describe novel analytic methods for relating kinematic data to speech acoustics, and provide the first decoding of speech kinematics from electrocorticography. These advances will be critical for understanding the cortical basis of speech production and the creation of vocal prosthetics.

  15. Combined Electric and Contralateral Acoustic Hearing: Word and Sentence Recognition with Bimodal Hearing

    ERIC Educational Resources Information Center

    Gifford, Rene H.; Dorman, Michael F.; McKarns, Sharon A.; Spahr, Anthony J.

    2007-01-01

    Purpose: The authors assessed whether (a) a full-insertion cochlear implant would provide a higher level of speech understanding than bilateral low-frequency acoustic hearing, (b) contralateral acoustic hearing would add to the speech understanding provided by the implant, and (c) the level of performance achieved with electric stimulation plus…

  16. Effects of degree and configuration of hearing loss on the contribution of high- and low-frequency speech information to bilateral speech understanding

    PubMed Central

    Hornsby, Benjamin W. Y.; Johnson, Earl E.; Picou, Erin

    2011-01-01

    Objectives The purpose of this study was to examine the effects of degree and configuration of hearing loss on the use of, and benefit from, information in amplified high- and low-frequency speech presented in background noise. Design Sixty-two adults with a wide range of high- and low-frequency sensorineural hearing loss (5–115+ dB HL) participated. To examine the contribution of speech information in different frequency regions, speech understanding in noise was assessed in multiple low- and high-pass filter conditions, as well as a band-pass (713–3534 Hz) and wideband (143–8976 Hz) condition. To increase audibility over a wide frequency range, speech and noise were amplified based on each individual’s hearing loss. A stepwise multiple linear regression approach was used to examine the contribution of several factors to 1) absolute performance in each filter condition and 2) the change in performance with the addition of amplified high- and low-frequency speech components. Results Results from the regression analysis showed that degree of hearing loss was the strongest predictor of absolute performance for low- and high-pass filtered speech materials. In addition, configuration of hearing loss affected both absolute performance for severely low-pass filtered speech and benefit from extending high-frequency (3534–8976 Hz) bandwidth. Specifically, individuals with steeply sloping high-frequency losses made better use of low-pass filtered speech information than individuals with similar low-frequency thresholds but less high-frequency loss. In contrast, given similar high-frequency thresholds, individuals with flat hearing losses received more benefit from extending high-frequency bandwidth than individuals with more sloping losses. Conclusions Consistent with previous work, benefit from speech information in a given frequency region generally decreases as degree of hearing loss in that frequency region increases. 
However, given a similar degree of loss, the configuration of hearing loss also affects the ability to use speech information in different frequency regions. Except for individuals with steeply sloping high-frequency losses, providing high-frequency amplification (3534–8976 Hz) had either a beneficial effect on, or did not significantly degrade, speech understanding. These findings highlight the importance of extended high-frequency amplification for listeners with a wide range of high-frequency hearing losses, when seeking to maximize intelligibility. PMID:21336138
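The stepwise multiple linear regression approach described in the abstract above can be illustrated with a minimal forward-selection sketch. The predictors, effect sizes, and listener data here are entirely synthetic stand-ins; the study's actual predictor set and selection criteria are not reproduced.

```python
import numpy as np

def forward_stepwise(X, y, names, max_terms=2):
    """Greedy forward selection: repeatedly add the predictor that most
    reduces the residual sum of squares of an ordinary least-squares fit."""
    chosen, remaining = [], list(range(X.shape[1]))
    for _ in range(max_terms):
        best = None
        for j in remaining:
            cols = chosen + [j]
            A = np.column_stack([np.ones(len(y)), X[:, cols]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = np.sum((y - A @ coef) ** 2)
            if best is None or rss < best[0]:
                best = (rss, j)
        chosen.append(best[1])
        remaining.remove(best[1])
    return [names[j] for j in chosen]

# Synthetic listeners: degree of loss dominates performance, slope adds a little.
rng = np.random.default_rng(0)
degree = rng.uniform(5, 90, 62)   # dB HL (hypothetical)
slope = rng.uniform(0, 20, 62)    # dB/octave (hypothetical)
age = rng.uniform(20, 80, 62)
score = 90 - 0.6 * degree - 0.4 * slope + rng.normal(0, 3, 62)
X = np.column_stack([degree, slope, age])
selected = forward_stepwise(X, score, ["degree", "slope", "age"])
print(selected)
```

With the simulated effect sizes above, degree of loss is selected first, mirroring the study's finding that degree of hearing loss was the strongest predictor.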

  17. Musical Experience and the Aging Auditory System: Implications for Cognitive Abilities and Hearing Speech in Noise

    PubMed Central

    Parbery-Clark, Alexandra; Strait, Dana L.; Anderson, Samira; Hittner, Emily; Kraus, Nina

    2011-01-01

    Much of our daily communication occurs in the presence of background noise, compromising our ability to hear. While understanding speech in noise is a challenge for everyone, it becomes increasingly difficult as we age. Although aging is generally accompanied by hearing loss, this perceptual decline cannot fully account for the difficulties experienced by older adults for hearing in noise. Decreased cognitive skills concurrent with reduced perceptual acuity are thought to contribute to the difficulty older adults experience understanding speech in noise. Given that musical experience positively impacts speech perception in noise in young adults (ages 18–30), we asked whether musical experience benefits an older cohort of musicians (ages 45–65), potentially offsetting the age-related decline in speech-in-noise perceptual abilities and associated cognitive function (i.e., working memory). Consistent with performance in young adults, older musicians demonstrated enhanced speech-in-noise perception relative to nonmusicians along with greater auditory, but not visual, working memory capacity. By demonstrating that speech-in-noise perception and related cognitive function are enhanced in older musicians, our results imply that musical training may reduce the impact of age-related auditory decline. PMID:21589653

  18. Musical experience and the aging auditory system: implications for cognitive abilities and hearing speech in noise.

    PubMed

    Parbery-Clark, Alexandra; Strait, Dana L; Anderson, Samira; Hittner, Emily; Kraus, Nina

    2011-05-11

    Much of our daily communication occurs in the presence of background noise, compromising our ability to hear. While understanding speech in noise is a challenge for everyone, it becomes increasingly difficult as we age. Although aging is generally accompanied by hearing loss, this perceptual decline cannot fully account for the difficulties experienced by older adults for hearing in noise. Decreased cognitive skills concurrent with reduced perceptual acuity are thought to contribute to the difficulty older adults experience understanding speech in noise. Given that musical experience positively impacts speech perception in noise in young adults (ages 18-30), we asked whether musical experience benefits an older cohort of musicians (ages 45-65), potentially offsetting the age-related decline in speech-in-noise perceptual abilities and associated cognitive function (i.e., working memory). Consistent with performance in young adults, older musicians demonstrated enhanced speech-in-noise perception relative to nonmusicians along with greater auditory, but not visual, working memory capacity. By demonstrating that speech-in-noise perception and related cognitive function are enhanced in older musicians, our results imply that musical training may reduce the impact of age-related auditory decline.

  19. Spectral and temporal resolutions of information-bearing acoustic changes for understanding vocoded sentences

    PubMed Central

    Stilp, Christian E.; Goupell, Matthew J.

    2015-01-01

    Short-time spectral changes in the speech signal are important for understanding noise-vocoded sentences. These information-bearing acoustic changes, measured using cochlea-scaled entropy in cochlear implant simulations [CSECI; Stilp et al. (2013). J. Acoust. Soc. Am. 133(2), EL136–EL141; Stilp (2014). J. Acoust. Soc. Am. 135(3), 1518–1529], may offer better understanding of speech perception by cochlear implant (CI) users. However, perceptual importance of CSECI for normal-hearing listeners was tested at only one spectral resolution and one temporal resolution, limiting generalizability of results to CI users. Here, experiments investigated the importance of these informational changes for understanding noise-vocoded sentences at different spectral resolutions (4–24 spectral channels; Experiment 1), temporal resolutions (4–64 Hz cutoff for low-pass filters that extracted amplitude envelopes; Experiment 2), or when both parameters varied (6–12 channels, 8–32 Hz; Experiment 3). Sentence intelligibility was reduced more by replacing high-CSECI intervals with noise than replacing low-CSECI intervals, but only when sentences had sufficient spectral and/or temporal resolution. High-CSECI intervals were more important for speech understanding as spectral resolution worsened and temporal resolution improved. Trade-offs between CSECI and intermediate spectral and temporal resolutions were minimal. These results suggest that signal processing strategies that emphasize information-bearing acoustic changes in speech may improve speech perception for CI users. PMID:25698018
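The noise-vocoding manipulation underlying this study (a fixed number of spectral channels, with amplitude envelopes extracted below a low-pass cutoff) can be sketched as follows. This is a simplified FFT-based toy vocoder, not the authors' actual processing chain: the band edges, envelope smoother, and test signal are all assumptions.

```python
import numpy as np

def noise_vocode(signal, fs, n_channels=8, env_cutoff=32.0):
    """Toy noise vocoder: split the signal into log-spaced bands, extract each
    band's amplitude envelope (rectify + low-pass), and modulate band-limited
    noise with it. n_channels sets spectral resolution; env_cutoff, temporal."""
    n = len(signal)
    freqs = np.fft.rfftfreq(n, 1 / fs)
    edges = np.logspace(np.log10(100), np.log10(fs / 2 * 0.9), n_channels + 1)
    spec = np.fft.rfft(signal)
    noise_spec = np.fft.rfft(np.random.default_rng(0).standard_normal(n))
    # Moving-average kernel roughly approximating a low-pass at env_cutoff Hz.
    win = max(1, int(fs / env_cutoff))
    kernel = np.ones(win) / win
    out = np.zeros(n)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        band = np.fft.irfft(spec * mask, n)               # band-limited speech
        env = np.convolve(np.abs(band), kernel, "same")   # smoothed envelope
        carrier = np.fft.irfft(noise_spec * mask, n)      # band-limited noise
        out += env * carrier
    return out

fs = 16000
t = np.arange(fs) / fs
speech_like = np.sin(2 * np.pi * 300 * t) * (1 + np.sin(2 * np.pi * 4 * t))
vocoded = noise_vocode(speech_like, fs, n_channels=8, env_cutoff=32.0)
print(vocoded.shape)
```

Varying `n_channels` (4–24) and `env_cutoff` (4–64 Hz) corresponds to the spectral and temporal resolution manipulations in Experiments 1 and 2.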

  20. Bone-anchored Hearing Aids: correlation between pure-tone thresholds and outcome in three user groups.

    PubMed

    Pfiffner, Flurin; Kompis, Martin; Stieger, Christof

    2009-10-01

    To investigate correlations between preoperative hearing thresholds and postoperative aided thresholds and speech understanding of users of Bone-anchored Hearing Aids (BAHA). Such correlations may be useful to estimate the postoperative outcome with BAHA from preoperative data. Retrospective case review. Tertiary referral center. Ninety-two adult unilaterally implanted BAHA users in 3 groups: (A) 24 subjects with a unilateral conductive hearing loss, (B) 38 subjects with a bilateral conductive hearing loss, and (C) 30 subjects with single-sided deafness. Preoperative air-conduction and bone-conduction thresholds and 3-month postoperative aided and unaided sound-field thresholds as well as speech understanding using German 2-digit numbers and monosyllabic words were measured and analyzed. Correlations between preoperative air-conduction and bone-conduction thresholds of the better and of the poorer ear and postoperative aided thresholds, as well as correlations between gain in sound-field threshold and gain in speech understanding, were assessed. Aided postoperative sound-field thresholds correlate best with the BC threshold of the better ear (correlation coefficients, r² = 0.237 to 0.419; p = 0.0006 to 0.0064, depending on the group of subjects). Improvements in sound-field threshold correspond to improvements in speech understanding. When estimating expected postoperative aided sound-field thresholds of BAHA users from preoperative hearing thresholds, the BC threshold of the better ear should be used. For the patient groups considered, speech understanding in quiet can be estimated from the improvement in sound-field thresholds.
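The reported coefficients (r² = 0.237 to 0.419 between bone-conduction thresholds and aided sound-field thresholds) are squared Pearson correlations, which can be computed as below. The threshold values here are hypothetical illustrations, not data from the study.

```python
import numpy as np

# Hypothetical preoperative bone-conduction thresholds of the better ear (dB HL)
# and 3-month postoperative aided sound-field thresholds for a small group.
bc_better = np.array([20, 25, 30, 35, 40, 45, 50, 55])
aided = np.array([28, 27, 33, 36, 38, 46, 47, 58])

# Pearson correlation, then the coefficient of determination r^2.
r = np.corrcoef(bc_better, aided)[0, 1]
r_squared = r ** 2
print(f"r^2 = {r_squared:.3f}")
```

A higher r² would indicate that preoperative BC thresholds predict aided outcomes more reliably; the study used this to recommend the better-ear BC threshold for preoperative counseling.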

  1. Listening Effort: How the Cognitive Consequences of Acoustic Challenge Are Reflected in Brain and Behavior

    PubMed Central

    2018-01-01

    Everyday conversation frequently includes challenges to the clarity of the acoustic speech signal, including hearing impairment, background noise, and foreign accents. Although an obvious problem is the increased risk of making word identification errors, extracting meaning from a degraded acoustic signal is also cognitively demanding, which contributes to increased listening effort. The concepts of cognitive demand and listening effort are critical in understanding the challenges listeners face in comprehension, which are not fully predicted by audiometric measures. In this article, the authors review converging behavioral, pupillometric, and neuroimaging evidence that understanding acoustically degraded speech requires additional cognitive support and that this cognitive load can interfere with other operations such as language processing and memory for what has been heard. Behaviorally, acoustic challenge is associated with increased errors in speech understanding, poorer performance on concurrent secondary tasks, more difficulty processing linguistically complex sentences, and reduced memory for verbal material. Measures of pupil dilation support the challenge associated with processing a degraded acoustic signal, indirectly reflecting an increase in neural activity. Finally, functional brain imaging reveals that the neural resources required to understand degraded speech extend beyond traditional perisylvian language networks, most commonly including regions of prefrontal cortex, premotor cortex, and the cingulo-opercular network. Far from being exclusively an auditory problem, acoustic degradation presents listeners with a systems-level challenge that requires the allocation of executive cognitive resources. An important point is that a number of dissociable processes can be engaged to understand degraded speech, including verbal working memory and attention-based performance monitoring. 
The specific resources required likely differ as a function of the acoustic, linguistic, and cognitive demands of the task, as well as individual differences in listeners’ abilities. A greater appreciation of cognitive contributions to processing degraded speech is critical in understanding individual differences in comprehension ability, variability in the efficacy of assistive devices, and guiding rehabilitation approaches to reducing listening effort and facilitating communication. PMID:28938250

  2. Listening Effort: How the Cognitive Consequences of Acoustic Challenge Are Reflected in Brain and Behavior.

    PubMed

    Peelle, Jonathan E

    Everyday conversation frequently includes challenges to the clarity of the acoustic speech signal, including hearing impairment, background noise, and foreign accents. Although an obvious problem is the increased risk of making word identification errors, extracting meaning from a degraded acoustic signal is also cognitively demanding, which contributes to increased listening effort. The concepts of cognitive demand and listening effort are critical in understanding the challenges listeners face in comprehension, which are not fully predicted by audiometric measures. In this article, the authors review converging behavioral, pupillometric, and neuroimaging evidence that understanding acoustically degraded speech requires additional cognitive support and that this cognitive load can interfere with other operations such as language processing and memory for what has been heard. Behaviorally, acoustic challenge is associated with increased errors in speech understanding, poorer performance on concurrent secondary tasks, more difficulty processing linguistically complex sentences, and reduced memory for verbal material. Measures of pupil dilation support the challenge associated with processing a degraded acoustic signal, indirectly reflecting an increase in neural activity. Finally, functional brain imaging reveals that the neural resources required to understand degraded speech extend beyond traditional perisylvian language networks, most commonly including regions of prefrontal cortex, premotor cortex, and the cingulo-opercular network. Far from being exclusively an auditory problem, acoustic degradation presents listeners with a systems-level challenge that requires the allocation of executive cognitive resources. An important point is that a number of dissociable processes can be engaged to understand degraded speech, including verbal working memory and attention-based performance monitoring. 
The specific resources required likely differ as a function of the acoustic, linguistic, and cognitive demands of the task, as well as individual differences in listeners' abilities. A greater appreciation of cognitive contributions to processing degraded speech is critical in understanding individual differences in comprehension ability, variability in the efficacy of assistive devices, and guiding rehabilitation approaches to reducing listening effort and facilitating communication.

  3. The role of auditory and cognitive factors in understanding speech in noise by normal-hearing older listeners

    PubMed Central

    Schoof, Tim; Rosen, Stuart

    2014-01-01

    Normal-hearing older adults often experience increased difficulties understanding speech in noise. In addition, they benefit less from amplitude fluctuations in the masker. These difficulties may be attributed to an age-related auditory temporal processing deficit. However, a decline in cognitive processing likely also plays an important role. This study examined the relative contribution of declines in both auditory and cognitive processing to the speech in noise performance in older adults. Participants included older (60–72 years) and younger (19–29 years) adults with normal hearing. Speech reception thresholds (SRTs) were measured for sentences in steady-state speech-shaped noise (SS), 10-Hz sinusoidally amplitude-modulated speech-shaped noise (AM), and two-talker babble. In addition, auditory temporal processing abilities were assessed by measuring thresholds for gap, amplitude-modulation, and frequency-modulation detection. Measures of processing speed, attention, working memory, Text Reception Threshold (a visual analog of the SRT), and reading ability were also obtained. Of primary interest was the extent to which the various measures correlate with listeners' abilities to perceive speech in noise. SRTs were significantly worse for older adults in the presence of two-talker babble but not SS and AM noise. In addition, older adults showed some cognitive processing declines (working memory and processing speed) although no declines in auditory temporal processing. However, working memory and processing speed did not correlate significantly with SRTs in babble. Despite declines in cognitive processing, normal-hearing older adults do not necessarily have problems understanding speech in noise as SRTs in SS and AM noise did not differ significantly between the two groups. Moreover, while older adults had higher SRTs in two-talker babble, this could not be explained by age-related cognitive declines in working memory or processing speed. PMID:25429266
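Speech reception thresholds such as those above are commonly estimated with an adaptive staircase that converges on the 50%-correct SNR. The sketch below runs a generic 1-down/1-up track against a simulated listener; the procedure, step size, and logistic psychometric function are illustrative assumptions, not the method used in this study.

```python
import numpy as np

def measure_srt(psychometric, start_snr=0.0, step=2.0, n_trials=30, seed=0):
    """Simple 1-down/1-up adaptive track: lower the SNR after a correct trial,
    raise it after an incorrect one; the track hovers near the 50%-correct SNR."""
    rng = np.random.default_rng(seed)
    snr, track = start_snr, []
    for _ in range(n_trials):
        correct = rng.random() < psychometric(snr)
        snr += -step if correct else step
        track.append(snr)
    # Crude SRT estimate: mean SNR over the last trials of the track.
    return np.mean(track[-10:])

# Hypothetical logistic psychometric function with its 50% point at -4 dB SNR.
true_srt = -4.0
p_correct = lambda snr: 1 / (1 + np.exp(-(snr - true_srt)))
estimate = measure_srt(p_correct)
print(round(estimate, 1))
```

Real SRT procedures typically average reversal points and use sentence scoring rather than a single-interval simulated response, but the convergence logic is the same.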

  4. Content Analysis of the Professional Journal of the Royal College of Speech and Language Therapists, III: 1966-2015--Into the 21st Century

    ERIC Educational Resources Information Center

    Armstrong, Linda; Stansfield, Jois; Bloch, Steven

    2017-01-01

    Background: Following content analyses of the first 30 years of the UK speech and language therapy professional body's journal, this study was conducted to survey the published work of the speech (and language) therapy profession over the last 50 years and trace key changes and themes. Aim: To understand better the development of the UK speech and…

  5. Phrase-level speech simulation with an airway modulation model of speech production

    PubMed Central

    Story, Brad H.

    2012-01-01

    Artificial talkers and speech synthesis systems have long been used as a means of understanding both speech production and speech perception. The development of an airway modulation model is described that simulates the time-varying changes of the glottis and vocal tract, as well as acoustic wave propagation, during speech production. The result is a type of artificial talker that can be used to study various aspects of how sound is generated by humans and how that sound is perceived by a listener. The primary components of the model are introduced and simulation of words and phrases are demonstrated. PMID:23503742

  6. When cognition kicks in: working memory and speech understanding in noise.

    PubMed

    Rönnberg, Jerker; Rudner, Mary; Lunner, Thomas; Zekveld, Adriana A

    2010-01-01

    Perceptual load and cognitive load can be separately manipulated and dissociated in their effects on speech understanding in noise. The Ease of Language Understanding model assumes a theoretical position where perceptual task characteristics interact with the individual's implicit capacities to extract the phonological elements of speech. Phonological precision and speed of lexical access are important determinants for listening in adverse conditions. If there are mismatches between the phonological elements perceived and phonological representations in long-term memory, explicit working memory (WM)-related capacities will be continually invoked to reconstruct and infer the contents of the ongoing discourse. Whether this induces a high cognitive load or not will in turn depend on the individual's storage and processing capacities in WM. Data suggest that modulated noise maskers may, like speech maskers, trigger an explicit, WM-related mode of processing. Individuals with high WM capacity benefit more than low WM-capacity individuals from fast amplitude compression at low or negative input speech-to-noise ratios. The general conclusion is that there is an overarching interaction between the focal purpose of processing in the primary listening task and the extent to which a secondary, distracting task taps into these processes.

  7. A Noninvasive Imaging Approach to Understanding Speech Changes following Deep Brain Stimulation in Parkinson's Disease

    ERIC Educational Resources Information Center

    Narayana, Shalini; Jacks, Adam; Robin, Donald A.; Poizner, Howard; Zhang, Wei; Franklin, Crystal; Liotti, Mario; Vogel, Deanie; Fox, Peter T.

    2009-01-01

    Purpose: To explore the use of noninvasive functional imaging and "virtual" lesion techniques to study the neural mechanisms underlying motor speech disorders in Parkinson's disease. Here, we report the use of positron emission tomography (PET) and transcranial magnetic stimulation (TMS) to explain exacerbated speech impairment following…

  8. Sentence-Level Movements in Parkinson's Disease: Loud, Clear, and Slow Speech

    ERIC Educational Resources Information Center

    Kearney, Elaine; Giles, Renuka; Haworth, Brandon; Faloutsos, Petros; Baljko, Melanie; Yunusova, Yana

    2017-01-01

    Purpose: To further understand the effect of Parkinson's disease (PD) on articulatory movements in speech and to expand our knowledge of therapeutic treatment strategies, this study examined movements of the jaw, tongue blade, and tongue dorsum during sentence production with respect to speech intelligibility and compared the effect of varying…

  9. Articulatory Control in Childhood Apraxia of Speech in a Novel Word-Learning Task

    ERIC Educational Resources Information Center

    Case, Julie; Grigos, Maria I.

    2016-01-01

    Purpose: Articulatory control and speech production accuracy were examined in children with childhood apraxia of speech (CAS) and typically developing (TD) controls within a novel word-learning task to better understand the influence of planning and programming deficits in the production of unfamiliar words. Method: Participants included 16…

  10. Effect of Age on Silent Gap Discrimination in Synthetic Speech Stimuli.

    ERIC Educational Resources Information Center

    Lister, Jennifer; Tarver, Kenton

    2004-01-01

    The difficulty that older listeners experience understanding conversational speech may be related to their limited ability to use information present in the silent intervals (i.e., temporal gaps) between dynamic speech sounds. When temporal gaps are present between nonspeech stimuli that are spectrally invariant (e.g., noise bands or sinusoids),…

  11. Mock Trial: A Window to Free Speech Rights and Abilities

    ERIC Educational Resources Information Center

    Schwartz, Sherry

    2010-01-01

    This article provides some strategies to alleviate the current tensions between personal responsibility and freedom of speech rights in the public school classroom. The article advocates the necessity of making sure students understand the points and implications of the first amendment by providing a mock trial unit concerning free speech rights.…

  12. Speech Perception in Complex Acoustic Environments: Developmental Effects

    ERIC Educational Resources Information Center

    Leibold, Lori J.

    2017-01-01

    Purpose: The ability to hear and understand speech in complex acoustic environments follows a prolonged time course of development. The purpose of this article is to provide a general overview of the literature describing age effects in susceptibility to auditory masking in the context of speech recognition, including a summary of findings related…

  13. Central Presbycusis: A Review and Evaluation of the Evidence

    PubMed Central

    Humes, Larry E.; Dubno, Judy R.; Gordon-Salant, Sandra; Lister, Jennifer J.; Cacace, Anthony T.; Cruickshanks, Karen J.; Gates, George A.; Wilson, Richard H.; Wingfield, Arthur

    2018-01-01

    Background The authors reviewed the evidence regarding the existence of age-related declines in central auditory processes and the consequences of any such declines for everyday communication. Purpose This report summarizes the review process and presents its findings. Data Collection and Analysis The authors reviewed 165 articles germane to central presbycusis. Of the 165 articles, 132 articles with a focus on human behavioral measures for either speech or nonspeech stimuli were selected for further analysis. Results For 76 smaller-scale studies of speech understanding in older adults reviewed, the following findings emerged: (1) the three most commonly studied behavioral measures were speech in competition, temporally distorted speech, and binaural speech perception (especially dichotic listening); (2) for speech in competition and temporally degraded speech, hearing loss proved to have a significant negative effect on performance in most of the laboratory studies; (3) significant negative effects of age, unconfounded by hearing loss, were observed in most of the studies of speech in competing speech, time-compressed speech, and binaural speech perception; and (4) the influence of cognitive processing on speech understanding has been examined much less frequently, but when included, significant positive associations with speech understanding were observed. For 36 smaller-scale studies of the perception of nonspeech stimuli by older adults reviewed, the following findings emerged: (1) the three most frequently studied behavioral measures were gap detection, temporal discrimination, and temporal-order discrimination or identification; (2) hearing loss was seldom a significant factor; and (3) negative effects of age were almost always observed. 
For 18 studies reviewed that made use of test batteries and medium-to-large sample sizes, the following findings emerged: (1) all studies included speech-based measures of auditory processing; (2) 4 of the 18 studies included nonspeech stimuli; (3) for the speech-based measures, monaural speech in a competing-speech background, dichotic speech, and monaural time-compressed speech were investigated most frequently; (4) the most frequently used tests were the Synthetic Sentence Identification (SSI) test with Ipsilateral Competing Message (ICM), the Dichotic Sentence Identification (DSI) test, and time-compressed speech; (5) many of these studies using speech-based measures reported significant effects of age, but most of these studies were confounded by declines in hearing, cognition, or both; (6) for nonspeech auditory-processing measures, the focus was on measures of temporal processing in all four studies; (7) effects of cognition on nonspeech measures of auditory processing have been studied less frequently, with mixed results, whereas the effects of hearing loss on performance were minimal due to judicious selection of stimuli; and (8) there is a paucity of observational studies using test batteries and longitudinal designs. Conclusions Based on this review of the scientific literature, there is insufficient evidence to confirm the existence of central presbycusis as an isolated entity. On the other hand, recent evidence has been accumulating in support of the existence of central presbycusis as a multifactorial condition that involves age- and/or disease-related changes in the auditory system and in the brain. Moreover, there is a clear need for additional research in this area. PMID:22967738

  14. Analyzing crowdsourced ratings of speech-based take-over requests for automated driving.

    PubMed

    Bazilinskyy, P; de Winter, J C F

    2017-10-01

    Take-over requests in automated driving should fit the urgency of the traffic situation. The robustness of various published research findings on the valuations of speech-based warning messages is unclear. This research aimed to establish how people value speech-based take-over requests as a function of speech rate, background noise, spoken phrase, and speaker's gender and emotional tone. By means of crowdsourcing, 2669 participants from 95 countries listened to a random 10 out of 140 take-over requests, and rated each take-over request on urgency, commandingness, pleasantness, and ease of understanding. Our results replicate several published findings, in particular that an increase in speech rate results in a monotonic increase of perceived urgency. The female voice was easier to understand than a male voice when there was a high level of background noise, a finding that contradicts the literature. Moreover, a take-over request spoken with an Indian accent was found to be easier to understand by participants from India than by participants from other countries. Our results replicate effects in the literature regarding speech-based warnings, and shed new light on effects of background noise, gender, and nationality. The results may have implications for the selection of appropriate take-over requests in automated driving. Additionally, our study demonstrates the promise of crowdsourcing for testing human factors and ergonomics theories with large sample sizes.

  15. Initial Development of a Spatially Separated Speech-in-Noise and Localization Training Program

    PubMed Central

    Tyler, Richard S.; Witt, Shelley A.; Dunn, Camille C.; Wang, Wenjun

    2010-01-01

    Objective This article describes the initial development of a novel approach for training hearing-impaired listeners to improve their ability to understand speech in the presence of background noise and to also improve their ability to localize sounds. Design Most people with hearing loss, even those well fit with hearing devices, still experience significant problems understanding speech in noise. Prior research suggests that at least some subjects can experience improved speech understanding with training. However, all training systems that we are aware of have one basic, critical limitation: they do not provide spatial separation of the speech and noise, and therefore ignore the potential benefits of training binaural hearing. In this paper we describe our initial experience with a home-based training system that includes spatially separated speech-in-noise and localization training. Results Throughout the development of this system, patient input, training, and preliminary pilot data from individuals with bilateral cochlear implants were utilized. Positive feedback from subjective reports indicated that some individuals were engaged in the treatment, and formal testing showed benefit. Feedback and practical issues led to the reduction of the original eight-loudspeaker system to a two-loudspeaker system. Conclusions These preliminary findings suggest we have successfully developed a viable spatial hearing training system that can improve binaural hearing in noise and localization. Applications include, but are not limited to, hearing with hearing aids and cochlear implants. PMID:20701836

  16. Can You Understand Me? Speaking Robots and Accented Speech

    ERIC Educational Resources Information Center

    Moussalli, Souheila; Cardoso, Walcir

    2017-01-01

    The results of our previous research on the pedagogical use of Speaking Robots (SRs) revealed positive effects on motivating students to practice their oral skills in a stress-free environment. However, our findings indicated that the SR was sometimes unable to understand students' foreign accented speech. In this paper, we report the results of a…

  17. Speech Understanding in Complex Listening Environments by Listeners Fit with Cochlear Implants

    ERIC Educational Resources Information Center

    Dorman, Michael F.; Gifford, Rene H.

    2017-01-01

    Purpose: The aim of this article is to summarize recent published and unpublished research from our 2 laboratories on improving speech understanding in complex listening environments by listeners fit with cochlear implants (CIs). Method: CI listeners were tested in 2 listening environments. One was a simulation of a restaurant with multiple,…

  18. The Diagnosis and Understanding of Apraxia of Speech: Why Including Neurodegenerative Etiologies May Be Important

    ERIC Educational Resources Information Center

    Duffy, Joseph R.; Josephs, Keith A.

    2012-01-01

    Purpose: To discuss apraxia of speech (AOS) as it occurs in neurodegenerative disease (progressive AOS [PAOS]) and how its careful study may contribute to general concepts of AOS and help refine its diagnostic criteria. Method: The article summarizes our current understanding of the clinical features and neuroanatomical and pathologic correlates…

  19. Speech-language pathology students' self-reports on voice training: easier to understand or to do?

    PubMed

    Lindhe, Christina; Hartelius, Lena

    2009-01-01

    The aim of the study was to describe the subjective ratings of the course 'Training of the student's own voice and speech', from a student-centred perspective. A questionnaire was completed after each of the six individual sessions. Six speech and language pathology (SLP) students rated how they perceived the practical exercises in terms of doing and understanding. The results showed that five of the six participants rated the exercises as significantly easier to understand than to do. The exercises were also rated as easier to do over time. Results are interpreted within a theoretical framework of approaches to learning. The findings support the importance of both the physical and reflective aspects of the voice training process.

  20. Technical devices for hearing-impaired individuals: cochlear implants and brain stem implants - developments of the last decade

    PubMed Central

    Müller, Joachim

    2005-01-01

    Over the past two decades, the fascinating possibilities of cochlear implants for congenitally deaf or deafened children and adults developed tremendously and created a rapidly developing interdisciplinary research field. The main advancements of cochlear implantation in the past decade are marked by significant improvement of hearing and speech understanding in CI users. These improvements are attributed to the enhancement of speech coding strategies. The implantation of more (and increasingly younger) children as well as the possibilities of the restoration of binaural hearing abilities with cochlear implants reflect the high standards reached by this development. Despite this progress, modern cochlear implants do not yet enable normal speech understanding, not even for the best patients. In particular, speech understanding in noise remains problematic [1]. Until the mid-1990s, research concentrated on unilateral implantation. Remarkable and effective improvements have been made with bilateral implantation since 1996. Nowadays an increasing number of patients enjoy these benefits. PMID:22073052

  2. Child speech, language and communication need re-examined in a public health context: a new direction for the speech and language therapy profession.

    PubMed

    Law, James; Reilly, Sheena; Snow, Pamela C

    2013-01-01

    Historically, speech and language therapy services for children have been framed within a rehabilitative framework with explicit assumptions made about providing therapy to individuals. While this is clearly important in many cases, we argue that this model needs revisiting for a number of reasons. First, our understanding of the nature of disability, and therefore communication disabilities, has changed over the past century. Second, there is an increasing understanding of the impact that the social gradient has on early communication difficulties. Finally, how these factors interact with one another and have an impact across the life course remains poorly understood. To describe the public health paradigm and explore its implications for speech and language therapy with children. We test the application of public health methodologies to speech and language therapy services by looking at four dimensions of service delivery: (1) the uptake of services and whether those children who need services receive them; (2) the development of universal prevention services in relation to social disadvantage; (3) the risk of over-interpreting co-morbidity from clinical samples; and (4) the overlap between communicative competence and mental health. It is concluded that there is a strong case for speech and language therapy services to be reconceptualized to respond to the needs of the whole population and according to socially determined needs, focusing on primary prevention. This is not to disregard individual need, but to highlight the needs of the population as a whole. Although the socio-political context is different between countries, we maintain that this is relevant wherever speech and language therapists have a responsibility for covering whole populations. Finally, we recommend that speech and language therapy services be conceptualized within the framework laid down in The Ottawa Charter for Health Promotion.
© 2013 Royal College of Speech and Language Therapists.

  3. Unilateral Hearing Loss: Understanding Speech Recognition and Localization Variability-Implications for Cochlear Implant Candidacy.

    PubMed

    Firszt, Jill B; Reeder, Ruth M; Holden, Laura K

    At a minimum, unilateral hearing loss (UHL) impairs sound localization and speech understanding in noisy environments, particularly if the loss is severe to profound. Accompanying the numerous negative consequences of UHL is considerable unexplained individual variability in the magnitude of its effects. Identification of covariables that affect outcome and contribute to variability in UHLs could augment counseling, treatment options, and rehabilitation. Cochlear implantation as a treatment for UHL is on the rise, yet little is known about factors that could impact performance or whether there is a group at risk for poor cochlear implant outcomes when hearing is near-normal in one ear. The overall goal of our research is to investigate the range and source of variability in speech recognition in noise and localization among individuals with severe to profound UHL and thereby help determine factors relevant to decisions regarding cochlear implantation in this population. The present study evaluated adults with severe to profound UHL and adults with bilateral normal hearing. Measures included adaptive sentence understanding in diffuse restaurant noise, localization, roving-source speech recognition (words from 1 of 15 speakers in a 140° arc), and an adaptive speech-reception threshold psychoacoustic task with varied noise types and noise-source locations. There were three age-sex-matched groups: UHL (severe to profound hearing loss in one ear and normal hearing in the contralateral ear), normal hearing listening bilaterally, and normal hearing listening unilaterally. Although the normal-hearing-bilateral group scored significantly better and had less performance variability than UHLs on all measures, some UHL participants scored within the range of the normal-hearing-bilateral group on all measures. 
The normal-hearing participants listening unilaterally had better monosyllabic word understanding than UHLs for words presented on the blocked/deaf side but not the open/hearing side. In contrast, UHLs localized better than the normal-hearing unilateral listeners for stimuli on the open/hearing side but not the blocked/deaf side. This suggests that UHLs had learned strategies for improved localization on the side of the intact ear. The UHL and unilateral normal-hearing participant groups were not significantly different for speech in noise measures. UHL participants with childhood rather than recent hearing loss onset localized significantly better; however, these two groups did not differ for speech recognition in noise. Age at onset in UHL adults appears to affect localization ability differently than understanding speech in noise. Hearing thresholds were significantly correlated with speech recognition for UHL participants but not the other two groups. Auditory abilities of UHLs varied widely and could be explained only in part by hearing threshold levels. Age at onset and length of hearing loss influenced performance on some, but not all measures. Results support the need for a revised and diverse set of clinical measures, including sound localization, understanding speech in varied environments, and careful consideration of functional abilities as individuals with severe to profound UHL are being considered potential cochlear implant candidates.

  4. The relationship of speech intelligibility with hearing sensitivity, cognition, and perceived hearing difficulties varies for different speech perception tests

    PubMed Central

    Heinrich, Antje; Henshaw, Helen; Ferguson, Melanie A.

    2015-01-01

    Listeners vary in their ability to understand speech in noisy environments. Hearing sensitivity, as measured by pure-tone audiometry, can only partly explain these results, and cognition has emerged as another key concept. Although cognition relates to speech perception, the exact nature of the relationship remains to be fully understood. This study investigates how different aspects of cognition, particularly working memory and attention, relate to speech intelligibility for various tests. Perceptual accuracy of speech perception represents just one aspect of functioning in a listening environment. Activity and participation limits imposed by hearing loss, in addition to the demands of a listening environment, are also important and may be better captured by self-report questionnaires. Understanding how speech perception relates to self-reported aspects of listening forms the second focus of the study. Forty-four listeners aged between 50 and 74 years with mild sensorineural hearing loss were tested on speech perception tests differing in complexity from low (phoneme discrimination in quiet), to medium (digit triplet perception in speech-shaped noise) to high (sentence perception in modulated noise); cognitive tests of attention, memory, and non-verbal intelligence quotient; and self-report questionnaires of general health-related and hearing-specific quality of life. Hearing sensitivity and cognition related to intelligibility differently depending on the speech test: neither was important for phoneme discrimination, hearing sensitivity alone was important for digit triplet perception, and hearing and cognition together played a role in sentence perception. Self-reported aspects of auditory functioning were correlated with speech intelligibility to different degrees, with digit triplets in noise showing the richest pattern. 
The results suggest that intelligibility tests can vary in their auditory and cognitive demands and their sensitivity to the challenges that auditory environments pose on functioning. PMID:26136699

  5. Musicians change their tune: how hearing loss alters the neural code.

    PubMed

    Parbery-Clark, Alexandra; Anderson, Samira; Kraus, Nina

    2013-08-01

    Individuals with sensorineural hearing loss have difficulty understanding speech, especially in background noise. This deficit remains even when audibility is restored through amplification, suggesting that mechanisms beyond a reduction in peripheral sensitivity contribute to the perceptual difficulties associated with hearing loss. Given that normal-hearing musicians have enhanced auditory perceptual skills, including speech-in-noise perception, coupled with heightened subcortical responses to speech, we aimed to determine whether similar advantages could be observed in middle-aged adults with hearing loss. Results indicate that musicians with hearing loss, despite self-perceptions of average performance for understanding speech in noise, have a greater ability to hear in noise relative to nonmusicians. This is accompanied by more robust subcortical encoding of sound (e.g., stimulus-to-response correlations and response consistency) as well as more resilient neural responses to speech in the presence of background noise (e.g., neural timing). Musicians with hearing loss also demonstrate unique neural signatures of spectral encoding relative to nonmusicians: enhanced neural encoding of the speech-sound's fundamental frequency but not of its upper harmonics. This stands in contrast to previous outcomes in normal-hearing musicians, who have enhanced encoding of the harmonics but not the fundamental frequency. Taken together, our data suggest that although hearing loss modifies a musician's spectral encoding of speech, the musician advantage for perceiving speech in noise persists in a hearing-impaired population by adaptively strengthening underlying neural mechanisms for speech-in-noise perception. Copyright © 2013 Elsevier B.V. All rights reserved.

  6. Speech profile of patients undergoing primary palatoplasty.

    PubMed

    Menegueti, Katia Ignacio; Mangilli, Laura Davison; Alonso, Nivaldo; Andrade, Claudia Regina Furquim de

    2017-10-26

    To characterize the profile and speech characteristics of patients undergoing primary palatoplasty in a Brazilian university hospital, considering the time of intervention (early, before two years of age; late, after two years of age). Participants were 97 patients of both genders with cleft palate and/or cleft lip and palate, assigned to the Speech-language Pathology Department, who had undergone primary palatoplasty and presented no prior history of speech-language therapy. Patients were divided into two groups: early intervention group (EIG) - 43 patients undergoing primary palatoplasty before 2 years of age and late intervention group (LIG) - 54 patients undergoing primary palatoplasty after 2 years of age. All patients underwent speech-language pathology assessment. The following parameters were assessed: resonance classification, presence of nasal turbulence, presence of weak intraoral air pressure, presence of audible nasal air emission, speech understandability, and compensatory articulation disorder (CAD). At a statistical significance level of 5% (p≤0.05), no significant difference was observed between the groups in the following parameters: resonance classification (p=0.067); level of hypernasality (p=0.113), presence of nasal turbulence (p=0.179); presence of weak intraoral air pressure (p=0.152); presence of nasal air emission (p=0.369), and speech understandability (p=0.113). The groups differed with respect to the presence of compensatory articulation disorders (p=0.020), with the LIG presenting a higher occurrence of altered phonemes. It was possible to assess the general profile and speech characteristics of the study participants. Patients submitted to early primary palatoplasty present a better speech profile.

  7. Brain 'talks over' boring quotes: top-down activation of voice-selective areas while listening to monotonous direct speech quotations.

    PubMed

    Yao, Bo; Belin, Pascal; Scheepers, Christoph

    2012-04-15

    In human communication, direct speech (e.g., Mary said, "I'm hungry") is perceived as more vivid than indirect speech (e.g., Mary said that she was hungry). This vividness distinction has previously been found to underlie silent reading of quotations: Using functional magnetic resonance imaging (fMRI), we found that direct speech elicited higher brain activity in the temporal voice areas (TVA) of the auditory cortex than indirect speech, consistent with an "inner voice" experience in reading direct speech. Here we show that listening to monotonously spoken direct versus indirect speech quotations also engenders differential TVA activity. This suggests that individuals engage in top-down simulations or imagery of enriched supra-segmental acoustic representations while listening to monotonous direct speech. The findings shed new light on the acoustic nature of the "inner voice" in understanding direct speech. Copyright © 2012 Elsevier Inc. All rights reserved.

  8. Speech Understanding and Sound Source Localization by Cochlear Implant Listeners Using a Pinna-Effect Imitating Microphone and an Adaptive Beamformer.

    PubMed

    Dorman, Michael F; Natale, Sarah; Loiselle, Louise

    2018-03-01

    Sentence understanding scores for patients with cochlear implants (CIs) when tested in quiet are relatively high. However, sentence understanding scores for patients with CIs plummet with the addition of noise. To assess, for patients with CIs (MED-EL), (1) the value to speech understanding of two new, noise-reducing microphone settings and (2) the effect of the microphone settings on sound source localization. Single-subject, repeated measures design. For tests of speech understanding, repeated measures on (1) number of CIs (one, two), (2) microphone type (omni, natural, adaptive beamformer), and (3) type of noise (restaurant, cocktail party). For sound source localization, repeated measures on type of signal (low-pass [LP], high-pass [HP], broadband noise). Ten listeners, ranging in age from 48 to 83 yr (mean = 57 yr), participated in this prospective study. Speech understanding was assessed in two noise environments using monaural and bilateral CIs fit with three microphone types. Sound source localization was assessed using three microphone types. In Experiment 1, sentence understanding scores (in terms of percent words correct) were obtained in quiet and in noise. For each patient, noise was first added to the signal to drive performance off of the ceiling in the bilateral CI-omni microphone condition. The other conditions were then administered at that signal-to-noise ratio in quasi-random order. In Experiment 2, sound source localization accuracy was assessed for three signal types using a 13-loudspeaker array over a 180° arc. The dependent measure was root-mean-square error. Both the natural and adaptive microphone settings significantly improved speech understanding in the two noise environments. The magnitude of the improvement varied between 16 and 19 percentage points for tests conducted in the restaurant environment and between 19 and 36 percentage points for tests conducted in the cocktail party environment. 
    In the restaurant and cocktail party environments, both the natural and adaptive settings, when implemented on a single CI, allowed scores that were as good as, or better than, scores in the bilateral omni test condition. Sound source localization accuracy was unaltered by either the natural or adaptive settings for LP, HP, or wideband noise stimuli. The data support the use of the natural microphone setting as a default setting. The natural setting (1) provides better speech understanding in noise than the omni setting, (2) does not impair sound source localization, and (3) retains low-frequency sensitivity to signals from the rear. Moreover, bilateral CIs equipped with adaptive beamforming technology can engender speech understanding scores in noise that fall only a little short of scores for a single CI in quiet. American Academy of Audiology
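The localization metric reported in this record (root-mean-square error over a 13-loudspeaker, 180° arc) can be sketched as follows. This is a minimal illustration, not code from the study; the function name and the assumption of 15° speaker spacing are ours.

```python
import math

def rms_localization_error(target_azimuths, response_azimuths):
    """Root-mean-square (RMS) error, in degrees, between target and response
    azimuths; lower values indicate more accurate localization."""
    if len(target_azimuths) != len(response_azimuths):
        raise ValueError("trial counts must match")
    squared = [(t - r) ** 2 for t, r in zip(target_azimuths, response_azimuths)]
    return math.sqrt(sum(squared) / len(squared))

# 13 loudspeakers spanning a 180-degree arc implies 15-degree spacing,
# i.e., azimuths -90, -75, ..., +90 degrees (our assumption).
speakers = [-90 + 15 * i for i in range(13)]
```

For example, if every response lands one loudspeaker (15°) away from its target, the RMS error is 15°.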

  9. Assessment of Functional Hearing in Greek-Speaking Children Diagnosed with Central Auditory Processing Disorder.

    PubMed

    Sidiras, Chris; Iliadou, Vasiliki Vivian; Chermak, Gail D; Nimatoudis, Ioannis

    2016-05-01

    Including speech recognition in noise testing in audiological evaluations may reveal functional hearing deficits that may otherwise remain undetected. The current study explored the potential utility of the Speech-in-Babble (SinB) test in the assessment of central auditory processing disorder (CAPD) in young children for whom diagnosis is challenging. A cross-sectional analysis. Forty-one Greek children 4-13 yr of age diagnosed with CAPD and exhibiting listening and academic problems (clinical group) and 20 age-matched controls with no listening or academic problems participated in the study. All participants' auditory processing was assessed using the same tests and instrumentation in a sound-treated room. Two equivalent lists of the SinB test, developed at the Psychoacoustic Laboratory of the Aristotle University of Thessaloniki, were administered monaurally in a counterbalanced order. SinB consists of lists of 50 phonetically balanced disyllabic words presented in background multitalker babble. Five signal-to-noise ratios (SNRs) were used in a fixed order. The children were instructed to repeat the word after each presentation. The SNR at which the child achieved 50% correct word identification served as the dependent variable or outcome measure, with higher SinB scores (measured in SNR dB) corresponding to poorer performance. SinB performance was better (lower SNR) for the normal control group versus the clinical group [F(1,35) = 43.03, p < 0.0001]. SinB inversely correlated with age for both CAPD and control groups (r = -0.648, p < 0.001 and r = -0.658, p < 0.005, respectively). Regression analysis revealed that linear models better explained the variance in the data than a quadratic model for both the control and CAPD groups. The slope (beta value of the linear model) was steeper for the clinical group compared to the control group (beta = -0.306 versus beta = -0.130, respectively). 
An analysis of covariance run with age as the covariate to assess the potential effect of comorbidity on SinB performance in children with CAPD with and without comorbid conditions revealed no significant differences between groups [F(1,38) = 0.149, p > 0.05]. This study offers the first detailed presentation of the performance of Greek children on a Greek language SinB test. The main finding is that SinB scores improved as a function of age in a constant manner as represented by the slope of the linear regression line for both CAPD and control groups. Results suggest that this speech recognition in competition test holds promise for differentiating typically developing Greek children from those children with CAPD across the age range studied here (4-13 yr). The SinB seemed rather immune to the presence of comorbid conditions presented by some of the children in this study, suggesting its potential utility as a valid measure of central auditory processing. While there are many speech-in-noise or competition tests in English, there are fewer in other languages. Tests like the SinB should be developed in other languages to ensure that children demonstrating "listening" problems can be properly evaluated. American Academy of Audiology.
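The outcome measure described above, the SNR at which a child reaches 50% correct word identification, is commonly estimated by interpolating between the two tested SNRs that bracket 50%. A minimal sketch of that interpolation follows; the function name and the example scores are illustrative assumptions, not data from the SinB study.

```python
def snr_at_50_percent(snrs, proportions_correct):
    """Estimate the SNR (dB) yielding 50% correct word identification by
    linear interpolation between the two tested SNRs that bracket 0.5."""
    pairs = sorted(zip(snrs, proportions_correct))
    for (s0, p0), (s1, p1) in zip(pairs, pairs[1:]):
        if min(p0, p1) <= 0.5 <= max(p0, p1):
            if p1 == p0:  # flat segment: take the midpoint
                return (s0 + s1) / 2
            return s0 + (0.5 - p0) * (s1 - s0) / (p1 - p0)
    raise ValueError("50% point is not bracketed by the tested SNRs")

# Hypothetical scores at five fixed SNRs (dB), mirroring the fixed-order protocol
snrs = [-9, -6, -3, 0, 3]
scores = [0.10, 0.30, 0.50, 0.70, 0.90]
```

With these hypothetical scores the listener reaches 50% at -3 dB SNR; under the SinB convention, a higher (less negative) result would indicate poorer performance.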

  10. Dopamine Regulation of Human Speech and Bird Song: A Critical Review

    ERIC Educational Resources Information Center

    Simonyan, Kristina; Horwitz, Barry; Jarvis, Erich D.

    2012-01-01

    To understand the neural basis of human speech control, extensive research has been done using a variety of methodologies in a range of experimental models. Nevertheless, several critical questions about learned vocal motor control still remain open. One of them is the mechanism(s) by which neurotransmitters, such as dopamine, modulate speech and…

  11. Co-Working: Parents' Conception of Roles in Supporting Their Children's Speech and Language Development

    ERIC Educational Resources Information Center

    Davies, Karen E.; Marshall, Julie; Brown, Laura J. E.; Goldbart, Juliet

    2017-01-01

    Speech and language therapists' (SLTs) roles include enabling parents to provide intervention. We know little about how parents understand their role during speech and language intervention or whether these change during involvement with SLTs. The theory of conceptual change, applied to parents as adult learners, is used as a framework for…

  12. Function as a Determinant of Speech Production--Implications for Language Development in Schools.

    ERIC Educational Resources Information Center

    Skull, John

    The function of speech and its implications for studying, understanding, and promoting language development are explored in this paper. Function is considered to be the purpose of the speaker when speaking, variously termed context of situation, situation, context, circumstance, or mode. It is noted that very few studies of speech and speech…

  13. Speech Perception with Music Maskers by Cochlear Implant Users and Normal-Hearing Listeners

    ERIC Educational Resources Information Center

    Eskridge, Elizabeth N.; Galvin, John J., III; Aronoff, Justin M.; Li, Tianhao; Fu, Qian-Jie

    2012-01-01

    Purpose: The goal of this study was to investigate how the spectral and temporal properties in background music may interfere with cochlear implant (CI) and normal-hearing listeners' (NH) speech understanding. Method: Speech-recognition thresholds (SRTs) were adaptively measured in 11 CI and 9 NH subjects. CI subjects were tested while using their…

  14. Cueing listeners to attend to a target talker progressively improves word report as the duration of the cue-target interval lengthens to 2,000 ms.

    PubMed

    Holmes, Emma; Kitterick, Padraig T; Summerfield, A Quentin

    2018-04-25

    Endogenous attention is typically studied by presenting instructive cues in advance of a target stimulus array. For endogenous visual attention, task performance improves as the duration of the cue-target interval increases up to 800 ms. Less is known about how endogenous auditory attention unfolds over time or the mechanisms by which an instructive cue presented in advance of an auditory array improves performance. The current experiment used five cue-target intervals (0, 250, 500, 1,000, and 2,000 ms) to compare four hypotheses for how preparatory attention develops over time in a multi-talker listening task. Young adults were cued to attend to a target talker who spoke in a mixture of three talkers. Visual cues indicated the target talker's spatial location or their gender. Participants directed attention to location and gender simultaneously ("objects") at all cue-target intervals. Participants were consistently faster and more accurate at reporting words spoken by the target talker when the cue-target interval was 2,000 ms than 0 ms. In addition, the latency of correct responses progressively shortened as the duration of the cue-target interval increased from 0 to 2,000 ms. These findings suggest that the mechanisms involved in preparatory auditory attention develop gradually over time, taking at least 2,000 ms to reach optimal configuration, yet providing cumulative improvements in speech intelligibility as the duration of the cue-target interval increases from 0 to 2,000 ms. These results demonstrate an improvement in performance for cue-target intervals longer than those that have been reported previously in the visual or auditory modalities.

  15. Automatic speech recognition (ASR) based approach for speech therapy of aphasic patients: A review

    NASA Astrophysics Data System (ADS)

    Jamal, Norezmi; Shanta, Shahnoor; Mahmud, Farhanahani; Sha'abani, MNAH

    2017-09-01

    This paper reviews the state of the art in automatic speech recognition (ASR)-based approaches for speech therapy of aphasic patients. Aphasia is a condition in which the affected person suffers from a speech and language disorder resulting from a stroke or brain injury. Since there is a growing body of evidence indicating the possibility of improving the symptoms at an early stage, ASR-based solutions are increasingly being researched for speech and language therapy. ASR is a technology that converts human speech into transcript text by matching it against the system's library. This is particularly useful in speech rehabilitation therapies, as it provides accurate, real-time evaluation of speech input from an individual with a speech disorder. ASR-based approaches for speech therapy recognize the speech input from the aphasic patient and provide real-time feedback on their mistakes. However, the accuracy of ASR depends on many factors, such as phoneme recognition, speech continuity, speaker and environmental differences, as well as the depth of our knowledge of human language understanding. Hence, the review examines recent developments in ASR technologies and their performance for individuals with speech and language disorders.

  16. Objective support for subjective reports of successful inner speech in two people with aphasia

    PubMed Central

    Hayward, William; Snider, Sarah F.; Luta, George; Friedman, Rhonda B.; Turkeltaub, Peter E.

    2016-01-01

    People with aphasia frequently report being able to say a word correctly in their heads, even if they are unable to say that word aloud. It is difficult to know what is meant by these reports of “successful inner speech”. We probe the experience of successful inner speech in two people with aphasia. We show that these reports are associated with correct overt speech and phonologically related nonwords errors, that they relate to word characteristics associated with ease of lexical access but not ease of production, and that they predict whether or not individual words are relearned during anomia treatment. These findings suggest that reports of successful inner speech are meaningful and may be useful to study self-monitoring in aphasia, to better understand anomia, and to predict treatment outcomes. Ultimately, the study of inner speech in people with aphasia could provide critical insights that inform our understanding of normal language. PMID:27469037

  17. Fluid Dynamics of Human Phonation and Speech

    NASA Astrophysics Data System (ADS)

    Mittal, Rajat; Erath, Byron D.; Plesniak, Michael W.

    2013-01-01

    This article presents a review of the fluid dynamics, flow-structure interactions, and acoustics associated with human phonation and speech. Our voice is produced through the process of phonation in the larynx, and an improved understanding of the underlying physics of this process is essential to advancing the treatment of voice disorders. Insights into the physics of phonation and speech can also contribute to improved vocal training and the development of new speech compression and synthesis schemes. This article introduces the key biomechanical features of the laryngeal physiology, reviews the basic principles of voice production, and summarizes the progress made over the past half-century in understanding the flow physics of phonation and speech. Laryngeal pathologies, which significantly enhance the complexity of phonatory dynamics, are discussed. After a thorough examination of the state of the art in computational modeling and experimental investigations of phonatory biomechanics, we present a synopsis of the pacing issues in this arena and an outlook for research in this fascinating subject.

  18. Relationships among psychoacoustic judgments, speech understanding ability and self-perceived handicap in tinnitus subjects.

    PubMed

    Newman, C W; Wharton, J A; Shivapuja, B G; Jacobson, G P

    1994-01-01

    Tinnitus is often a disturbing symptom that affects 6-20% of the population. Relationships among tinnitus pitch and loudness judgments, audiometric speech understanding measures, and self-perceived handicap were evaluated in a sample of subjects with tinnitus and hearing loss (THL). Data obtained from the THL sample on the audiometric speech measures were compared to the performance of an age-matched hearing loss only (HL) group. Both groups had normal hearing through 1 kHz with a sloping configuration of ≤20 dB/octave between 2 and 12 kHz. The THL subjects performed more poorly on the low-predictability items of the Speech Perception in Noise Test, suggesting that tinnitus may interfere with the perception of speech signals having reduced linguistic redundancy. The THL subjects rated their tinnitus as annoying at relatively low sensation levels using the pitch-match frequency as the reference tone. Further, significant relationships were found between loudness judgment measures and self-rated annoyance. No predictable relationships were observed between the audiometric speech measures and perceived handicap using the Tinnitus Handicap Questionnaire. These findings support the use of self-report measures in tinnitus patients, in that audiometric speech tests alone may be insufficient to describe an individual's reaction to his or her communication breakdowns.

  19. New Perspectives on Assessing Amplification Effects

    PubMed Central

    Souza, Pamela E.; Tremblay, Kelly L.

    2006-01-01

    Clinicians have long been aware of the range of performance variability with hearing aids. Despite improvements in technology, there remain many instances of well-selected and appropriately fitted hearing aids whereby the user reports minimal improvement in speech understanding. This review presents a multistage framework for understanding how a hearing aid affects performance. Six stages are considered: (1) acoustic content of the signal, (2) modification of the signal by the hearing aid, (3) interaction between sound at the output of the hearing aid and the listener's ear, (4) integrity of the auditory system, (5) coding of available acoustic cues by the listener's auditory system, and (6) correct identification of the speech sound. Within this framework, this review describes methodology and research on 2 new assessment techniques: acoustic analysis of speech measured at the output of the hearing aid and auditory evoked potentials recorded while the listener wears hearing aids. Acoustic analysis topics include the relationship between conventional probe microphone tests and probe microphone measurements using speech, appropriate procedures for such tests, and assessment of signal-processing effects on speech acoustics and recognition. Auditory evoked potential topics include an overview of physiologic measures of speech processing and the effect of hearing loss and hearing aids on cortical auditory evoked potential measurements in response to speech. Finally, the clinical utility of these procedures is discussed. PMID:16959734

  20. Crosslinguistic application of English-centric rhythm descriptors in motor speech disorders.

    PubMed

    Liss, Julie M; Utianski, Rene; Lansford, Kaitlin

    2013-01-01

    Rhythmic disturbances are a hallmark of motor speech disorders, in which the motor control deficits interfere with the outward flow of speech and by extension speech understanding. As the functions of rhythm are language-specific, breakdowns in rhythm should have language-specific consequences for communication. The goals of this paper are to (i) provide a review of the cognitive-linguistic role of rhythm in speech perception in a general sense and crosslinguistically; (ii) present new results of lexical segmentation challenges posed by different types of dysarthria in American English, and (iii) offer a framework for crosslinguistic considerations for speech rhythm disturbances in the diagnosis and treatment of communication disorders associated with motor speech disorders. This review presents theoretical and empirical reasons for considering speech rhythm as a critical component of communication deficits in motor speech disorders, and addresses the need for crosslinguistic research to explore language-universal versus language-specific aspects of motor speech disorders. Copyright © 2013 S. Karger AG, Basel.

  1. Crosslinguistic Application of English-Centric Rhythm Descriptors in Motor Speech Disorders

    PubMed Central

    Liss, Julie M.; Utianski, Rene; Lansford, Kaitlin

    2014-01-01

    Background Rhythmic disturbances are a hallmark of motor speech disorders, in which the motor control deficits interfere with the outward flow of speech and by extension speech understanding. As the functions of rhythm are language-specific, breakdowns in rhythm should have language-specific consequences for communication. Objective The goals of this paper are to (i) provide a review of the cognitive-linguistic role of rhythm in speech perception in a general sense and crosslinguistically; (ii) present new results of lexical segmentation challenges posed by different types of dysarthria in American English, and (iii) offer a framework for crosslinguistic considerations for speech rhythm disturbances in the diagnosis and treatment of communication disorders associated with motor speech disorders. Summary This review presents theoretical and empirical reasons for considering speech rhythm as a critical component of communication deficits in motor speech disorders, and addresses the need for crosslinguistic research to explore language-universal versus language-specific aspects of motor speech disorders. PMID:24157596

  2. "When He's around His Brothers ... He's Not so Quiet": The Private and Public Worlds of School-Aged Children with Speech Sound Disorder

    ERIC Educational Resources Information Center

    McLeod, Sharynne; Daniel, Graham; Barr, Jacqueline

    2013-01-01

    Children interact with people in many contexts, including home, school, and the community. Understanding children's relationships within context is important for supporting children's development. Using child-friendly methodologies, the purpose of this research was to understand the lives of children with speech sound disorder (SSD) in context.…

  3. Review of Speech-to-Text Recognition Technology for Enhancing Learning

    ERIC Educational Resources Information Center

    Shadiev, Rustam; Hwang, Wu-Yuin; Chen, Nian-Shing; Huang, Yueh-Min

    2014-01-01

    This paper reviewed literature from 1999 to 2014 inclusive on how Speech-to-Text Recognition (STR) technology has been applied to enhance learning. The first aim of this review is to understand how STR technology has been used to support learning over the past fifteen years, and the second is to analyze all research evidence to understand how…

  4. Maynard Dixon: "Free Speech."

    ERIC Educational Resources Information Center

    Day, Michael

    1987-01-01

    Based on Maynard Dixon's oil painting, "Free Speech," this lesson attempts to expand high school students' understanding of art as a social commentary and the use of works of art to convey ideas and ideals. (JDH)

  5. [Occurrence of child abuse: knowledge and possibility of action of speech-language pathologists].

    PubMed

    Noguchi, Milica Satake; de Assis, Simone Gonçalves; Malaquias, Juaci Vitória

    2006-01-01

    This work presents the results of an epidemiological survey about the professional experience of Speech-Language Pathologists and Audiologists of Rio de Janeiro (Brazil) with children and adolescents who are victims of domestic violence. The aim was to understand the occurrence of abuse and neglect among children and adolescents treated by speech-language pathologists, characterizing the victims according to: most affected age group, gender, form of violence, aggressor, most frequent speech-language complaint, how the abuse was identified, and follow-up. 500 self-administered mail surveys were sent to a random sample of professionals living in Rio de Janeiro. The survey forms were identified only by numbers to assure anonymity. 224 completed surveys were mailed back. 54 respondents indicated exposure to at least one incident of abuse. The majority of victims were children, the main abuser was the mother, and physical violence was the most frequent form of abuse. The main speech disorder was late language development. In most cases, the victim himself told the therapist about the abuse, through verbal expression or other means of expression such as drawing, storytelling, dramatizing or playing. As the majority of the victims abandoned speech-language therapy, it was not possible to follow up the cases. Due to the importance of this issue and the limited Brazilian literature on Speech-Language and Hearing Sciences and child abuse, it is paramount to invest in the training of speech-language pathologists. It is the duty of speech-language pathologists to expose this complex problem and to give voice to children who are victims of violence, understanding that behind a speech-language complaint there might be a cry for help.

  6. [Influence of mixing ratios of a FM-system on speech understanding of CI-users].

    PubMed

    Hey, M; Anft, D; Hocke, T; Scholz, G; Hessel, H; Begall, K

    2009-05-01

    In the classroom there are two major acoustic situations: first, the teacher is talking while pupils make noise, and second, a pupil is talking while the other pupils make noise. For hearing-impaired patients with a cochlear implant (CI), the understanding of words and sentences in noise can be improved by using an FM system. The aim of this study was to test speech understanding in these situations as a function of the mixing ratio between the FM input and the microphone input to the speech processor. Speech understanding was evaluated using the adaptive Oldenburg sentence test (OLSA) in background noise. CI patients used the Microlink FM receiver for Freedom CIs together with a Campus transmitter (Phonak AG). 17 postlingually deafened adults with unilateral Freedom cochlear implant systems (Cochlear Ltd) were tested, and a group of eight normally hearing adults served as controls in the same setup. The median speech-reception threshold of L(50)=1.6 dB in CI patients without an FM system was higher than the median of L(50)=-13 dB in normally hearing subjects. Sentence recognition in CI patients with the FM system increased with increasing mixing ratio, and the FM system provided a clear benefit for understanding the teacher at every mixing ratio. The difference between the L(50) values with and without the FM system was 15 dB for the 3:1 mixing ratio (FM to microphone). Assuming an increase of 15% intelligibility per dB on the OLSA (at L(50)) in CI patients, this 15 dB difference corresponds to a calculated advantage of 225%. Speech understanding in the second condition ("pupil is talking"), however, remained nearly the same at all mixing ratios, with no statistically significant difference between the situations with and without the FM system. The two investigated listening conditions thus yielded different results: understanding in the "teacher is talking" situation increased with increasing mixing ratio (FM to microphone), while understanding in the "pupil is talking" situation remained at the same level, so no single FM setting was optimal for both listening conditions. This leads to different recommendations for different listening conditions. All patients showed increased speech understanding in noisy environments, a result that strongly encourages the use of an FM system in the classroom.
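    The advantage figure quoted in this abstract follows from simple arithmetic; the sketch below reproduces it under the abstract's stated assumption of a roughly 15% per dB psychometric slope near L(50) (the function name is illustrative, not from the study):

```python
# Sketch of the benefit calculation from the abstract above.
# Assumptions (taken from the abstract): the OLSA psychometric function has a
# slope of about 15% intelligibility per dB near L(50), and the FM system
# shifted L(50) by 15 dB at a 3:1 (FM : microphone) mixing ratio.

def fm_advantage_percent(l50_shift_db, slope_percent_per_db=15.0):
    """Approximate intelligibility advantage implied by a shift in L(50),
    assuming a locally linear psychometric function."""
    return l50_shift_db * slope_percent_per_db

print(fm_advantage_percent(15.0))  # → 225.0, the figure quoted in the abstract
```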

  7. Can you see what I am talking about? Human speech triggers referential expectation in four-month-old infants

    PubMed Central

    Marno, Hanna; Farroni, Teresa; Vidal Dos Santos, Yamil; Ekramnia, Milad; Nespor, Marina; Mehler, Jacques

    2015-01-01

    Infants’ sensitivity to selectively attend to human speech and to process it in a unique way has been widely reported in the past. However, in order to successfully acquire language, one should also understand that speech is referential and that words can stand for other entities in the world. While there has been some evidence showing that young infants can make inferences about the communicative intentions of a speaker, whether they also appreciate the direct relationship between a specific word and its referent is still unknown. In the present study we tested four-month-old infants to see whether they would expect to find a referent when they hear human speech. Our results showed that, compared to other auditory stimuli or to silence, when infants were listening to speech they were more prepared to find visual referents of the words, as signalled by their faster orienting towards the visual objects. Hence, our study is the first to report evidence that infants at a very young age already understand the referential relationship between auditory words and physical objects, thus showing a precursor of an appreciation of the symbolic nature of language, even if they do not yet understand the meanings of words. PMID:26323990

  8. On the importance of early reflections for speech in rooms.

    PubMed

    Bradley, J S; Sato, H; Picard, M

    2003-06-01

    This paper presents the results of new studies based on speech intelligibility tests in simulated sound fields and analyses of impulse response measurements in rooms used for speech communication. The speech intelligibility test results confirm the importance of early reflections for achieving good conditions for speech in rooms. The addition of early reflections increased the effective signal-to-noise ratio and related speech intelligibility scores for both impaired and nonimpaired listeners. The new results also show that for common conditions where the direct sound is reduced, it is only possible to understand speech because of the presence of early reflections. Analyses of measured impulse responses in rooms intended for speech show that early reflections can increase the effective signal-to-noise ratio by up to 9 dB. A room acoustics computer model is used to demonstrate that the relative importance of early reflections can be influenced by the room acoustics design.
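    The up-to-9 dB figure reported above corresponds to energetic summation of the direct sound and early reflections; a minimal sketch of that relation (an illustration of the principle, not the paper's analysis code):

```python
import math

def effective_snr_gain_db(direct_energy, early_reflection_energy):
    """Gain in effective signal-to-noise ratio (dB) obtained when early
    reflections are combined energetically with the direct sound."""
    return 10.0 * math.log10((direct_energy + early_reflection_energy) / direct_energy)

# A ~9 dB gain implies early-reflection energy roughly 7x the direct-sound energy:
print(round(effective_snr_gain_db(1.0, 7.0), 2))  # → 9.03
```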

  9. Auditory Verbal Working Memory as a Predictor of Speech Perception in Modulated Maskers in Listeners with Normal Hearing

    ERIC Educational Resources Information Center

    Millman, Rebecca E.; Mattys, Sven L.

    2017-01-01

    Purpose: Background noise can interfere with our ability to understand speech. Working memory capacity (WMC) has been shown to contribute to the perception of speech in modulated noise maskers. WMC has been assessed with a variety of auditory and visual tests, often pertaining to different components of working memory. This study assessed the…

  10. The Listening Ear: The Development of Speech as a Creative Influence in Education (Learning Resource Series).

    ERIC Educational Resources Information Center

    McAllen, Audrey E.

    This book gives teachers an understanding of speech training through specially selected exercises. The book's exercises aim to help develop clear speaking in the classroom. Methodically and perceptively used, the book will assist those concerned with the creative powers of speech as a teaching art. In Part 1, there are sections on the links…

  11. Why Do Speech and Language Therapists Stay in, Leave and (Sometimes) Return to the National Health Service (NHS)?

    ERIC Educational Resources Information Center

    Loan-Clarke, John; Arnold, John; Coombs, Crispin; Bosley, Sara; Martin, Caroline

    2009-01-01

    Background: Research into recruitment, retention and return of speech and language therapists in the National Health Service (NHS) is relatively limited, particularly in respect of understanding the factors that drive employment choice decisions. Aims: To identify what factors influence speech and language therapists working in the NHS to stay,…

  12. Tracking the Speech Signal--Time-Locked MEG Signals during Perception of Ultra-Fast and Moderately Fast Speech in Blind and in Sighted Listeners

    ERIC Educational Resources Information Center

    Hertrich, Ingo; Dietrich, Susanne; Ackermann, Hermann

    2013-01-01

    Blind people can learn to understand speech at ultra-high syllable rates (ca. 20 syllables/s), a capability associated with hemodynamic activation of the central-visual system. To further elucidate the neural mechanisms underlying this skill, magnetoencephalographic (MEG) measurements during listening to sentence utterances were cross-correlated…

  13. Digitized Ethnic Hate Speech: Understanding Effects of Digital Media Hate Speech on Citizen Journalism in Kenya

    ERIC Educational Resources Information Center

    Kimotho, Stephen Gichuhi; Nyaga, Rahab Njeri

    2016-01-01

    Ethnicity in Kenya permeates all spheres of life. However, it is in politics that ethnicity is most visible. Election time in Kenya often leads to ethnic competition and hatred, often expressed through various media. Ethnic hate speech characterized the 2007 general elections in party rallies and through text messages, emails, posters and…

  14. Do Native Speakers of North American and Singapore English Differentially Perceive Comprehensibility in Second Language Speech?

    ERIC Educational Resources Information Center

    Saito, Kazuya; Shintani, Natsuko

    2016-01-01

    The current study examined the extent to which native speakers of North American and Singapore English differentially perceive the comprehensibility (ease of understanding) of second language (L2) speech. Spontaneous speech samples elicited from 50 Japanese learners of English with various proficiency levels were first rated by 10 Canadian and 10…

  15. Alignment of classification paradigms for communication abilities in children with cerebral palsy

    PubMed Central

    Hustad, Katherine C.; Oakes, Ashley; McFadd, Emily; Allison, Kristen M.

    2015-01-01

    Aim We examined three communication ability classification paradigms for children with cerebral palsy (CP): the Communication Function Classification System (CFCS), the Viking Speech Scale (VSS), and the Speech Language Profile Groups (SLPG). Questions addressed inter-judge reliability, whether the VSS and the CFCS captured impairments in speech and language, and whether there were differences in speech intelligibility among levels within each classification paradigm. Method 80 children (42 males) with a range of types and severity levels of CP participated (mean age, 60 months; SD 4.8 months). Two speech-language pathologists classified each child via parent-child interaction samples and previous experience with the children for the CFCS and VSS, and using quantitative speech and language assessment data for the SLPG. Intelligibility scores were obtained using standard clinical intelligibility measurement. Results Kappa values were .67 (95% CI [.55, .79]) for the CFCS, .82 (95% CI [.72, .92]) for the VSS, and .95 (95% CI [.72, .92]) for the SLPG. Descriptively, reliability within levels of each paradigm varied, with the lowest agreement occurring within the CFCS at levels II (42%), III (40%), and IV (61%). Neither the CFCS nor the VSS was sensitive to language impairments captured by the SLPG. Significant differences in speech intelligibility were found among levels for all classification paradigms. Interpretation Multiple tools are necessary to understand speech, language, and communication profiles in children with CP. Characterization of abilities at all levels of the ICF will advance our understanding of the ways that speech, language, and communication abilities present in children with CP. PMID:26521844
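    The inter-judge reliability figures above are Cohen's kappa values; a minimal sketch of the statistic follows, with invented ratings for illustration (not the study's data):

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa: observed agreement between two judges, corrected for
    the agreement expected by chance from each judge's marginal frequencies."""
    assert len(ratings_a) == len(ratings_b) and ratings_a
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical classifications of eight children into levels I-IV by two judges:
judge1 = ["I", "II", "II", "III", "I", "IV", "II", "III"]
judge2 = ["I", "II", "III", "III", "I", "IV", "II", "II"]
print(round(cohens_kappa(judge1, judge2), 2))  # → 0.65
```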

  16. Phenotype of FOXP2 haploinsufficiency in a mother and son.

    PubMed

    Rice, Gregory M; Raca, Gordana; Jakielski, Kathy J; Laffin, Jennifer J; Iyama-Kurtycz, Christina M; Hartley, Sigan L; Sprague, Rae E; Heintzelman, Anne T; Shriberg, Lawrence D

    2012-01-01

    Disruptions in FOXP2, a transcription factor, are the only known monogenic cause of speech and language impairment. We report on clinical findings for two new individuals with a submicroscopic deletion of FOXP2: a boy with severe apraxia of speech and his currently moderately affected mother. A 1.57 Mb deletion on chromosome 7q31 was detected by array comparative genomic hybridization (aCGH). In addition to FOXP2, the patients' deletion involves two other genes, MDFIC and PPP1R3A, neither of which has been associated with speech or language disorders. Thus, findings for these two family members provide informative phenotypic information on FOXP2 haploinsufficiency. Evaluation by a clinical geneticist indicated no major congenital anomalies or dysmorphic features. Evaluations by a clinical psychologist and occupational therapist indicated cognitive-linguistic processing and sensorimotor control deficits, but did not support a diagnosis of autism spectrum disorder. Evaluation by clinical and research speech pathologists confirmed that both patients' speech deficits met contemporary criteria for apraxia of speech. Notably, the patients were not able to laugh, cough, or sneeze spontaneously, replicating findings reported for two other FOXP2 cases and a potential diagnostic sign of nonsyndromic apraxia of speech. Speech severity findings for the boy were not consistent with the hypothesis that loss of maternal FOXP2 should be relatively benign. Better understanding of the behavioral phenotype of FOXP2 disruptions will aid identification of patients, toward an eventual understanding of the pathophysiology of syndromic and nonsyndromic apraxia of speech. Copyright © 2011 Wiley Periodicals, Inc.

  17. The impact of cochlear implantation on speech understanding, subjective hearing performance, and tinnitus perception in patients with unilateral severe to profound hearing loss.

    PubMed

    Távora-Vieira, Dayse; Marino, Roberta; Acharya, Aanand; Rajan, Gunesh P

    2015-03-01

    This study aimed to determine the impact of cochlear implantation on speech understanding in noise, subjective perception of hearing, and tinnitus perception of adult patients with unilateral severe to profound hearing loss and to investigate whether duration of deafness and age at implantation would influence the outcomes. In addition, this article describes the auditory training protocol used for unilaterally deaf patients. This is a prospective study of subjects undergoing cochlear implantation for unilateral deafness with or without associated tinnitus. Speech perception in noise was tested using the Bamford-Kowal-Bench speech-in-noise test presented at 65 dB SPL. The Speech, Spatial, and Qualities of Hearing Scale and the Abbreviated Profile of Hearing Aid Benefit were used to evaluate the subjective perception of hearing with a cochlear implant and quality of life. Tinnitus disturbance was measured using the Tinnitus Reaction Questionnaire. Data were collected before cochlear implantation and 3, 6, 12, and 24 months after implantation. Twenty-eight postlingual unilaterally deaf adults with or without tinnitus were implanted. There was a significant improvement in speech perception in noise across time in all spatial configurations. There was an overall significant improvement on the subjective perception of hearing and quality of life. Tinnitus disturbance reduced significantly across time. Age at implantation and duration of deafness did not influence the outcomes significantly. Cochlear implantation provided significant improvement in speech understanding in challenging situations, subjective perception of hearing performance, and quality of life. Cochlear implantation also resulted in reduced tinnitus disturbance. Age at implantation and duration of deafness did not seem to influence the outcomes.

  18. Impact of Hearing Aid Technology on Outcomes in Daily Life II: Speech Understanding and Listening Effort.

    PubMed

    Johnson, Jani A; Xu, Jingjing; Cox, Robyn M

    2016-01-01

    Modern hearing aid (HA) devices include a collection of acoustic signal-processing features designed to improve listening outcomes in a variety of daily auditory environments. Manufacturers market these features at successive levels of technological sophistication. The features included in costlier premium hearing devices are designed to result in further improvements to daily listening outcomes compared with the features included in basic hearing devices. However, independent research has not substantiated such improvements. This research was designed to explore differences in speech-understanding and listening-effort outcomes for older adults using premium-feature and basic-feature HAs in their daily lives. For this participant-blinded, repeated crossover trial, 45 older adults (mean age 70.3 years) with mild-to-moderate sensorineural hearing loss wore each of four pairs of bilaterally fitted HAs for 1 month. HAs were premium- and basic-feature devices from two major brands. After each 1-month trial, participants' speech-understanding and listening-effort outcomes were evaluated in the laboratory and in daily life. Three types of speech-understanding and listening-effort data were collected: measures of laboratory performance, responses to standardized self-report questionnaires, and participant diary entries about daily communication. The only statistically significant superiority for the premium-feature HAs occurred for listening effort in the loud laboratory condition and was demonstrated for only one of the tested brands. The predominant complaint of older adults with mild-to-moderate hearing impairment is difficulty understanding speech in various settings. 
The combined results of all the outcome measures used in this research suggest that, when fitted using scientifically based practices, both premium- and basic-feature HAs are capable of providing considerable, but essentially equivalent, improvements to speech understanding and listening effort in daily life for this population. For HA providers to make evidence-based recommendations to their clientele with hearing impairment it is essential that further independent research investigates the relative benefit/deficit of different levels of hearing technology across brands and manufacturers in these and other real-world listening domains.

  19. Unilateral Hearing Loss: Understanding Speech Recognition and Localization Variability - Implications for Cochlear Implant Candidacy

    PubMed Central

    Firszt, Jill B.; Reeder, Ruth M.; Holden, Laura K.

    2016-01-01

    Objectives At a minimum, unilateral hearing loss (UHL) impairs sound localization ability and understanding speech in noisy environments, particularly if the loss is severe to profound. Accompanying the numerous negative consequences of UHL is considerable unexplained individual variability in the magnitude of its effects. Identification of co-variables that affect outcome and contribute to variability in UHLs could augment counseling, treatment options, and rehabilitation. Cochlear implantation as a treatment for UHL is on the rise, yet little is known about factors that could impact performance or whether there is a group at risk for poor cochlear implant outcomes when hearing is near-normal in one ear. The overall goal of our research is to investigate the range and source of variability in speech recognition in noise and localization among individuals with severe to profound UHL and thereby help determine factors relevant to decisions regarding cochlear implantation in this population. Design The present study evaluated adults with severe to profound UHL and adults with bilateral normal hearing. Measures included adaptive sentence understanding in diffuse restaurant noise, localization, roving-source speech recognition (words from 1 of 15 speakers in a 140° arc), and an adaptive speech-reception threshold psychoacoustic task with varied noise types and noise-source locations. There were three age-gender-matched groups: UHL (severe to profound hearing loss in one ear and normal hearing in the contralateral ear), normal hearing listening bilaterally, and normal hearing listening unilaterally. Results Although the normal-hearing-bilateral group scored significantly better and had less performance variability than UHLs on all measures, some UHL participants scored within the range of the normal-hearing-bilateral group on all measures. 
The normal-hearing participants listening unilaterally had better monosyllabic word understanding than UHLs for words presented on the blocked/deaf side but not the open/hearing side. In contrast, UHLs localized better than the normal hearing unilateral listeners for stimuli on the open/hearing side but not the blocked/deaf side. This suggests that UHLs had learned strategies for improved localization on the side of the intact ear. The UHL and unilateral normal hearing participant groups were not significantly different for speech-in-noise measures. UHL participants with childhood rather than recent hearing loss onset localized significantly better; however, these two groups did not differ for speech recognition in noise. Age at onset in UHL adults appears to affect localization ability differently than understanding speech in noise. Hearing thresholds were significantly correlated with speech recognition for UHL participants but not the other two groups. Conclusions Auditory abilities of UHLs varied widely and could be explained only in part by hearing threshold levels. Age at onset and length of hearing loss influenced performance on some, but not all measures. Results support the need for a revised and diverse set of clinical measures, including sound localization, understanding speech in varied environments and careful consideration of functional abilities as individuals with severe to profound UHL are being considered potential cochlear implant candidates. PMID:28067750

  20. Genetics Home Reference: CHMP2B-related frontotemporal dementia

    MedlinePlus

    ... CHMP2B -related frontotemporal dementia develop progressive problems with speech and language (aphasia). They may have trouble speaking, although they can often understand others' speech and written text. Affected individuals may also have ...

  1. Reliance on auditory feedback in children with childhood apraxia of speech.

    PubMed

    Iuzzini-Seigel, Jenya; Hogan, Tiffany P; Guarino, Anthony J; Green, Jordan R

    2015-01-01

    Children with childhood apraxia of speech (CAS) have been hypothesized to continuously monitor their speech through auditory feedback to minimize speech errors. We used an auditory masking paradigm to determine the effect of attenuating auditory feedback on speech in 30 children: 9 with CAS, 10 with speech delay, and 11 with typical development. The masking only affected the speech of children with CAS as measured by voice onset time and vowel space area. These findings provide preliminary support for greater reliance on auditory feedback among children with CAS. Readers of this article should be able to (i) describe the motivation for investigating the role of auditory feedback in children with CAS; (ii) report the effects of feedback attenuation on speech production in children with CAS, speech delay, and typical development, and (iii) understand how the current findings may support a feedforward program deficit in children with CAS. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  2. Speech perception: Some new directions in research and theory

    PubMed Central

    Pisoni, David B.

    2012-01-01

    The perception of speech is one of the most fascinating attributes of human behavior; both the auditory periphery and higher centers help define the parameters of sound perception. In this paper some of the fundamental perceptual problems facing speech sciences are described. The paper focuses on several of the new directions speech perception research is taking to solve these problems. Recent developments suggest that major breakthroughs in research and theory will soon be possible. Current research on segmentation, invariance, and normalization is described. The paper summarizes some of the new techniques used to understand auditory perception of speech signals and their linguistic significance to the human listener. PMID:4031245

  3. Child Speech, Language and Communication Need Re-Examined in a Public Health Context: A New Direction for the Speech and Language Therapy Profession

    ERIC Educational Resources Information Center

    Law, James; Reilly, Sheena; Snow, Pamela C.

    2013-01-01

    Background: Historically, speech and language therapy services for children have been framed within a rehabilitative framework with explicit assumptions made about providing therapy to individuals. While this is clearly important in many cases, we argue that this model needs revisiting for a number of reasons. First, our understanding of the nature…

  4. Between-Word Processes in Children with Speech Difficulties: Insights from a Usage-Based Approach to Phonology

    ERIC Educational Resources Information Center

    Newton, Caroline

    2012-01-01

    There are some children with speech and/or language difficulties who are significantly more difficult to understand in connected speech than in single words. The study reported here explores the between-word behaviours of three such children, aged 11;8, 12;2 and 12;10. It focuses on whether these patterns could be accounted for by lenition, as…

  5. Modeling Speech Level as a Function of Background Noise Level and Talker-to-Listener Distance for Talkers Wearing Hearing Protection Devices

    ERIC Educational Resources Information Center

    Bouserhal, Rachel E.; Bockstael, Annelies; MacDonald, Ewen; Falk, Tiago H.; Voix, Jérémie

    2017-01-01

    Purpose: Studying the variations in speech levels with changing background noise level and talker-to-listener distance for talkers wearing hearing protection devices (HPDs) can aid in understanding communication in background noise. Method: Speech was recorded using an intra-aural HPD from 12 different talkers at 5 different distances in 3…

  6. USSR Report, Cybernetics Computers and Automation Technology

    DTIC Science & Technology

    1985-09-05

    understand each other excellently, although in their speech they frequently omit, it would seem, needed words. However, the life experience of the...participants in a conversation and their perception of voice intonations and gestures make it possible to fill in the missing elements of speech ...the Soviet Union. Comrade M. S. Gorbachev’s speech pointed out that microelectronics, computer technology, instrument building and the whole

  7. The Effect of Conventional and Transparent Surgical Masks on Speech Understanding in Individuals with and without Hearing Loss.

    PubMed

    Atcherson, Samuel R; Mendel, Lisa Lucks; Baltimore, Wesley J; Patro, Chhayakanta; Lee, Sungmin; Pousson, Monique; Spann, M Joshua

    2017-01-01

    It is generally well known that speech perception is often improved with integrated audiovisual input whether in quiet or in noise. In many health-care environments, however, conventional surgical masks block visual access to the mouth and obscure other potential facial cues. In addition, these environments can be noisy. Although these masks may not alter the acoustic properties, the presence of noise in addition to the lack of visual input can have a deleterious effect on speech understanding. A transparent ("see-through") surgical mask may help to overcome this issue. To compare the effect of noise and various visual input conditions on speech understanding for listeners with normal hearing (NH) and hearing impairment using different surgical masks. Participants were assigned to one of three groups based on hearing sensitivity in this quasi-experimental, cross-sectional study. A total of 31 adults participated in this study: one talker, ten listeners with NH, ten listeners with moderate sensorineural hearing loss, and ten listeners with severe-to-profound hearing loss. Selected lists from the Connected Speech Test were digitally recorded with and without surgical masks and then presented to the listeners at 65 dB HL in five conditions against a background of four-talker babble (+10 dB SNR): without a mask (auditory only), without a mask (auditory and visual), with a transparent mask (auditory only), with a transparent mask (auditory and visual), and with a paper mask (auditory only). A significant difference was found in the spectral analyses of the speech stimuli with and without the masks; however, the difference was no more than ∼2 dB root mean square. Listeners with NH performed consistently well across all conditions. Both groups of listeners with hearing impairment benefitted from visual input from the transparent mask. The magnitude of improvement in speech perception in noise was greatest for the severe-to-profound group. 
Findings confirm improved speech perception performance in noise for listeners with hearing impairment when visual input is provided using a transparent surgical mask. Most importantly, the use of the transparent mask did not negatively affect speech perception performance in noise. American Academy of Audiology

  8. Magnified Neural Envelope Coding Predicts Deficits in Speech Perception in Noise.

    PubMed

    Millman, Rebecca E; Mattys, Sven L; Gouws, André D; Prendergast, Garreth

    2017-08-09

    Verbal communication in noisy backgrounds is challenging. Understanding speech in background noise that fluctuates in intensity over time is particularly difficult for hearing-impaired listeners with a sensorineural hearing loss (SNHL). The reduction in fast-acting cochlear compression associated with SNHL exaggerates the perceived fluctuations in intensity in amplitude-modulated sounds. SNHL-induced changes in the coding of amplitude-modulated sounds may have a detrimental effect on the ability of SNHL listeners to understand speech in the presence of modulated background noise. To date, direct evidence for a link between magnified envelope coding and deficits in speech identification in modulated noise has been absent. Here, magnetoencephalography was used to quantify the effects of SNHL on phase locking to the temporal envelope of modulated noise (envelope coding) in human auditory cortex. Our results show that SNHL enhances the amplitude of envelope coding in posteromedial auditory cortex, whereas it enhances the fidelity of envelope coding in posteromedial and posterolateral auditory cortex. This dissociation was more evident in the right hemisphere, demonstrating functional lateralization in enhanced envelope coding in SNHL listeners. However, enhanced envelope coding was not perceptually beneficial. Our results also show that both hearing thresholds and, to a lesser extent, magnified cortical envelope coding in left posteromedial auditory cortex predict speech identification in modulated background noise. We propose a framework in which magnified envelope coding in posteromedial auditory cortex disrupts the segregation of speech from background noise, leading to deficits in speech perception in modulated background noise. SIGNIFICANCE STATEMENT People with hearing loss struggle to follow conversations in noisy environments. Background noise that fluctuates in intensity over time poses a particular challenge. 
Using magnetoencephalography, we demonstrate anatomically distinct cortical representations of modulated noise in normal-hearing and hearing-impaired listeners. This work provides the first link among hearing thresholds, the amplitude of cortical representations of modulated sounds, and the ability to understand speech in modulated background noise. In light of previous work, we propose that magnified cortical representations of modulated sounds disrupt the separation of speech from modulated background noise in auditory cortex. Copyright © 2017 Millman et al.
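    The "background noise that fluctuates in intensity over time" described above is typically constructed as a noise carrier multiplied by a low-rate sinusoidal envelope. The sketch below is an illustrative reconstruction of such a stimulus, not the study's materials; the 8 Hz rate, 100% modulation depth, and sampling rate are assumptions.

```python
import numpy as np

fs, dur = 16000, 1.0                        # sampling rate (Hz), duration (s) — assumed values
t = np.arange(int(fs * dur)) / fs
rng = np.random.default_rng(0)
carrier = rng.standard_normal(t.size)       # broadband noise carrier

m, fm = 1.0, 8.0                            # modulation depth (100%) and rate (Hz) — assumed
envelope = 1.0 + m * np.sin(2 * np.pi * fm * t)
am_noise = carrier * envelope               # sinusoidally amplitude-modulated noise

# Conventional modulation depth: (max - min) / (max + min) of the envelope
depth = (envelope.max() - envelope.min()) / (envelope.max() + envelope.min())
```

    At 100% depth the envelope swings between 0 and 2, so the noise is fully "gated off" at the modulation troughs; it is these troughs that normal-hearing listeners exploit to glimpse speech, and that magnified envelope coding is proposed to disrupt.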

  9. Automated speech understanding: the next generation

    NASA Astrophysics Data System (ADS)

    Picone, J.; Ebel, W. J.; Deshmukh, N.

    1995-04-01

    Modern speech understanding systems merge interdisciplinary technologies from Signal Processing, Pattern Recognition, Natural Language, and Linguistics into a unified statistical framework. These systems, which have applications in a wide range of signal processing problems, represent a revolution in Digital Signal Processing (DSP). Once a field dominated by vector-oriented processors and linear algebra-based mathematics, DSP now relies on sophisticated statistical models implemented within a complex software paradigm. Such systems are now capable of understanding continuous speech input for vocabularies of several thousand words in operational environments. The current generation of deployed systems, based on small vocabularies of isolated words, will soon be replaced by a new technology offering natural language access to vast information resources such as the Internet and providing completely automated voice interfaces for mundane tasks such as travel planning and directory assistance.

  10. Key considerations in designing a speech brain-computer interface.

    PubMed

    Bocquelet, Florent; Hueber, Thomas; Girin, Laurent; Chabardès, Stéphan; Yvert, Blaise

    2016-11-01

    Restoring communication in cases of aphasia is a key challenge for neurotechnologies. To this end, brain-computer strategies can be envisioned to allow artificial speech synthesis from the continuous decoding of neural signals underlying speech imagination. Such speech brain-computer interfaces do not yet exist, and their design should consider three key choices that need to be made: the choice of appropriate brain regions to record neural activity from, the choice of an appropriate recording technique, and the choice of a neural decoding scheme in association with an appropriate speech synthesis method. These key considerations are discussed here in light of (1) the current understanding of the functional neuroanatomy of cortical areas underlying overt and covert speech production, (2) the available literature making use of a variety of brain recording techniques to better characterize and address the challenge of decoding cortical speech signals, and (3) the different speech synthesis approaches that can be considered depending on the level of speech representation (phonetic, acoustic or articulatory) envisioned to be decoded at the core of a speech BCI paradigm. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.

  11. Music and Speech Perception in Children Using Sung Speech

    PubMed Central

    Nie, Yingjiu; Galvin, John J.; Morikawa, Michael; André, Victoria; Wheeler, Harley; Fu, Qian-Jie

    2018-01-01

    This study examined music and speech perception in normal-hearing children with some or no musical training. Thirty children (mean age = 11.3 years), 15 with and 15 without formal music training participated in the study. Music perception was measured using a melodic contour identification (MCI) task; stimuli were a piano sample or sung speech with a fixed timbre (same word for each note) or a mixed timbre (different words for each note). Speech perception was measured in quiet and in steady noise using a matrix-styled sentence recognition task; stimuli were naturally intonated speech or sung speech with a fixed pitch (same note for each word) or a mixed pitch (different notes for each word). Significant musician advantages were observed for MCI and speech in noise but not for speech in quiet. MCI performance was significantly poorer with the mixed timbre stimuli. Speech performance in noise was significantly poorer with the fixed or mixed pitch stimuli than with spoken speech. Across all subjects, age at testing and MCI performance were significantly correlated with speech performance in noise. MCI and speech performance in quiet was significantly poorer for children than for adults from a related study using the same stimuli and tasks; speech performance in noise was significantly poorer for young than for older children. Long-term music training appeared to benefit melodic pitch perception and speech understanding in noise in these pediatric listeners. PMID:29609496

  13. Using ILD or ITD Cues for Sound Source Localization and Speech Understanding in a Complex Listening Environment by Listeners with Bilateral and with Hearing-Preservation Cochlear Implants

    ERIC Educational Resources Information Center

    Loiselle, Louise H.; Dorman, Michael F.; Yost, William A.; Cook, Sarah J.; Gifford, Rene H.

    2016-01-01

    Purpose: To assess the role of interaural time differences and interaural level differences in (a) sound-source localization, and (b) speech understanding in a cocktail party listening environment for listeners with bilateral cochlear implants (CIs) and for listeners with hearing-preservation CIs. Methods: Eleven bilateral listeners with MED-EL…

  14. How do individuals with Asperger syndrome respond to nonliteral language and inappropriate requests in computer-mediated communication?

    PubMed

    Rajendran, Gnanathusharan; Mitchell, Peter; Rickards, Hugh

    2005-08-01

    Computer-mediated communication in individuals with Asperger syndrome, Tourette syndrome and normal controls was explored with a program called Bubble Dialogue (Gray, Creighton, McMahon, & Cunningham, 1991) in which the users type text into speech bubbles. Two scenarios, based on Happé (1994), were adapted to investigate understanding of a figure of speech and sarcasm, and a third, developed by ourselves, looked at responses to inappropriate requests (lending money and disclosing home address on a first meeting). Dialogue transcripts were assessed by 62 raters who were blind to the clinical diagnoses. Hierarchical linear modelling revealed that rated understanding of a figure of speech was predicted mainly by verbal ability and executive ability, as well as by clinical diagnosis, whereas handling inappropriate requests was predicted by age, verbal ability, executive ability and diagnosis. Notably, the Tourette comparison group showed better understanding than the Asperger group in interpreting a figure of speech and handling inappropriate requests, and differences between these groups were possibly attributable to individual differences in executive ability. In contrast, understanding sarcasm was predicted by age but not by verbal ability, executive ability or clinical diagnosis. Evidently, there is a complicated relation between Asperger syndrome, verbal ability and executive abilities with respect to communicative performance.

  15. Air traffic controllers' long-term speech-in-noise training effects: A control group study.

    PubMed

    Zaballos, Maria T P; Plasencia, Daniel P; González, María L Z; de Miguel, Angel R; Macías, Ángel R

    2016-01-01

    Speech perception in noise relies on the capacity of the auditory system to process complex sounds using sensory and cognitive skills. The possibility that these can be trained during adulthood is of special interest in auditory disorders, where speech in noise perception becomes compromised. Air traffic controllers (ATC) are constantly exposed to radio communication, a situation that seems to produce auditory learning. The objective of this study has been to quantify this effect. 19 ATC and 19 normal-hearing individuals underwent a speech in noise test with three signal to noise ratios: 5, 0 and -5 dB. Noise and speech were presented through two different loudspeakers in azimuth position. Speech tokens were presented at 65 dB SPL, while white noise files were at 60, 65 and 70 dB, respectively. Air traffic controllers outperformed the control group in all conditions (P<0.05 in ANOVA and Mann-Whitney U tests). Group differences were largest in the most difficult condition, SNR=-5 dB. However, no correlation between experience and performance was found for any of the conditions tested. The reason might be that ceiling performance is achieved much faster than the minimum experience time recorded, 5 years, although intrinsic cognitive abilities cannot be disregarded. ATC demonstrated an enhanced ability to hear speech in challenging listening environments. This study provides evidence that long-term auditory training is indeed useful in achieving better speech-in-noise understanding even in adverse conditions, although good cognitive qualities are likely to be a basic requirement for this training to be effective.
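    The three listening conditions above follow directly from the presentation levels: with both levels expressed in dB SPL, the signal-to-noise ratio is simply their difference. A minimal sketch (the function name is ours, not the study's):

```python
def snr_db(speech_db_spl: float, noise_db_spl: float) -> float:
    """SNR in dB: with both levels on the same dB SPL scale,
    the ratio of intensities reduces to a level difference."""
    return speech_db_spl - noise_db_spl

speech_level = 65                                   # dB SPL, fixed across conditions
snrs = [snr_db(speech_level, n) for n in (60, 65, 70)]
# -> [5, 0, -5] dB, the three conditions used in the study
```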

  16. Alignment of classification paradigms for communication abilities in children with cerebral palsy.

    PubMed

    Hustad, Katherine C; Oakes, Ashley; McFadd, Emily; Allison, Kristen M

    2016-06-01

    We examined three communication ability classification paradigms for children with cerebral palsy (CP): the Communication Function Classification System (CFCS), the Viking Speech Scale (VSS), and the Speech Language Profile Groups (SLPG). Questions addressed interjudge reliability, whether the VSS and the CFCS captured impairments in speech and language, and whether there were differences in speech intelligibility among levels within each classification paradigm. Eighty children (42 males, 38 females) with a range of types and severity levels of CP participated (mean age 60mo, range 50-72mo [SD 5mo]). Two speech-language pathologists classified each child via parent-child interaction samples and previous experience with the children for the CFCS and VSS, and using quantitative speech and language assessment data for the SLPG. Intelligibility scores were obtained using standard clinical intelligibility measurement. Kappa values were 0.67 (95% confidence interval [CI] 0.55-0.79) for the CFCS, 0.82 (95% CI 0.72-0.92) for the VSS, and 0.95 (95% CI 0.72-0.92) for the SLPG. Descriptively, reliability within levels of each paradigm varied, with the lowest agreement occurring within the CFCS at levels II (42%), III (40%), and IV (61%). Neither the CFCS nor the VSS were sensitive to language impairments captured by the SLPG. Significant differences in speech intelligibility were found among levels for all classification paradigms. Multiple tools are necessary to understand speech, language, and communication profiles in children with CP. Characterization of abilities at all levels of the International Classification of Functioning, Disability and Health will advance our understanding of the ways that speech, language, and communication abilities present in children with CP. © 2015 Mac Keith Press.

  17. Sensitivity to linguistic register in 20-month-olds: Understanding the register-listener relationship and its abstract rules

    PubMed Central

    Kobayashi, Tessei; Itakura, Shoji

    2018-01-01

    Linguistic register reflects changes in speech that depend on the situation, especially the status of listeners and listener-speaker relationships. Following the sociolinguistic rules of register is essential in establishing and maintaining social interactions. Recent research suggests that children over 3 years of age can understand appropriate register-listener relationships as well as the fact that people change register depending on their listeners. However, given previous findings that infants under 2 years of age have already formed both social and speech categories, it may be possible that even younger children can also understand appropriate register-listener relationships. The present study used Infant-Directed Speech (IDS) and formal Adult-Directed Speech (ADS) to examine whether 20-month-old toddlers can understand register-listener relationships. In Experiment 1, we used a violation-of-expectation method to examine whether 20-month-olds understand the individual associations between linguistic registers and listeners. Results showed that the toddlers looked significantly longer at a scene in which the adult was talked to in IDS than when the infant was talked to in IDS. In contrast, there was no difference when the adult and the infant were talked to in formal ADS. In Experiments 2 and 3, we used a habituation switch paradigm to examine whether 20-month-olds understand the abstract rule that a change of register depends on listeners rather than on speakers. Results showed that the toddlers looked significantly longer at the scene where the register rule was violated. The present findings provide new evidence that even 20-month-olds already understand that people change their way of speaking based on listeners, although their understanding of individual register-listener relationships is immature. PMID:29630608

  18. The interlanguage speech intelligibility benefit for native speakers of Mandarin: Production and perception of English word-final voicing contrasts

    PubMed Central

    Hayes-Harb, Rachel; Smith, Bruce L.; Bent, Tessa; Bradlow, Ann R.

    2009-01-01

    This study investigated the intelligibility of native and Mandarin-accented English speech for native English and native Mandarin listeners. The word-final voicing contrast was considered (as in minimal pairs such as 'cub' and 'cup') in a forced-choice word identification task. For these particular talkers and listeners, there was evidence of an interlanguage speech intelligibility benefit for listeners (i.e., native Mandarin listeners were more accurate than native English listeners at identifying Mandarin-accented English words). However, there was no evidence of an interlanguage speech intelligibility benefit for talkers (i.e., native Mandarin listeners did not find Mandarin-accented English speech more intelligible than native English speech). When listener and talker phonological proficiency (operationalized as accentedness) was taken into account, it was found that the interlanguage speech intelligibility benefit for listeners held only for the low phonological proficiency listeners and low phonological proficiency speech. The intelligibility data were also considered in relation to various temporal-acoustic properties of native English and Mandarin-accented English speech in an effort to better understand the properties of speech that may contribute to the interlanguage speech intelligibility benefit. PMID:19606271

  19. New Developments in Understanding the Complexity of Human Speech Production.

    PubMed

    Simonyan, Kristina; Ackermann, Hermann; Chang, Edward F; Greenlee, Jeremy D

    2016-11-09

    Speech is one of the most unique features of human communication. Our ability to articulate our thoughts by means of speech production depends critically on the integrity of the motor cortex. Long thought to be a low-order brain region, exciting work in recent years is overturning this notion. Here, we highlight some of the major experimental advances in speech motor control research and discuss the emerging findings about the complexity of speech motor cortical organization and its large-scale networks. This review summarizes the talks presented at a symposium at the Annual Meeting of the Society for Neuroscience; it does not represent a comprehensive review of contemporary literature in the broader field of speech motor control. Copyright © 2016 the authors 0270-6474/16/3611440-09$15.00/0.

  20. Brainstem Correlates of Speech-in-Noise Perception in Children

    PubMed Central

    Anderson, Samira; Skoe, Erika; Chandrasekaran, Bharath; Zecker, Steven; Kraus, Nina

    2010-01-01

    Children often have difficulty understanding speech in challenging listening environments. In the absence of peripheral hearing loss, these speech perception difficulties may arise from dysfunction at more central levels in the auditory system, including subcortical structures. We examined brainstem encoding of pitch in a speech syllable in 38 school-age children. In children with poor speech-in-noise perception, we find impaired encoding of the fundamental frequency and the second harmonic, two important cues for pitch perception. Pitch, an important factor in speaker identification, aids the listener in tracking a specific voice from a background of voices. These results suggest that the robustness of subcortical neural encoding of pitch features in time-varying signals is an important factor in determining success with speech perception in noise. PMID:20708671

  1. Prior exposure to a reverberant listening environment improves speech intelligibility in adult cochlear implant listeners.

    PubMed

    Srinivasan, Nirmal Kumar; Tobey, Emily A; Loizou, Philipos C

    2016-01-01

    The goal of this study is to investigate whether prior exposure to a reverberant listening environment improves speech intelligibility of adult cochlear implant (CI) users. Six adult CI users participated in this study. Speech intelligibility was measured in five different simulated reverberant listening environments with two different speech corpuses. Within each listening environment, prior exposure was varied by either having the same environment across all trials (blocked presentation) or having a different environment from trial to trial (unblocked). Speech intelligibility decreased as reverberation time increased. Although substantial individual variability was observed, all CI listeners showed higher intelligibility in the blocked presentation condition as compared to the unblocked presentation condition for both speech corpuses. Prior listening exposure to a reverberant listening environment improves speech intelligibility in adult CI listeners. Further research is required to understand the underlying mechanism of adaptation to listening environment.

  2. Factors affecting speech understanding in gated interference: Cochlear implant users and normal-hearing listeners

    NASA Astrophysics Data System (ADS)

    Nelson, Peggy B.; Jin, Su-Hyun

    2004-05-01

    Previous work [Nelson, Jin, Carney, and Nelson (2003), J. Acoust. Soc. Am. 113, 961-968] suggested that cochlear implant users do not benefit from masking release when listening in modulated noise. The previous findings indicated that implant users experience little to no release from masking when identifying sentences in speech-shaped noise, regardless of the modulation frequency applied to the noise. The lack of masking release occurred for all implant subjects, who were using three different devices and speech processing strategies. In the present study, possible causes of this reduced masking release in implant listeners were investigated. Normal-hearing listeners, implant users, and normal-hearing listeners presented with a four-band simulation of a cochlear implant were tested for their understanding of sentences in gated noise (1-32 Hz gate frequencies) when the duty cycle of the noise was varied from 25% to 75%. No systematic effect of noise duty cycle on implant and simulation listeners' performance was noted, indicating that the masking caused by gated noise is not only energetic masking. Masking release significantly increased when the number of spectral channels was increased from 4 to 12 for simulation listeners, suggesting that spectral resolution is important for masking release. Listeners were also tested for their understanding of gated sentences (sentences in quiet interrupted by periods of silence at rates ranging from 1 to 32 Hz) as a measure of auditory fusion, or the ability to integrate speech across temporal gaps. Implant and simulation listeners had significant difficulty understanding gated sentences at every gate frequency. When the number of spectral channels was increased for simulation listeners, their ability to understand gated sentences improved significantly. 
Findings suggest that implant listeners' difficulty understanding speech in modulated conditions is related to at least two (possibly related) factors: degraded spectral information and limitations in auditory fusion across temporal gaps.
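    The gated-noise manipulation described above (gate frequencies of 1-32 Hz, duty cycles of 25-75%) amounts to switching a noise on and off with a square-wave envelope. The sketch below is an illustrative reconstruction, not the authors' stimulus code; the function name and sampling rate are assumptions.

```python
import numpy as np

def gated_noise(fs, dur, gate_hz, duty, seed=0):
    """White noise switched on/off by a square-wave gate.
    `duty` is the fraction of each gate cycle during which the noise is on."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * dur)) / fs
    gate = ((t * gate_hz) % 1.0) < duty        # 1 during the on-phase, 0 otherwise
    return rng.standard_normal(t.size) * gate

fs = 16000                                     # assumed sampling rate
noise = gated_noise(fs, dur=1.0, gate_hz=8.0, duty=0.5)
on_fraction = float(np.mean(noise != 0.0))     # matches the duty cycle
```

    Gated *sentences* (the auditory-fusion measure) would apply the same square-wave gate to the speech waveform instead of to noise.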

  3. Multi-talker background and semantic priming effect

    PubMed Central

    Dekerle, Marie; Boulenger, Véronique; Hoen, Michel; Meunier, Fanny

    2014-01-01

    The reported studies have aimed to investigate whether informational masking in a multi-talker background relies on semantic interference between the background and target using an adapted semantic priming paradigm. In 3 experiments, participants were required to perform a lexical decision task on a target item embedded in backgrounds composed of 1–4 voices. These voices were Semantically Consistent (SC) voices (i.e., pronouncing words sharing semantic features with the target) or Semantically Inconsistent (SI) voices (i.e., pronouncing words semantically unrelated to each other and to the target). In the first experiment, backgrounds consisted of 1 or 2 SC voices. One and 2 SI voices were added in Experiments 2 and 3, respectively. The results showed a semantic priming effect only in the conditions where the number of SC voices was greater than the number of SI voices, suggesting that semantic priming depended on prime intelligibility and strategic processes. However, even if backgrounds were composed of 3 or 4 voices, reducing intelligibility, participants were able to recognize words from these backgrounds, although no semantic priming effect on the targets was observed. Overall this finding suggests that informational masking can occur at a semantic level if intelligibility is sufficient. Based on the Effortfulness Hypothesis, we also suggest that when there is an increased difficulty in extracting target signals (caused by a relatively high number of voices in the background), more cognitive resources were allocated to formal processes (i.e., acoustic and phonological), leading to a decrease in available resources for deeper semantic processing of background words, therefore preventing semantic priming from occurring. PMID:25400572

  4. Effects of Age and Language on Co-Speech Gesture Production: An Investigation of French, American, and Italian Children's Narratives

    ERIC Educational Resources Information Center

    Colletta, Jean-Marc; Guidetti, Michele; Capirci, Olga; Cristilli, Carla; Demir, Ozlem Ece; Kunene-Nicolas, Ramona N.; Levine, Susan

    2015-01-01

    The aim of this paper is to compare speech and co-speech gestures observed during a narrative retelling task in five- and ten-year-old children from three different linguistic groups, French, American, and Italian, in order to better understand the role of age and language in the development of multimodal monologue discourse abilities. We asked 98…

  5. The Case for Private Speech as a Mode of Self-Formation: What Its Absence Contributes to Understanding Autism

    ERIC Educational Resources Information Center

    Shopen, Roey

    2014-01-01

    Private speech is common among 3- to 7-year-olds but rare among children with autism spectrum disorders (ASDs). Thus far, this phenomenon has only been studied in narrow cognitive contexts. This article presents a case for why the phenomenon of private speech is essential for the development of self and subjectivity and for why an analysis of…

  6. Investigations in mechanisms and strategies to enhance hearing with cochlear implants

    NASA Astrophysics Data System (ADS)

    Churchill, Tyler H.

    Cochlear implants (CIs) produce hearing sensations by stimulating the auditory nerve (AN) with current pulses whose amplitudes are modulated by filtered acoustic temporal envelopes. While this technology has provided hearing for multitudinous CI recipients, even bilaterally-implanted listeners have more difficulty understanding speech in noise and localizing sounds than normal hearing (NH) listeners. Three studies reported here have explored ways to improve electric hearing abilities. Vocoders are often used to simulate CIs for NH listeners. Study 1 was a psychoacoustic vocoder study examining the effects of harmonic carrier phase dispersion and simulated CI current spread on speech intelligibility in noise. Results showed that simulated current spread was detrimental to speech understanding and that speech vocoded with carriers whose components' starting phases were equal was the least intelligible. Cross-correlogram analyses of AN model simulations confirmed that carrier component phase dispersion resulted in better neural envelope representation. Localization abilities rely on binaural processing mechanisms in the brainstem and mid-brain that are not fully understood. In Study 2, several potential mechanisms were evaluated based on the ability of metrics extracted from stereo AN simulations to predict azimuthal locations. Results suggest that unique across-frequency patterns of binaural cross-correlation may provide a strong cue set for lateralization and that interaural level differences alone cannot explain NH sensitivity to lateral position. While it is known that many bilateral CI users are sensitive to interaural time differences (ITDs) in low-rate pulsatile stimulation, most contemporary CI processing strategies use high-rate, constant-rate pulse trains. In Study 3, we examined the effects of pulse rate and pulse timing on ITD discrimination, ITD lateralization, and speech recognition by bilateral CI listeners. 
Results showed that listeners were able to use low-rate pulse timing cues presented redundantly on multiple electrodes for ITD discrimination and lateralization of speech stimuli even when mixed with high rates on other electrodes. These results have contributed to a better understanding of those aspects of the auditory system that support speech understanding and binaural hearing, suggested vocoder parameters that may simulate aspects of electric hearing, and shown that redundant, low-rate pulse timing supports improved spatial hearing for bilateral CI listeners.
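    Study 1's CI simulation relies on a noise-excited channel vocoder: the signal is split into frequency bands, each band's temporal envelope is extracted, and the envelopes modulate band-limited noise carriers. A minimal numpy-only sketch of the general technique; the channel count, band edges, and envelope smoothing window here are illustrative assumptions, not the study's parameters:

```python
import numpy as np

def noise_vocode(x, fs, n_channels=8, lo=100.0, hi=6000.0, env_win_ms=10.0):
    """Crude channel vocoder: split the signal into log-spaced bands with an
    FFT mask, take each band's rectified-and-smoothed envelope, and use it to
    modulate band-limited noise, as in cochlear-implant simulations."""
    n = len(x)
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    edges = np.geomspace(lo, hi, n_channels + 1)       # log-spaced band edges
    win = max(1, int(fs * env_win_ms / 1000.0))
    kernel = np.ones(win) / win                        # moving-average smoother
    rng = np.random.default_rng(0)
    X = np.fft.rfft(x)
    N = np.fft.rfft(rng.standard_normal(n))            # broadband noise carrier
    out = np.zeros(n)
    for f1, f2 in zip(edges[:-1], edges[1:]):
        mask = (freqs >= f1) & (freqs < f2)
        band = np.fft.irfft(X * mask, n)
        env = np.convolve(np.abs(band), kernel, mode="same")  # band envelope
        carrier = np.fft.irfft(N * mask, n)            # band-limited noise
        out += env * carrier
    return out

fs = 16000
x = np.sin(2 * np.pi * 440.0 * np.arange(fs) / fs)    # 1 s, 440 Hz tone
y = noise_vocode(x, fs)
```

    Simulated current spread, as examined in the study, would correspond to broadening or overlapping these analysis bands.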

  7. Instrumental and perceptual phonetic analyses: the case for two-tier transcriptions.

    PubMed

    Howard, Sara; Heselwood, Barry

    2011-11-01

    In this article, we discuss the relationship between instrumental and perceptual phonetic analyses. Using data drawn from typical and atypical speech production, we argue that the use of two-tier transcriptions, which can compare and contrast perceptual and instrumental information, is valuable both for our general understanding of the mechanisms of speech production and perception and also for assessment and intervention for individuals with atypical speech production. The central tenet of our case is that instrumental and perceptual analyses are not in competition to give a single 'correct' account of speech data. They take instead perspectives on complementary phonetic domains, which interlock in the speech chain to encompass production, transmission and perception.

  8. Application of artificial intelligence principles to the analysis of "crazy" speech.

    PubMed

    Garfield, D A; Rapp, C

    1994-04-01

    Artificial intelligence computer simulation methods can be used to investigate psychotic or "crazy" speech. Here, symbolic reasoning algorithms establish semantic networks that schematize speech. These semantic networks consist of two main structures: case frames and object taxonomies. Node-based reasoning rules apply to object taxonomies and pathway-based reasoning rules apply to case frames. Normal listeners may recognize speech as "crazy talk" based on violations of node- and pathway-based reasoning rules. In this article, three separate segments of schizophrenic speech illustrate violations of these rules. This artificial intelligence approach is compared and contrasted with other neurolinguistic approaches and is discussed as a conceptual link between neurobiological and psychodynamic understandings of psychopathology.

  9. A chimpanzee recognizes synthetic speech with significantly reduced acoustic cues to phonetic content.

    PubMed

    Heimbauer, Lisa A; Beran, Michael J; Owren, Michael J

    2011-07-26

    A long-standing debate concerns whether humans are specialized for speech perception, which some researchers argue is demonstrated by the ability to understand synthetic speech with significantly reduced acoustic cues to phonetic content. We tested a chimpanzee (Pan troglodytes) that recognizes 128 spoken words, asking whether she could understand such speech. Three experiments presented 48 individual words, with the animal selecting a corresponding visuographic symbol from among four alternatives. Experiment 1 tested spectrally reduced, noise-vocoded (NV) synthesis, originally developed to simulate input received by human cochlear-implant users. Experiment 2 tested "impossibly unspeechlike" sine-wave (SW) synthesis, which reduces speech to just three moving tones. Although receiving only intermittent and noncontingent reward, the chimpanzee performed well above chance level, including when hearing synthetic versions for the first time. Recognition of SW words was least accurate but improved in experiment 3 when natural words in the same session were rewarded. The chimpanzee was more accurate with NV than SW versions, as were 32 human participants hearing these items. The chimpanzee's ability to spontaneously recognize acoustically reduced synthetic words suggests that experience rather than specialization is critical for speech-perception capabilities that some have suggested are uniquely human.

  10. Changes in Voice Onset Time and Motor Speech Skills in Children following Motor Speech Therapy: Evidence from /pa/ productions

    PubMed Central

    Yu, Vickie Y.; Kadis, Darren S.; Oh, Anna; Goshulak, Debra; Namasivayam, Aravind; Pukonen, Margit; Kroll, Robert; De Nil, Luc F.; Pang, Elizabeth W.

    2016-01-01

    This study evaluated changes in motor speech control and inter-gestural coordination for children with speech sound disorders (SSD) subsequent to PROMPT (Prompts for Restructuring Oral Muscular Phonetic Targets) intervention. We measured the distribution patterns of voice onset time (VOT) for a voiceless stop (/p/) to examine the changes in inter-gestural coordination. Two standardized tests were used (VMPAC, GFTA-2) to assess the changes in motor speech skills and articulation. Data showed positive changes in patterns of VOT with a lower pattern of variability. All children showed significantly higher scores for VMPAC, but only some children showed higher scores for GFTA-2. Results suggest that the proprioceptive feedback provided through PROMPT had a positive influence on motor speech control and inter-gestural coordination in voicing behavior. This set of VOT data for children with SSD adds to our understanding of the speech characteristics underlying motor speech control. Directions for future studies are discussed. PMID:24446799

  11. The right hemisphere is highlighted in connected natural speech production and perception.

    PubMed

    Alexandrou, Anna Maria; Saarinen, Timo; Mäkelä, Sasu; Kujala, Jan; Salmelin, Riitta

    2017-05-15

    Current understanding of the cortical mechanisms of speech perception and production stems mostly from studies that focus on single words or sentences. However, it has been suggested that processing of real-life connected speech may rely on additional cortical mechanisms. In the present study, we examined the neural substrates of natural speech production and perception with magnetoencephalography by modulating three central features related to speech: amount of linguistic content, speaking rate and social relevance. The amount of linguistic content was modulated by contrasting natural speech production and perception to speech-like non-linguistic tasks. Meaningful speech was produced and perceived at three speaking rates: normal, slow and fast. Social relevance was probed by having participants attend to speech produced by themselves and an unknown person. These speech-related features were each associated with distinct spatiospectral modulation patterns that involved cortical regions in both hemispheres. Natural speech processing markedly engaged the right hemisphere in addition to the left. In particular, the right temporo-parietal junction, previously linked to attentional processes and social cognition, was highlighted in the task modulations. The present findings suggest that its functional role extends to active generation and perception of meaningful, socially relevant speech.

  12. Understanding speech in noise after correction of congenital unilateral aural atresia: effects of age in the emergence of binaural squelch but not in use of head-shadow.

    PubMed

    Gray, Lincoln; Kesser, Bradley; Cole, Erika

    2009-09-01

    Unilateral hearing loss causes difficulty hearing in noise (the "cocktail party effect") due to absence of redundancy, head-shadow, and binaural squelch. This study explores the emergence of the head-shadow and binaural squelch effects in children with unilateral congenital aural atresia undergoing surgery to correct their hearing deficit. Adding patients and data from a similar study previously published, we also evaluate covariates such as the age of the patient, surgical outcome, and complexity of the task that might predict the extent of binaural benefit--patients' ability to "use" their new ear--when understanding speech in noise. Patients with unilateral congenital aural atresia were tested for their ability to understand speech in noise before and again 1 month after surgery to repair their atresia. In a sound-attenuating booth, participants faced a speaker that produced speech signals with noise 90 degrees to the side of the normal (non-atretic) ear and again to the side of the atretic ear. The Hearing in Noise Test (HINT for adults or HINT-C for children) was used to estimate the patients' speech reception thresholds. The speech-in-noise test (SPIN) or the Pediatric Speech Intelligibility (PSI) Test was used in the previous study. There was consistent improvement, averaging 5 dB regardless of age, in the ability to take advantage of head-shadow in understanding speech with noise to the side of the non-atretic (normal) ear. There was, in contrast, a strong negative linear effect of age (r(2)=.78, selecting patients over 8 years) in the emergence of binaural squelch to understand speech with noise to the side of the atretic ear. In patients over 8 years, this trend replicated over different studies and different tests. Children less than 8 years, however, showed less improvement in the HINT-C than in the PSI after surgery with noise toward their atretic ear (effect size=3). No binaural result was correlated with degree of hearing improvement after surgery.
All patients are able to take advantage of a favorable signal-to-noise ratio in their newly opened ear; that is, with noise toward the side of the normal ear (but this physical, bilateral, head-shadow effect need not involve true central binaural processing). With noise toward the atretic ear, the emergence of binaural squelch replicates between the two studies for all but the youngest patients. Approximately 2 dB of binaural gain is lost for each decade that surgery is delayed, and zero (or poorer) binaural benefit is predicted after 38 years of age. Older adults do more poorly, possibly secondary to their long period of auditory deprivation. At the youngest ages, however, binaural results differ between open- and closed-set speech tests; the more complex hearing tasks may involve a greater cognitive load. Other cognitive abilities (late evoked potentials, grey matter in auditory cortex, and multitasking) show similar effects of age, peaking at the same late-teen/young-adult period. Longer follow-up is likely critical for the understanding of these data. Getting a new ear may be--like multitasking--challenging for the youngest and oldest subjects.
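    The reported age trend (roughly 2 dB of binaural gain lost per decade of surgical delay, crossing zero near 38 years of age) amounts to a simple linear model. A back-of-envelope sketch, purely illustrative of how the abstract's numbers fit together rather than a fitted model from the study:

```python
def predicted_binaural_gain_db(age_years, loss_per_decade_db=2.0, zero_benefit_age=38.0):
    """Linear trend implied by the abstract: each decade of delay costs about
    2 dB of binaural gain, with zero (or negative) benefit after ~38 years."""
    return loss_per_decade_db * (zero_benefit_age - age_years) / 10.0

# e.g. surgery at 18 predicts about +4 dB of squelch benefit; at 48, about -2 dB.
```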

  13. Lost in Translation: Understanding Students' Use of Social Networking and Online Resources to Support Early Clinical Practices. A National Survey of Graduate Speech-Language Pathology Students

    ERIC Educational Resources Information Center

    Boster, Jamie B.; McCarthy, John W.

    2018-01-01

    The Internet is a source of many resources for graduate speech-language pathology (SLP) students. It is important to understand the resources students are aware of, which they use, and why they are being chosen as sources of information for therapy activities. A national online survey of graduate SLP students was conducted to assess their…

  14. The MIT Summit Speech Recognition System: A Progress Report

    DTIC Science & Technology

    1989-01-01

    understanding of the human communication process. Despite recent development of some speech recognition systems with high accuracy, the performance of such...over the past four decades on human communication, in the hope that such systems will one day have a performance approaching that of humans. We are...optimize its use. Third, the system must have a stochastic component to deal with the present state of ignorance in our understanding of the human

  15. The Carolinas Speech Communication Annual, 1997.

    ERIC Educational Resources Information Center

    McKinney, Bruce C.

    1997-01-01

    This 1997 issue of "The Carolinas Speech Communication Annual" contains the following articles: "'Bridges of Understanding': UNESCO's Creation of a Fantasy for the American Public" (Michael H. Eaves and Charles F. Beadle, Jr.); "Developing a Communication Cooperative: A Student, Faculty, and Organizational Learning…

  16. DETECTION AND IDENTIFICATION OF SPEECH SOUNDS USING CORTICAL ACTIVITY PATTERNS

    PubMed Central

    Centanni, T.M.; Sloan, A.M.; Reed, A.C.; Engineer, C.T.; Rennaker, R.; Kilgard, M.P.

    2014-01-01

    We have developed a classifier capable of locating and identifying speech sounds using activity from rat auditory cortex with an accuracy equivalent to behavioral performance without the need to specify the onset time of the speech sounds. This classifier can identify speech sounds from a large speech set within 40 ms of stimulus presentation. To compare the temporal limits of the classifier to behavior, we developed a novel task that requires rats to identify individual consonant sounds from a stream of distracter consonants. The classifier successfully predicted the ability of rats to accurately identify speech sounds for syllable presentation rates up to 10 syllables per second (up to 17.9 ± 1.5 bits/sec), which is comparable to human performance. Our results demonstrate that the spatiotemporal patterns generated in primary auditory cortex can be used to quickly and accurately identify consonant sounds from a continuous speech stream without prior knowledge of the stimulus onset times. Improved understanding of the neural mechanisms that support robust speech processing in difficult listening conditions could improve the identification and treatment of a variety of speech processing disorders. PMID:24286757

  17. Brain networks engaged in audiovisual integration during speech perception revealed by persistent homology-based network filtration.

    PubMed

    Kim, Heejung; Hahm, Jarang; Lee, Hyekyoung; Kang, Eunjoo; Kang, Hyejin; Lee, Dong Soo

    2015-05-01

    The human brain naturally integrates audiovisual information to improve speech perception. However, in noisy environments, understanding speech is difficult and may require much effort. Although the brain network is supposed to be engaged in speech perception, it is unclear how speech-related brain regions are connected during natural bimodal audiovisual or unimodal speech perception with counterpart irrelevant noise. To investigate the topological changes of speech-related brain networks at all possible thresholds, we used a persistent homological framework through hierarchical clustering, such as single linkage distance, to analyze the connected components of the functional network during speech perception using functional magnetic resonance imaging. For speech perception, bimodal (audio-visual speech cue) or unimodal speech cues with counterpart irrelevant noise (auditory white-noise or visual gum-chewing) were delivered to 15 subjects. In terms of positive relationships, similar connected components were observed in the bimodal and unimodal speech conditions during filtration. However, during speech perception with congruent audiovisual stimuli, tighter couplings were observed for the left anterior temporal gyrus-anterior insula component and for right premotor-visual components than in the auditory or visual speech cue conditions, respectively. Interestingly, under white noise, visual speech was perceived through tight negative coupling among the left inferior frontal region, right anterior cingulate, left anterior insula, and bilateral visual regions, including right middle temporal gyrus and right fusiform components. In conclusion, the speech brain network is tightly positively or negatively connected, and can reflect efficient or effortful processes during natural audiovisual integration or lip-reading, respectively, in speech perception.

  18. Contribution of auditory working memory to speech understanding in mandarin-speaking cochlear implant users.

    PubMed

    Tao, Duoduo; Deng, Rui; Jiang, Ye; Galvin, John J; Fu, Qian-Jie; Chen, Bing

    2014-01-01

    To investigate how auditory working memory relates to speech perception performance by Mandarin-speaking cochlear implant (CI) users. Auditory working memory and speech perception were measured in Mandarin-speaking CI and normal-hearing (NH) participants. Working memory capacity was measured using forward digit span and backward digit span; working memory efficiency was measured using articulation rate. Speech perception was assessed with: (a) word-in-sentence recognition in quiet, (b) word-in-sentence recognition in speech-shaped steady noise at +5 dB signal-to-noise ratio, (c) Chinese disyllable recognition in quiet, (d) Chinese lexical tone recognition in quiet. Self-reported school rank was also collected regarding performance in schoolwork. There was large inter-subject variability in auditory working memory and speech performance for CI participants. Working memory and speech performance were significantly poorer for CI than for NH participants. All three working memory measures were strongly correlated with each other for both CI and NH participants. Partial correlation analyses were performed on the CI data while controlling for demographic variables. Working memory efficiency was significantly correlated only with sentence recognition in quiet when working memory capacity was partialled out. Working memory capacity was correlated with disyllable recognition and school rank when efficiency was partialled out. There was no correlation between working memory and lexical tone recognition in the present CI participants. Mandarin-speaking CI users experience significant deficits in auditory working memory and speech performance compared with NH listeners. The present data suggest that auditory working memory may contribute to CI users' difficulties in speech understanding. 
The present pattern of results with Mandarin-speaking CI users is consistent with previous auditory working memory studies with English-speaking CI users, suggesting that the lexical importance of voice pitch cues (albeit poorly coded by the CI) did not influence the relationship between working memory and speech perception.
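    The partial correlation analyses described above (relating working memory to speech scores while holding a third variable constant) reduce, in the first-order case, to a standard formula over pairwise correlations. A minimal sketch of that textbook formula, not the study's full analysis pipeline:

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z:
    the correlation left over once z's linear contribution is removed."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# If x-y correlate 0.6 but both correlate 0.5 with z, the partial
# correlation drops to (0.6 - 0.25) / 0.75, i.e. about 0.47.
```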

  19. Ageing without hearing loss or cognitive impairment causes a decrease in speech intelligibility only in informational maskers.

    PubMed

    Rajan, R; Cainer, K E

    2008-06-23

    In most everyday settings, speech is heard in the presence of competing sounds, and understanding speech requires skills in auditory streaming and segregation, followed by identification and recognition of the attended signals. Ageing leads to difficulties in understanding speech in noisy backgrounds. In addition to age-related changes in hearing-related factors, cognitive factors also play a role but it is unclear to what extent these are generalized or modality-specific cognitive factors. We examined how ageing in normal-hearing decade age cohorts from 20 to 69 years affected discrimination of open-set speech in background noise. We used two types of sentences of similar structural and linguistic characteristics but different masking levels (i.e. differences in signal-to-noise ratios required for detection of sentences in a standard masker) so as to vary sentence demand, and two background maskers (one causing purely energetic masking effects and the other causing energetic and informational masking) to vary load conditions. There was a decline in performance (measured as speech reception thresholds for perception of sentences in noise) in the oldest cohort for both types of sentences, but only in the presence of the more demanding informational masker. We interpret these results to indicate a modality-specific decline in cognitive processing, likely a decrease in the ability to use acoustic and phonetic cues efficiently to segregate speech from background noise, in subjects aged >60.
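    Speech reception thresholds like those reported here are typically estimated with an adaptive track that converges on the SNR giving 50% sentence intelligibility. A minimal 1-up/1-down sketch of that general procedure; the step size, trial count, and averaging rule are illustrative assumptions, not the specific protocol used in this study:

```python
def staircase_srt(respond, start_snr_db=0.0, step_db=2.0, n_trials=20):
    """1-up/1-down adaptive track: lower the SNR after a correct response,
    raise it after an error; the track hovers around the 50%-correct SRT."""
    snr, track = start_snr_db, []
    for _ in range(n_trials):
        correct = respond(snr)            # present a sentence at this SNR, score it
        track.append(snr)
        snr += -step_db if correct else step_db
    return sum(track[-8:]) / 8.0          # average the late, converged part of the track

# Deterministic demo listener: always correct above -6 dB SNR, so the
# track ends up oscillating around that threshold.
est = staircase_srt(lambda snr: snr > -6.0)
```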

  20. On the use of the distortion-sensitivity approach in examining the role of linguistic abilities in speech understanding in noise.

    PubMed

    Goverts, S Theo; Huysmans, Elke; Kramer, Sophia E; de Groot, Annette M B; Houtgast, Tammo

    2011-12-01

    Researchers have used the distortion-sensitivity approach in the psychoacoustical domain to investigate the role of auditory processing abilities in speech perception in noise (van Schijndel, Houtgast, & Festen, 2001; Goverts & Houtgast, 2010). In this study, the authors examined the potential applicability of the distortion-sensitivity approach for investigating the role of linguistic abilities in speech understanding in noise. The authors applied the distortion-sensitivity approach by measuring the processing of visually presented masked text in a condition with manipulated syntactic, lexical, and semantic cues and while using the Text Reception Threshold (George et al., 2007; Kramer, Zekveld, & Houtgast, 2009; Zekveld, George, Kramer, Goverts, & Houtgast, 2007) method. Two groups that differed in linguistic abilities were studied: 13 native and 10 non-native speakers of Dutch, all typically hearing university students. As expected, the non-native subjects showed substantially reduced performance. The results of the distortion-sensitivity approach yielded differentiated results on the use of specific linguistic cues in the 2 groups. The results show the potential value of the distortion-sensitivity approach in studying the role of linguistic abilities in speech understanding in noise of individuals with hearing impairment.

  1. A method for determining internal noise criteria based on practical speech communication applied to helicopters

    NASA Technical Reports Server (NTRS)

    Sternfeld, H., Jr.; Doyle, L. B.

    1978-01-01

    The relationship between the internal noise environment of helicopters and the ability of personnel to understand commands and instructions was studied. A test program was conducted to relate speech intelligibility to a standard measurement called the Articulation Index. An acoustical simulator was used to provide noise environments typical of Army helicopters. Speech materials (command sentences and phonetically balanced word lists) were presented at several voice levels in each helicopter environment. Recommended helicopter internal noise criteria, based on speech communication, were derived, and the effectiveness of hearing protection devices was evaluated.
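    The Articulation Index referenced above is, in its simplest textbook form, an importance-weighted average of per-band audibility: each frequency band's SNR is clipped to a 30 dB range and scaled to [0, 1]. A simplified sketch of that idea; the band weights here are illustrative placeholders, not a standard weighting table:

```python
import numpy as np

def articulation_index(band_snr_db, band_weights):
    """Simplified AI: clip each band's SNR to [0, 30] dB, normalize to [0, 1],
    and combine with band-importance weights (normalized to sum to 1)."""
    snr = np.clip(np.asarray(band_snr_db, dtype=float), 0.0, 30.0)
    w = np.asarray(band_weights, dtype=float)
    return float(np.sum((w / w.sum()) * (snr / 30.0)))

# Fully audible speech in every band yields AI = 1; speech fully below
# the noise floor yields AI = 0.
```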

  2. Cognitive Spare Capacity and Speech Communication: A Narrative Overview

    PubMed Central

    2014-01-01

    Background noise can make speech communication tiring and cognitively taxing, especially for individuals with hearing impairment. It is now well established that better working memory capacity is associated with better ability to understand speech under adverse conditions as well as better ability to benefit from the advanced signal processing in modern hearing aids. Recent work has shown that although such processing cannot overcome hearing handicap, it can increase cognitive spare capacity, that is, the ability to engage in higher level processing of speech. This paper surveys recent work on cognitive spare capacity and suggests new avenues of investigation. PMID:24971355

  3. Scaling and universality in the human voice.

    PubMed

    Luque, Jordi; Luque, Bartolo; Lacasa, Lucas

    2015-04-06

    Speech is a distinctive complex feature of human capabilities. In order to understand the physics underlying speech production, in this work, we empirically analyse the statistics of large human speech datasets spanning several languages. We first show that during speech, the energy is unevenly released and power-law distributed, reporting a universal robust Gutenberg-Richter-like law in speech. We further show that such 'earthquakes in speech' show temporal correlations, as the interevent statistics are again power-law distributed. As this feature takes place in the intraphoneme range, we conjecture that the process responsible for this complex phenomenon is not cognitive, but resides in the physiological (mechanical) mechanisms of speech production. Moreover, we show that these waiting time distributions are scale invariant under a renormalization group transformation, suggesting that the process of speech generation is indeed operating close to a critical point. These results are put in contrast with current paradigms in speech processing, which point towards low dimensional deterministic chaos as the origin of nonlinear traits in speech fluctuations. As these latter fluctuations are indeed the aspects that humanize synthetic speech, these findings may have an impact on future speech synthesis technologies. Results are robust and independent of the communication language or the number of speakers, pointing towards a universal pattern and yet another hint of complexity in human speech.
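    Power-law ("Gutenberg-Richter-like") exponents of the kind described above are usually estimated by maximum likelihood rather than by fitting a line to a log-log histogram. A sketch using the standard continuous MLE (the Hill estimator), shown here as a generic method and not necessarily the authors' exact procedure:

```python
import numpy as np

def hill_alpha(samples, xmin):
    """Continuous maximum-likelihood exponent for p(x) ~ x**(-alpha), x >= xmin."""
    x = np.asarray(samples, dtype=float)
    x = x[x >= xmin]
    return 1.0 + len(x) / np.sum(np.log(x / xmin))

# Sanity check on synthetic power-law data with a known exponent,
# drawn by inverse-CDF sampling.
rng = np.random.default_rng(42)
u = 1.0 - rng.random(20000)                    # uniform in (0, 1]
true_alpha, xmin = 2.5, 1.0
samples = xmin * u ** (-1.0 / (true_alpha - 1.0))
```

    For 20,000 samples the estimate should land very close to the true exponent of 2.5.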

  4. Infants’ brain responses to speech suggest Analysis by Synthesis

    PubMed Central

    Kuhl, Patricia K.; Ramírez, Rey R.; Bosseler, Alexis; Lin, Jo-Fu Lotus; Imada, Toshiaki

    2014-01-01

    Historic theories of speech perception (Motor Theory and Analysis by Synthesis) invoked listeners’ knowledge of speech production to explain speech perception. Neuroimaging data show that adult listeners activate motor brain areas during speech perception. In two experiments using magnetoencephalography (MEG), we investigated motor brain activation, as well as auditory brain activation, during discrimination of native and nonnative syllables in infants at two ages that straddle the developmental transition from language-universal to language-specific speech perception. Adults are also tested in Exp. 1. MEG data revealed that 7-mo-old infants activate auditory (superior temporal) as well as motor brain areas (Broca’s area, cerebellum) in response to speech, and equivalently for native and nonnative syllables. However, in 11- and 12-mo-old infants, native speech activates auditory brain areas to a greater degree than nonnative, whereas nonnative speech activates motor brain areas to a greater degree than native speech. This double dissociation in 11- to 12-mo-old infants matches the pattern of results obtained in adult listeners. Our infant data are consistent with Analysis by Synthesis: auditory analysis of speech is coupled with synthesis of the motor plans necessary to produce the speech signal. The findings have implications for: (i) perception-action theories of speech perception, (ii) the impact of “motherese” on early language learning, and (iii) the “social-gating” hypothesis and humans’ development of social understanding. PMID:25024207

  5. Infants' brain responses to speech suggest analysis by synthesis.

    PubMed

    Kuhl, Patricia K; Ramírez, Rey R; Bosseler, Alexis; Lin, Jo-Fu Lotus; Imada, Toshiaki

    2014-08-05

    Historic theories of speech perception (Motor Theory and Analysis by Synthesis) invoked listeners' knowledge of speech production to explain speech perception. Neuroimaging data show that adult listeners activate motor brain areas during speech perception. In two experiments using magnetoencephalography (MEG), we investigated motor brain activation, as well as auditory brain activation, during discrimination of native and nonnative syllables in infants at two ages that straddle the developmental transition from language-universal to language-specific speech perception. Adults are also tested in Exp. 1. MEG data revealed that 7-mo-old infants activate auditory (superior temporal) as well as motor brain areas (Broca's area, cerebellum) in response to speech, and equivalently for native and nonnative syllables. However, in 11- and 12-mo-old infants, native speech activates auditory brain areas to a greater degree than nonnative, whereas nonnative speech activates motor brain areas to a greater degree than native speech. This double dissociation in 11- to 12-mo-old infants matches the pattern of results obtained in adult listeners. Our infant data are consistent with Analysis by Synthesis: auditory analysis of speech is coupled with synthesis of the motor plans necessary to produce the speech signal. The findings have implications for: (i) perception-action theories of speech perception, (ii) the impact of "motherese" on early language learning, and (iii) the "social-gating" hypothesis and humans' development of social understanding.

  6. 45 CFR 1308.9 - Eligibility criteria: Speech or language impairments.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... HUMAN DEVELOPMENT SERVICES, DEPARTMENT OF HEALTH AND HUMAN SERVICES THE ADMINISTRATION FOR CHILDREN... language impairments. (a) A speech or language impairment means a communication disorder such as stuttering... language disorder may be characterized by difficulty in understanding and producing language, including...

  7. Speech-Language Dissociations, Distractibility, and Childhood Stuttering

    PubMed Central

    Conture, Edward G.; Walden, Tedra A.; Lambert, Warren E.

    2015-01-01

    Purpose This study investigated the relation among speech-language dissociations, attentional distractibility, and childhood stuttering. Method Participants were 82 preschool-age children who stutter (CWS) and 120 who do not stutter (CWNS). Correlation-based statistics (Bates, Appelbaum, Salcedo, Saygin, & Pizzamiglio, 2003) identified dissociations across 5 norm-based speech-language subtests. The Behavioral Style Questionnaire Distractibility subscale measured attentional distractibility. Analyses addressed (a) between-groups differences in the number of children exhibiting speech-language dissociations; (b) between-groups distractibility differences; (c) the relation between distractibility and speech-language dissociations; and (d) whether interactions between distractibility and dissociations predicted the frequency of total, stuttered, and nonstuttered disfluencies. Results More preschool-age CWS exhibited speech-language dissociations compared with CWNS, and more boys exhibited dissociations compared with girls. In addition, male CWS were less distractible than female CWS and female CWNS. For CWS, but not CWNS, less distractibility (i.e., greater attention) was associated with more speech-language dissociations. Last, interactions between distractibility and dissociations did not predict speech disfluencies in CWS or CWNS. Conclusions The present findings suggest that for preschool-age CWS, attentional processes are associated with speech-language dissociations. Future investigations are warranted to better understand the directionality of effect of this association (e.g., inefficient attentional processes → speech-language dissociations vs. inefficient attentional processes ← speech-language dissociations). PMID:26126203

  8. System Integration and Control in a Speech Understanding System

    DTIC Science & Technology

    1975-09-01

aware that notwithstanding any other provision of law, no person shall be subject to a penalty for failing to comply with a collection of information if... visible in a way that allows the... The second part of the paper describes an executive that uses information from these knowledge sources in its control... strategy. System Integration and Control, Page 1: A speech understanding system must use many kinds of knowledge, each playing a particular role during

  9. Hearing Status in Pediatric Renal Transplant Recipients.

    PubMed

    Gulleroglu, Kaan; Baskin, Esra; Aydin, Erdinc; Ozluoglu, Levent; Moray, Gokhan; Haberal, Mehmet

    2015-08-01

Renal transplantation provides long-term survival. Hearing impairment is a major factor in subjective health status. The status of hearing and the causes of hearing impairment in pediatric renal transplant recipients have not been evaluated. Here, we evaluated hearing status in pediatric renal transplant patients and determined the factors that cause hearing impairment. Twenty-seven pediatric renal transplant recipients were investigated. All patients underwent audiologic assessment by means of pure-tone audiometry. Factors associated with hearing impairment were analyzed. Sensorineural hearing impairment was found in 17 patients, with marked impairment at the higher frequencies between 4000 and 8000 Hz. Sudden hearing loss developed in 2 patients, 1 of whom had tinnitus. Decreased speech understanding was found in 8 patients. The cyclosporine level was significantly higher in patients with hearing impairment than in patients without hearing impairment. Cyclosporine levels were also significantly higher in the group with decreased speech understanding than in the group without. Similar relations were not found between tacrolimus levels and hearing impairment or speech understanding. The prevalence of sensorineural hearing impairment was high in pediatric renal transplant recipients compared with the general population of children. Cyclosporine may be responsible for hearing impairment after renal transplant; we suggest that this effect is a dose-dependent toxicity.

  10. Using ILD or ITD Cues for Sound Source Localization and Speech Understanding in a Complex Listening Environment by Listeners With Bilateral and With Hearing-Preservation Cochlear Implants.

    PubMed

    Loiselle, Louise H; Dorman, Michael F; Yost, William A; Cook, Sarah J; Gifford, Rene H

    2016-08-01

To assess the role of interaural time differences (ITDs) and interaural level differences (ILDs) in (a) sound-source localization and (b) speech understanding in a cocktail-party listening environment for listeners with bilateral cochlear implants (CIs) and for listeners with hearing-preservation CIs. Eleven bilateral listeners with MED-EL (Durham, NC) CIs and 8 listeners with hearing-preservation CIs with symmetrical low-frequency acoustic hearing using the MED-EL or Cochlear device were evaluated using 2 tests designed to task binaural hearing: sound-source localization and a simulated cocktail party. Access to interaural cues for localization was constrained by the use of low-pass, high-pass, and wideband noise stimuli. Sound-source localization accuracy for listeners with bilateral CIs in response to the high-pass noise stimulus and for listeners with hearing-preservation CIs in response to the low-pass noise stimulus did not differ significantly. Speech understanding in the cocktail-party listening environment improved for all listeners when interaural cues, either ITD or ILD, were available. The findings of the current study indicate that similar degrees of benefit to sound-source localization and speech understanding in complex listening environments are possible with 2 very different rehabilitation strategies: the provision of bilateral CIs and the preservation of hearing.

  11. A hardware experimental platform for neural circuits in the auditory cortex

    NASA Astrophysics Data System (ADS)

    Rodellar-Biarge, Victoria; García-Dominguez, Pablo; Ruiz-Rizaldos, Yago; Gómez-Vilda, Pedro

    2011-05-01

Speech processing in the human brain is a very complex process that is far from being fully understood, although much progress has been made recently. Neuromorphic Speech Processing is a new research orientation in the bio-inspired systems approach, seeking solutions to the automatic treatment of specific problems (recognition, synthesis, segmentation, diarization, etc.) that cannot be adequately solved using classical algorithms. In this paper a neuromorphic speech processing architecture is presented. The systematic bottom-up synthesis of layered structures reproduces the dynamic feature detection of speech related to plausible neural circuits that work as interpretation centres located in the Auditory Cortex. The elementary model is based on Hebbian neuron-like units. For the computation of the architecture a flexible framework is proposed in the environment of Matlab®/Simulink®/HDL, which allows building models at different description styles, complexity, and implementation levels. It provides a flexible platform for experimenting with the influence of the number of neurons and interconnections on the precision of the results and on performance evaluation. Experimentation with different architecture configurations may help both in better understanding how neural circuits may work in the brain and in showing how speech processing can benefit from this understanding.

  12. Audiovisual sentence recognition not predicted by susceptibility to the McGurk effect.

    PubMed

    Van Engen, Kristin J; Xie, Zilong; Chandrasekaran, Bharath

    2017-02-01

    In noisy situations, visual information plays a critical role in the success of speech communication: listeners are better able to understand speech when they can see the speaker. Visual influence on auditory speech perception is also observed in the McGurk effect, in which discrepant visual information alters listeners' auditory perception of a spoken syllable. When hearing /ba/ while seeing a person saying /ga/, for example, listeners may report hearing /da/. Because these two phenomena have been assumed to arise from a common integration mechanism, the McGurk effect has often been used as a measure of audiovisual integration in speech perception. In this study, we test whether this assumed relationship exists within individual listeners. We measured participants' susceptibility to the McGurk illusion as well as their ability to identify sentences in noise across a range of signal-to-noise ratios in audio-only and audiovisual modalities. Our results do not show a relationship between listeners' McGurk susceptibility and their ability to use visual cues to understand spoken sentences in noise, suggesting that McGurk susceptibility may not be a valid measure of audiovisual integration in everyday speech processing.

  13. Who Decides What Is Acceptable Speech on Campus? Why Restricting Free Speech Is Not the Answer.

    PubMed

    Ceci, Stephen J; Williams, Wendy M

    2018-05-01

    Recent protests on dozens of campuses have led to the cancellation of controversial talks, and violence has accompanied several of these protests. Psychological science provides an important lens through which to view, understand, and potentially reduce these conflicts. In this article, we frame opposing sides' arguments within a long-standing corpus of psychological research on selective perception, confirmation bias, myside bias, illusion of understanding, blind-spot bias, groupthink/in-group bias, motivated skepticism, and naive realism. These concepts inform dueling claims: (a) the protestors' violence was justified by a higher moral responsibility to prevent marginalized groups from being victimized by hate speech, versus (b) the students' right to hear speakers was infringed upon. Psychological science cannot, however, be the sole arbiter of these campus debates; legal and philosophical considerations are also relevant. Thus, we augment psychological science with insights from these literatures to shed light on complexities associated with positions supporting free speech and those protesting hate speech. We conclude with a set of principles, most supported by empirical research, to inform university policies and help ensure vigorous freedom of expression within the context of an inclusive, diverse community.

  14. Relationship between perceptual learning in speech and statistical learning in younger and older adults

    PubMed Central

    Neger, Thordis M.; Rietveld, Toni; Janse, Esther

    2014-01-01

    Within a few sentences, listeners learn to understand severely degraded speech such as noise-vocoded speech. However, individuals vary in the amount of such perceptual learning and it is unclear what underlies these differences. The present study investigates whether perceptual learning in speech relates to statistical learning, as sensitivity to probabilistic information may aid identification of relevant cues in novel speech input. If statistical learning and perceptual learning (partly) draw on the same general mechanisms, then statistical learning in a non-auditory modality using non-linguistic sequences should predict adaptation to degraded speech. In the present study, 73 older adults (aged over 60 years) and 60 younger adults (aged between 18 and 30 years) performed a visual artificial grammar learning task and were presented with 60 meaningful noise-vocoded sentences in an auditory recall task. Within age groups, sentence recognition performance over exposure was analyzed as a function of statistical learning performance, and other variables that may predict learning (i.e., hearing, vocabulary, attention switching control, working memory, and processing speed). Younger and older adults showed similar amounts of perceptual learning, but only younger adults showed significant statistical learning. In older adults, improvement in understanding noise-vocoded speech was constrained by age. In younger adults, amount of adaptation was associated with lexical knowledge and with statistical learning ability. Thus, individual differences in general cognitive abilities explain listeners' variability in adapting to noise-vocoded speech. Results suggest that perceptual and statistical learning share mechanisms of implicit regularity detection, but that the ability to detect statistical regularities is impaired in older adults if visual sequences are presented quickly. PMID:25225475
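The analysis described in this record, predicting adaptation to degraded speech from statistical learning performance and other listener variables, amounts to a multiple regression. A minimal sketch of such an analysis is below; the variable names and the synthetic data are illustrative assumptions, not the study's data.

```python
import numpy as np

# Hypothetical per-listener measures (illustrative, not the study's data):
# adaptation = gain in noise-vocoded sentence recognition over exposure
# agl_score  = artificial grammar (statistical) learning accuracy
# vocabulary = lexical knowledge score
rng = np.random.default_rng(1)
n = 60
agl_score = rng.normal(0.6, 0.1, n)
vocabulary = rng.normal(50.0, 8.0, n)
adaptation = 0.5 * agl_score + 0.01 * vocabulary + rng.normal(0.0, 0.05, n)

# Ordinary least squares: adaptation ~ intercept + agl_score + vocabulary
X = np.column_stack([np.ones(n), agl_score, vocabulary])
beta, *_ = np.linalg.lstsq(X, adaptation, rcond=None)

# Proportion of variance in adaptation explained by the predictors
predicted = X @ beta
r2 = 1.0 - np.sum((adaptation - predicted) ** 2) / np.sum(
    (adaptation - adaptation.mean()) ** 2)
print(beta, r2)
```

A positive coefficient on `agl_score` in such a model would correspond to the reported association between statistical learning ability and amount of adaptation in younger adults.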

  15. Noise and communication: a three-year update.

    PubMed

    Brammer, Anthony J; Laroche, Chantal

    2012-01-01

Noise is omnipresent and impacts us all in many aspects of daily living. Noise can interfere with communication not only in industrial workplaces, but also in other work settings (e.g. open-plan offices, construction, and mining) and within buildings (e.g. residences, arenas, and schools). The interference of noise with communication can have significant social consequences, especially for persons with hearing loss, and may compromise safety (e.g. failure to perceive auditory warning signals), influence worker productivity and learning in children, affect health (e.g. vocal pathology, noise-induced hearing loss), compromise speech privacy, and impact social participation by the elderly. For workers, attempts have been made to: 1) better define the auditory performance needed to function effectively and directly measure these abilities when assessing Auditory Fitness for Duty, 2) design hearing protection devices that can improve speech understanding while offering adequate protection against loud noises, and 3) improve speech privacy in open-plan offices. As the elderly are particularly vulnerable to the effects of noise, an understanding of the interplay between auditory, cognitive, and social factors and its effect on speech communication and social participation is also critical. Classroom acoustics and speech intelligibility in children have also gained renewed interest because of the importance of effective speech comprehension in noise on learning. Finally, substantial progress has been made in developing models aimed at better predicting speech intelligibility. Despite progress in various fields, the design of alarm signals continues to lag behind advancements in knowledge. This summary of the last three years' research highlights some of the most recent issues for the workplace, for older adults, and for children, as well as the effectiveness of warning sounds and models for predicting speech intelligibility. Suggestions for future work are also discussed.

  16. Relationship between perceptual learning in speech and statistical learning in younger and older adults.

    PubMed

    Neger, Thordis M; Rietveld, Toni; Janse, Esther

    2014-01-01

    Within a few sentences, listeners learn to understand severely degraded speech such as noise-vocoded speech. However, individuals vary in the amount of such perceptual learning and it is unclear what underlies these differences. The present study investigates whether perceptual learning in speech relates to statistical learning, as sensitivity to probabilistic information may aid identification of relevant cues in novel speech input. If statistical learning and perceptual learning (partly) draw on the same general mechanisms, then statistical learning in a non-auditory modality using non-linguistic sequences should predict adaptation to degraded speech. In the present study, 73 older adults (aged over 60 years) and 60 younger adults (aged between 18 and 30 years) performed a visual artificial grammar learning task and were presented with 60 meaningful noise-vocoded sentences in an auditory recall task. Within age groups, sentence recognition performance over exposure was analyzed as a function of statistical learning performance, and other variables that may predict learning (i.e., hearing, vocabulary, attention switching control, working memory, and processing speed). Younger and older adults showed similar amounts of perceptual learning, but only younger adults showed significant statistical learning. In older adults, improvement in understanding noise-vocoded speech was constrained by age. In younger adults, amount of adaptation was associated with lexical knowledge and with statistical learning ability. Thus, individual differences in general cognitive abilities explain listeners' variability in adapting to noise-vocoded speech. Results suggest that perceptual and statistical learning share mechanisms of implicit regularity detection, but that the ability to detect statistical regularities is impaired in older adults if visual sequences are presented quickly.

  17. Perception of intelligibility and qualities of non-native accented speakers.

    PubMed

    Fuse, Akiko; Navichkova, Yuliya; Alloggio, Krysteena

To provide effective treatment to clients, speech-language pathologists must be understood, and be perceived to demonstrate the personal qualities necessary for therapeutic practice (e.g., resourcefulness and empathy). One factor that could interfere with the listener's perception of non-native speech is the speaker's accent. The current study explored the relationship between how accurately listeners could understand non-native speech and their perceptions of personal attributes of the speaker. Additionally, this study investigated how listeners' familiarity and experience with other languages may influence their perceptions of non-native accented speech. Through an online survey, native monolingual and bilingual English listeners rated four non-native accents (i.e., Spanish, Chinese, Russian, and Indian) on perceived intelligibility and perceived personal qualities (i.e., professionalism, intelligence, resourcefulness, empathy, and patience) necessary for speech-language pathologists. The results indicated significant relationships between the perception of intelligibility and the perception of personal qualities (i.e., professionalism, intelligence, and resourcefulness) attributed to non-native speakers. However, these findings were not supported for the Chinese accent. Bilingual listeners judged the non-native speech as more intelligible than monolingual listeners did. No significant differences were found in the ratings between bilingual listeners who shared the same language background as the speaker and other bilingual listeners. Based on the current findings, greater perceived intelligibility was the key to promoting a positive perception of personal qualities such as professionalism, intelligence, and resourcefulness, which are important for speech-language pathologists. The current study found evidence to support the claim that bilinguals have a greater ability to understand non-native accented speech than monolingual listeners. The results, however, did not confirm an advantage for bilingual listeners sharing the same language background with the non-native speaker over other bilingual listeners.

  18. The Struggle with Hate Speech. Teaching Strategy.

    ERIC Educational Resources Information Center

    Bloom, Jennifer

    1995-01-01

    Discusses the issue of hate-motivated violence and special laws aimed at deterrence. Presents a secondary school lesson to help students define hate speech and understand constitutional issues related to the topic. Includes three student handouts, student learning objectives, instructional procedures, and a discussion guide. (CFR)

  19. Understanding and Helping Children Who Do Not Talk in School.

    ERIC Educational Resources Information Center

    Gemelli, Ralph J.

    1983-01-01

    The role of speech in child development is examined, reasons for lack of speech in school are suggested (including anxiety, shock, overprotection, abuse, or anger), and four recommendations for teacher action are offered (including having empathy and encouraging other means of communication). (CL)

  20. Generalization of Perceptual Learning of Vocoded Speech

    ERIC Educational Resources Information Center

    Hervais-Adelman, Alexis G.; Davis, Matthew H.; Johnsrude, Ingrid S.; Taylor, Karen J.; Carlyon, Robert P.

    2011-01-01

    Recent work demonstrates that learning to understand noise-vocoded (NV) speech alters sublexical perceptual processes but is enhanced by the simultaneous provision of higher-level, phonological, but not lexical content (Hervais-Adelman, Davis, Johnsrude, & Carlyon, 2008), consistent with top-down learning (Davis, Johnsrude, Hervais-Adelman,…

  1. Talker-specific learning in amnesia: Insight into mechanisms of adaptive speech perception

    PubMed Central

    Trude, Alison M.; Duff, Melissa C.; Brown-Schmidt, Sarah

    2014-01-01

A hallmark of human speech perception is the ability to comprehend speech quickly and effortlessly despite enormous variability across talkers. However, current theories of speech perception do not make specific claims about the memory mechanisms involved in this process. To examine whether declarative memory is necessary for talker-specific learning, we tested the ability of amnesic patients with severe declarative memory deficits to learn and distinguish the accents of two unfamiliar talkers by monitoring their eye-gaze as they followed spoken instructions. Analyses of the time-course of eye fixations showed that amnesic patients rapidly learned to distinguish these accents and tailored perceptual processes to the voice of each talker. These results demonstrate that declarative memory is not necessary for this ability and point to the involvement of non-declarative memory mechanisms. They are consistent with findings that other social and accommodative behaviors are preserved in amnesia and contribute to our understanding of the interactions of multiple memory systems in the use and understanding of spoken language. PMID:24657480

  2. Linguistic and pragmatic constraints on utterance interpretation

    NASA Astrophysics Data System (ADS)

    Hinkelman, Elizabeth A.

    1990-05-01

In order to model how people understand language, it is necessary to understand not only grammar and logic but also how people use language to affect their environment. This area of study is known as natural language pragmatics. Speech acts, for instance, are the offers, promises, announcements, etc., that people make by talking. The same expression may be different acts in different contexts, and yet not every expression performs every act. We want to understand how people are able to recognize others' intentions and implications in saying something. Previous plan-based theories of speech act interpretation do not account for the conventional aspect of speech acts. They can, however, be made sensitive to both linguistic and propositional information. This dissertation presents a method of speech act interpretation which uses patterns of linguistic features (e.g., mood, verb form, sentence adverbials, thematic roles) to identify a range of speech act interpretations for the utterance. These are then filtered and elaborated by inferences about agents' goals and plans. In many cases the plan reasoning consists of short, local inference chains (that are in fact conversational implicatures), and extended reasoning is necessary only for the most difficult cases. The method is able to accommodate a wide range of cases, from those which seem very idiomatic to those which must be analyzed using knowledge about the world and human behavior. It explains how "Can you pass the salt" can be a request while "Are you able to pass the salt" is not.

  3. Is the Sensorimotor Cortex Relevant for Speech Perception and Understanding? An Integrative Review

    PubMed Central

    Schomers, Malte R.; Pulvermüller, Friedemann

    2016-01-01

In the neuroscience of language, phonemes are frequently described as multimodal units whose neuronal representations are distributed across perisylvian cortical regions, including auditory and sensorimotor areas. A different position views phonemes primarily as acoustic entities with posterior temporal localization, which are functionally independent from frontoparietal articulatory programs. To address this current controversy, we here discuss experimental results from functional magnetic resonance imaging (fMRI) as well as transcranial magnetic stimulation (TMS) studies. At first glance, a mixed picture emerges, with earlier research documenting neurofunctional distinctions between phonemes in both temporal and frontoparietal sensorimotor systems, but some recent work seemingly failing to replicate the latter. Detailed analysis of methodological differences between studies reveals that the way experiments are set up explains whether sensorimotor cortex maps phonological information during speech perception or not. In particular, acoustic noise during the experiment and ‘motor noise’ caused by button press tasks work against the frontoparietal manifestation of phonemes. We highlight recent studies using sparse imaging and passive speech perception tasks along with multivariate pattern analysis (MVPA) and especially representational similarity analysis (RSA), which succeeded in separating acoustic-phonological from general-acoustic processes and in mapping specific phonological information on temporal and frontoparietal regions. The question about a causal role of sensorimotor cortex in speech perception and understanding is addressed by reviewing recent TMS studies. We conclude that frontoparietal cortices, including ventral motor and somatosensory areas, reflect phonological information during speech perception and exert a causal influence on language understanding. PMID:27708566

  4. Thinking outside the (voice) box: a case study of students' perceptions of the relevance of anatomy to speech pathology.

    PubMed

    Weir, Kristy A

    2008-01-01

    Speech pathology students readily identify the importance of a sound understanding of anatomical structures central to their intended profession. In contrast, they often do not recognize the relevance of a broader understanding of structure and function. This study aimed to explore students' perceptions of the relevance of anatomy to speech pathology. The effect of two learning activities on students' perceptions was also evaluated. First, a written assignment required students to illustrate the relevance of anatomy to speech pathology by using an example selected from one of the four alternative structures. The second approach was the introduction of brief "scenarios" with directed questions into the practical class. The effects of these activities were assessed via two surveys designed to evaluate students' perceptions of the relevance of anatomy before and during the course experience. A focus group was conducted to clarify and extend discussion of issues arising from the survey data. The results showed that the students perceived some course material as irrelevant to speech pathology. The importance of relevance to the students' "state" motivation was well supported by the data. Although the students believed that the learning activities helped their understanding of the relevance of anatomy, some structures were considered less relevant at the end of the course. It is likely that the perceived amount of content and surface approach to learning may have prevented students from "thinking outside the box" regarding which anatomical structures are relevant to the profession.

  5. [Communication and noise. Speech intelligibility of airplane pilots with and without active noise compensation].

    PubMed

    Matschke, R G

    1994-08-01

Noise exposure measurements were performed with pilots of the German Federal Navy during flight situations. Ambient noise levels during regular flight remained above 90 dB(A). This noise intensity requires wearing ear protection to avoid sound-induced hearing loss. To understand radio communication (ATC) in spite of a noisy environment, headphone volume must be raised above the noise of the engines. The use of ear plugs in addition to the headsets and flight helmets is of only limited value because personal ear protection affects the intelligibility of ATC. Whereas speech intelligibility of pilots with normal hearing is affected only to a smaller degree, pilots with pre-existing high-frequency hearing losses show substantial impairments of speech intelligibility that vary in proportion to the hearing deficit present. Communication abilities can be reduced drastically, which in turn can affect air traffic security. The development of active noise compensation (ANC) devices that make use of the "anti-noise" principle may be a solution to this dilemma. To evaluate the effectiveness of an ANC system and its influence on speech intelligibility, speech audiometry was performed with a standardized German test during simulated flight conditions with helicopter pilots. Results demonstrate a helpful effect on speech understanding, especially for pilots with noise-induced hearing losses. This may help to avoid pre-retirement professional disability.

  6. Primate vocal communication: a useful tool for understanding human speech and language evolution?

    PubMed

    Fedurek, Pawel; Slocombe, Katie E

    2011-04-01

    Language is a uniquely human trait, and questions of how and why it evolved have been intriguing scientists for years. Nonhuman primates (primates) are our closest living relatives, and their behavior can be used to estimate the capacities of our extinct ancestors. As humans and many primate species rely on vocalizations as their primary mode of communication, the vocal behavior of primates has been an obvious target for studies investigating the evolutionary roots of human speech and language. By studying the similarities and differences between human and primate vocalizations, comparative research has the potential to clarify the evolutionary processes that shaped human speech and language. This review examines some of the seminal and recent studies that contribute to our knowledge regarding the link between primate calls and human language and speech. We focus on three main aspects of primate vocal behavior: functional reference, call combinations, and vocal learning. Studies in these areas indicate that despite important differences, primate vocal communication exhibits some key features characterizing human language. They also indicate, however, that some critical aspects of speech, such as vocal plasticity, are not shared with our primate cousins. We conclude that comparative research on primate vocal behavior is a very promising tool for deepening our understanding of the evolution of human speech and language, but much is still to be done as many aspects of monkey and ape vocalizations remain largely unexplored.

  7. Perceptual weighting of the envelope and fine structure across frequency bands for sentence intelligibility: Effect of interruption at the syllabic-rate and periodic-rate of speech

    PubMed Central

    Fogerty, Daniel

    2011-01-01

    Listeners often only have fragments of speech available to understand the intended message due to competing background noise. In order to maximize successful speech recognition, listeners must allocate their perceptual resources to the most informative acoustic properties. The speech signal contains temporally-varying acoustics in the envelope and fine structure that are present across the frequency spectrum. Understanding how listeners perceptually weigh these acoustic properties in different frequency regions during interrupted speech is essential for the design of assistive listening devices. This study measured the perceptual weighting of young normal-hearing listeners for the envelope and fine structure in each of three frequency bands for interrupted sentence materials. Perceptual weights were obtained during interruption at the syllabic rate (i.e., 4 Hz) and the periodic rate (i.e., 128 Hz) of speech. Potential interruption interactions with fundamental frequency information were investigated by shifting the natural pitch contour higher relative to the interruption rate. The availability of each acoustic property was varied independently by adding noise at different levels. Perceptual weights were determined by correlating a listener’s performance with the availability of each acoustic property on a trial-by-trial basis. Results demonstrated similar relative weights across the interruption conditions, with emphasis on the envelope in high-frequencies. PMID:21786914
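The weighting analysis this record describes, correlating a listener's trial-by-trial performance with the availability of each acoustic property, can be sketched as a simple correlational estimate. The sketch below uses synthetic data and illustrative names (`availability`, `correct`); it is a minimal illustration of the general technique, not the study's actual procedure.

```python
import numpy as np

def perceptual_weights(availability, correct):
    """Estimate relative perceptual weights by correlating trial-by-trial
    correctness with the availability of each acoustic property.

    availability : (n_trials, n_properties) array, e.g. per-trial signal level
                   of envelope/fine structure in each frequency band
    correct      : (n_trials,) array of 0/1 recognition scores
    """
    availability = np.asarray(availability, dtype=float)
    correct = np.asarray(correct, dtype=float)
    # Point-biserial correlation of each property's availability with performance
    r = np.array([np.corrcoef(availability[:, j], correct)[0, 1]
                  for j in range(availability.shape[1])])
    # Normalize so the weights sum to 1, giving relative weights
    return r / r.sum()

# Illustrative synthetic data: performance driven mostly by property 0
rng = np.random.default_rng(0)
avail = rng.uniform(0.0, 1.0, size=(500, 3))
p_correct = 0.2 + 0.6 * avail[:, 0] + 0.15 * avail[:, 1]
correct = rng.uniform(size=500) < p_correct
w = perceptual_weights(avail, correct)
print(w)  # largest relative weight on property 0
```

Normalizing the correlations to relative weights makes conditions comparable even when overall performance differs, which is one common way such weights are reported.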

  8. How our own speech rate influences our perception of others.

    PubMed

    Bosker, Hans Rutger

    2017-08-01

In conversation, our own speech and that of others follow each other in rapid succession. Effects of the surrounding context on speech perception are well documented but, despite the ubiquity of the sound of our own voice, it is unknown whether our own speech also influences our perception of other talkers. This study investigated context effects induced by our own speech through 6 experiments, specifically targeting rate normalization (i.e., perceiving phonetic segments relative to surrounding speech rate). Experiment 1 revealed that hearing prerecorded fast or slow context sentences altered the perception of ambiguous vowels, replicating earlier work. Experiment 2 demonstrated that talking at a fast or slow rate prior to target presentation also altered target perception, though the effect of preceding speech rate was reduced. Experiment 3 showed that silent talking (i.e., inner speech) at fast or slow rates did not modulate the perception of others, suggesting that the effect of self-produced speech rate in Experiment 2 arose through monitoring of the external speech signal. Experiment 4 demonstrated that, when participants were played back their own (fast/slow) speech, no reduction of the effect of preceding speech rate was observed, suggesting that the additional task of speech production may be responsible for the reduced effect in Experiment 2. Finally, Experiments 5 and 6 replicate Experiments 2 and 3 with new participant samples. Taken together, these results suggest that variation in speech production may induce variation in speech perception, thus carrying implications for our understanding of spoken communication in dialogue settings.

  9. Musicians and non-musicians are equally adept at perceiving masked speech

    PubMed Central

    Boebinger, Dana; Evans, Samuel; Scott, Sophie K.; Rosen, Stuart; Lima, César F.; Manly, Tom

    2015-01-01

    There is much interest in the idea that musicians perform better than non-musicians in understanding speech in background noise. Research in this area has often used energetic maskers, which have their effects primarily at the auditory periphery. However, masking interference can also occur at more central auditory levels, known as informational masking. This experiment extends existing research by using multiple maskers that vary in their informational content and similarity to speech, in order to examine differences in perception of masked speech between trained musicians (n = 25) and non-musicians (n = 25). Although musicians outperformed non-musicians on a measure of frequency discrimination, they showed no advantage in perceiving masked speech. Further analysis revealed that nonverbal IQ, rather than musicianship, significantly predicted speech reception thresholds in noise. The results strongly suggest that the contribution of general cognitive abilities needs to be taken into account in any investigations of individual variability for perceiving speech in noise. PMID:25618067

  10. Effects of social cognitive impairment on speech disorder in schizophrenia.

    PubMed

    Docherty, Nancy M; McCleery, Amanda; Divilbiss, Marielle; Schumann, Emily B; Moe, Aubrey; Shakeel, Mohammed K

    2013-05-01

    Disordered speech in schizophrenia impairs social functioning because it impedes communication with others. Treatment approaches targeting this symptom have been limited by an incomplete understanding of its causes. This study examined the process underpinnings of speech disorder, assessed in terms of communication failure. Contributions of impairments in 2 social cognitive abilities, emotion perception and theory of mind (ToM), to speech disorder were assessed in 63 patients with schizophrenia or schizoaffective disorder and 21 nonpsychiatric participants, after controlling for the effects of verbal intelligence and impairments in basic language-related neurocognitive abilities. After removal of the effects of the neurocognitive variables, impairments in emotion perception and ToM each explained additional variance in speech disorder in the patients but not the controls. The neurocognitive and social cognitive variables, taken together, explained 51% of the variance in speech disorder in the patients. Schizophrenic disordered speech may be less a concomitant of "positive" psychotic process than of illness-related limitations in neurocognitive and social cognitive functioning.

  11. Mechanisms Underlying Selective Neuronal Tracking of Attended Speech at a ‘Cocktail Party’

    PubMed Central

    Zion Golumbic, Elana M.; Ding, Nai; Bickel, Stephan; Lakatos, Peter; Schevon, Catherine A.; McKhann, Guy M.; Goodman, Robert R.; Emerson, Ronald; Mehta, Ashesh D.; Simon, Jonathan Z.; Poeppel, David; Schroeder, Charles E.

    2013-01-01

The ability to focus on and understand one talker in a noisy social environment is a critical social-cognitive capacity, whose underlying neuronal mechanisms are unclear. We investigated the manner in which speech streams are represented in brain activity and the way that selective attention governs the brain’s representation of speech using a ‘Cocktail Party’ Paradigm, coupled with direct recordings from the cortical surface in surgical epilepsy patients. We find that brain activity dynamically tracks speech streams using both low frequency phase and high frequency amplitude fluctuations, and that optimal encoding likely combines the two. In and near low level auditory cortices, attention ‘modulates’ the representation by enhancing cortical tracking of attended speech streams, but ignored speech remains represented. In higher order regions, the representation appears to become more ‘selective,’ in that there is no detectable tracking of ignored speech. This selectivity itself seems to sharpen as a sentence unfolds. PMID:23473326

  12. Noise suppression methods for robust speech processing

    NASA Astrophysics Data System (ADS)

    Boll, S. F.; Ravindra, H.; Randall, G.; Armantrout, R.; Power, R.

    1980-05-01

Robust speech processing in practical operating environments requires effective suppression of environmental and processor noise. This report describes the technical findings and accomplishments during this reporting period for the research program funded to develop real-time, compressed speech analysis-synthesis algorithms whose performance is invariant under signal contamination. Fulfillment of this requirement is necessary to ensure reliable, secure compressed speech transmission within realistic military command and control environments. Overall contributions resulting from this research program include an understanding of how environmental noise degrades narrowband coded speech, development of appropriate real-time noise suppression algorithms, and development of speech parameter identification methods that treat signal contamination as a fundamental element of the estimation process. This report describes the current research and results in the areas of noise suppression using dual-input adaptive noise cancellation and short-time Fourier transform algorithms, articulation rate change techniques, and an experiment which demonstrated that the spectral subtraction noise suppression algorithm can improve the intelligibility of 2400 bps, LPC-10 coded helicopter speech by 10.6 points.
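Spectral subtraction, the noise-suppression algorithm named in the abstract, can be sketched as follows. This is an illustrative implementation, not the report's: frame size, hop, windowing, and half-wave rectification details here are assumptions.

```python
import numpy as np

def spectral_subtract(noisy, noise_only, frame=256, hop=128):
    """Subtract an average noise magnitude spectrum, frame by frame."""
    window = np.hanning(frame)
    # Estimate the average noise magnitude spectrum from a noise-only segment.
    noise_mag = np.mean(
        [np.abs(np.fft.rfft(noise_only[i:i + frame] * window))
         for i in range(0, len(noise_only) - frame, hop)], axis=0)

    out = np.zeros(len(noisy))
    for i in range(0, len(noisy) - frame, hop):
        spec = np.fft.rfft(noisy[i:i + frame] * window)
        # Half-wave rectify: clip subtracted magnitudes at zero.
        mag = np.clip(np.abs(spec) - noise_mag, 0.0, None)
        # Resynthesize with the noisy phase and overlap-add.
        out[i:i + frame] += np.fft.irfft(mag * np.exp(1j * np.angle(spec)), frame)
    return out
```

With a 50%-overlap Hann window the overlap-add sums to roughly unity, so the interior of the output approximates the denoised signal; the half-wave rectification that keeps magnitudes non-negative is also the source of the "musical noise" artifact associated with this method.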

  13. Perceptual Learning of Speech under Optimal and Adverse Conditions

    PubMed Central

    Zhang, Xujin; Samuel, Arthur G.

    2014-01-01

Humans have a remarkable ability to understand spoken language despite the large amount of variability in speech. Previous research has shown that listeners can use lexical information to guide their interpretation of atypical sounds in speech (Norris, McQueen, & Cutler, 2003). This kind of lexically induced perceptual learning enables people to adjust to the variations in utterances due to talker-specific characteristics, such as individual identity and dialect. The current study investigated perceptual learning in two optimal conditions: conversational speech (Experiment 1) vs. clear speech (Experiment 2), and three adverse conditions: noise (Experiment 3a) vs. two cognitive loads (Experiments 4a & 4b). Perceptual learning occurred in the two optimal conditions and in the two cognitive load conditions, but not in the noise condition. Furthermore, perceptual learning occurred only in the first of two sessions for each participant, and only for atypical /s/ sounds and not for atypical /f/ sounds. This pattern of learning and non-learning reflects a balance between flexibility and stability that the speech system must have to deal with speech variability in the diverse conditions in which speech is encountered. PMID:23815478

  14. Speech sound discrimination training improves auditory cortex responses in a rat model of autism

    PubMed Central

    Engineer, Crystal T.; Centanni, Tracy M.; Im, Kwok W.; Kilgard, Michael P.

    2014-01-01

    Children with autism often have language impairments and degraded cortical responses to speech. Extensive behavioral interventions can improve language outcomes and cortical responses. Prenatal exposure to the antiepileptic drug valproic acid (VPA) increases the risk for autism and language impairment. Prenatal exposure to VPA also causes weaker and delayed auditory cortex responses in rats. In this study, we document speech sound discrimination ability in VPA exposed rats and document the effect of extensive speech training on auditory cortex responses. VPA exposed rats were significantly impaired at consonant, but not vowel, discrimination. Extensive speech training resulted in both stronger and faster anterior auditory field (AAF) responses compared to untrained VPA exposed rats, and restored responses to control levels. This neural response improvement generalized to non-trained sounds. The rodent VPA model of autism may be used to improve the understanding of speech processing in autism and contribute to improving language outcomes. PMID:25140133

  15. Population Health in Pediatric Speech and Language Disorders: Available Data Sources and a Research Agenda for the Field.

    PubMed

    Raghavan, Ramesh; Camarata, Stephen; White, Karl; Barbaresi, William; Parish, Susan; Krahn, Gloria

    2018-05-17

    The aim of the study was to provide an overview of population science as applied to speech and language disorders, illustrate data sources, and advance a research agenda on the epidemiology of these conditions. Computer-aided database searches were performed to identify key national surveys and other sources of data necessary to establish the incidence, prevalence, and course and outcome of speech and language disorders. This article also summarizes a research agenda that could enhance our understanding of the epidemiology of these disorders. Although the data yielded estimates of prevalence and incidence for speech and language disorders, existing sources of data are inadequate to establish reliable rates of incidence, prevalence, and outcomes for speech and language disorders at the population level. Greater support for inclusion of speech and language disorder-relevant questions is necessary in national health surveys to build the population science in the field.

  16. Dopamine regulation of human speech and bird song: A critical review

    PubMed Central

    Simonyan, Kristina; Horwitz, Barry; Jarvis, Erich D.

    2012-01-01

To understand the neural basis of human speech control, extensive research has been done using a variety of methodologies in a range of experimental models. Nevertheless, several critical questions about learned vocal motor control still remain open. One of them is the mechanism(s) by which neurotransmitters, such as dopamine, modulate speech and song production. In this review, we bring together the two fields of investigation of dopamine action on voice control in humans and songbirds, which share similar behavioral and neural mechanisms for speech and song production. While human studies investigating the role of dopamine in speech control are limited to reports in neurological patients, research on dopaminergic modulation of bird song control has recently expanded our views on how this system might be organized. We discuss the parallels between bird song and human speech from the perspective of dopaminergic control as well as outline important differences between these species. PMID:22284300

  17. Binaural hearing with electrical stimulation

    PubMed Central

    Kan, Alan; Litovsky, Ruth Y.

    2014-01-01

    Bilateral cochlear implantation is becoming a standard of care in many clinics. While much benefit has been shown through bilateral implantation, patients who have bilateral cochlear implants (CIs) still do not perform as well as normal hearing listeners in sound localization and understanding speech in noisy environments. This difference in performance can arise from a number of different factors, including the areas of hardware and engineering, surgical precision and pathology of the auditory system in deaf persons. While surgical precision and individual pathology are factors that are beyond careful control, improvements can be made in the areas of clinical practice and the engineering of binaural speech processors. These improvements should be grounded in a good understanding of the sensitivities of bilateral CI patients to the acoustic binaural cues that are important to normal hearing listeners for sound localization and speech in noise understanding. To this end, we review the current state-of-the-art in the understanding of the sensitivities of bilateral CI patients to binaural cues in electric hearing, and highlight the important issues and challenges as they relate to clinical practice and the development of new binaural processing strategies. PMID:25193553

  18. The Military Utility of Understanding Adversary Culture

    DTIC Science & Technology

    2005-01-01

squelching of Iraqi freedom of speech. Many members of the Coalition Provisional Authority (CPA) and Combined Joint Task Force 7 felt that anticoalition...an Iraqi perception that Americans do not really support freedom of speech despite their claims to the contrary, reinforcing their view of Americans

  19. Impromptu Speech Gamification for ESL/EFL Students

    ERIC Educational Resources Information Center

    Girardelli, Davide

    2017-01-01

    Courses: Any introductory undergraduate public-speaking course, in particular in ESL/EFL contexts. Objectives: This single-class activity is intended to (1) build students' ability to communicate orally "off the cuff;" (2) foster students' understanding of the major organizational formats used in organizing speeches; and (3) increase…

  20. Advances in natural language processing.

    PubMed

    Hirschberg, Julia; Manning, Christopher D

    2015-07-17

    Natural language processing employs computational techniques for the purpose of learning, understanding, and producing human language content. Early computational approaches to language research focused on automating the analysis of the linguistic structure of language and developing basic technologies such as machine translation, speech recognition, and speech synthesis. Today's researchers refine and make use of such tools in real-world applications, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services. We describe successes and challenges in this rapidly advancing area. Copyright © 2015, American Association for the Advancement of Science.

  1. Auditory-neurophysiological responses to speech during early childhood: Effects of background noise

    PubMed Central

    White-Schwoch, Travis; Davies, Evan C.; Thompson, Elaine C.; Carr, Kali Woodruff; Nicol, Trent; Bradlow, Ann R.; Kraus, Nina

    2015-01-01

    Early childhood is a critical period of auditory learning, during which children are constantly mapping sounds to meaning. But learning rarely occurs under ideal listening conditions—children are forced to listen against a relentless din. This background noise degrades the neural coding of these critical sounds, in turn interfering with auditory learning. Despite the importance of robust and reliable auditory processing during early childhood, little is known about the neurophysiology underlying speech processing in children so young. To better understand the physiological constraints these adverse listening scenarios impose on speech sound coding during early childhood, auditory-neurophysiological responses were elicited to a consonant-vowel syllable in quiet and background noise in a cohort of typically-developing preschoolers (ages 3–5 yr). Overall, responses were degraded in noise: they were smaller, less stable across trials, slower, and there was poorer coding of spectral content and the temporal envelope. These effects were exacerbated in response to the consonant transition relative to the vowel, suggesting that the neural coding of spectrotemporally-dynamic speech features is more tenuous in noise than the coding of static features—even in children this young. Neural coding of speech temporal fine structure, however, was more resilient to the addition of background noise than coding of temporal envelope information. Taken together, these results demonstrate that noise places a neurophysiological constraint on speech processing during early childhood by causing a breakdown in neural processing of speech acoustics. These results may explain why some listeners have inordinate difficulties understanding speech in noise. 
Speech-elicited auditory-neurophysiological responses offer objective insight into listening skills during early childhood by reflecting the integrity of neural coding in quiet and noise; this paper documents typical response properties in this age group. These normative metrics may be useful clinically to evaluate auditory processing difficulties during early childhood. PMID:26113025

  2. Air Traffic Controllers’ Long-Term Speech-in-Noise Training Effects: A Control Group Study

    PubMed Central

    Zaballos, María T.P.; Plasencia, Daniel P.; González, María L.Z.; de Miguel, Angel R.; Macías, Ángel R.

    2016-01-01

Introduction: Speech perception in noise relies on the capacity of the auditory system to process complex sounds using sensory and cognitive skills. The possibility that these can be trained during adulthood is of special interest in auditory disorders, where speech-in-noise perception becomes compromised. Air traffic controllers (ATC) are constantly exposed to radio communication, a situation that seems to produce auditory learning. The objective of this study was to quantify this effect. Subjects and Methods: 19 ATC and 19 normal-hearing individuals underwent a speech-in-noise test with three signal-to-noise ratios: 5, 0 and −5 dB. Noise and speech were presented through two different loudspeakers in azimuth position. Speech tokens were presented at 65 dB SPL, while white noise files were at 60, 65 and 70 dB, respectively. Results: Air traffic controllers outperformed the control group in all conditions (P<0.05 in ANOVA and Mann-Whitney U tests). Group differences were largest in the most difficult condition, SNR = −5 dB. However, no correlation between experience and performance was found for any of the conditions tested. The reason might be that ceiling performance is achieved much faster than the minimum experience time recorded, 5 years, although intrinsic cognitive abilities cannot be disregarded. Discussion: ATC demonstrated an enhanced ability to hear speech in challenging listening environments. This study provides evidence that long-term auditory training is indeed useful in achieving better speech-in-noise understanding even in adverse conditions, although good cognitive qualities are likely to be a basic requirement for this training to be effective. Conclusion: Our results show that ATC outperform the control group in all conditions. Thus, this study provides evidence that long-term auditory training is useful in achieving better speech-in-noise understanding even in adverse conditions. PMID:27991470
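One common way to realize SNR conditions like the 5, 0 and −5 dB above is to fix the speech level and rescale the noise to hit the target ratio. A minimal sketch, assuming broadband mean-square power as the level definition (the function name and this power definition are illustrative, not the study's):

```python
import numpy as np

def scale_noise_to_snr(speech, noise, snr_db):
    """Rescale `noise` so that 10*log10(P_speech / P_noise) equals snr_db."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    target_p_noise = p_speech / (10.0 ** (snr_db / 10.0))
    return noise * np.sqrt(target_p_noise / p_noise)
```

For example, iterating over `(5, 0, -5)` and mixing `speech + scale_noise_to_snr(speech, noise, snr)` produces the three test conditions while the speech presentation level stays constant.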

  3. Recovering With Acquired Apraxia of Speech: The First 2 Years.

    PubMed

    Haley, Katarina L; Shafer, Jennifer N; Harmon, Tyson G; Jacks, Adam

    2016-12-01

    This study was intended to document speech recovery for 1 person with acquired apraxia of speech quantitatively and on the basis of her lived experience. The second author sustained a traumatic brain injury that resulted in acquired apraxia of speech. Over a 2-year period, she documented her recovery through 22 video-recorded monologues. We analyzed these monologues using a combination of auditory perceptual, acoustic, and qualitative methods. Recovery was evident for all quantitative variables examined. For speech sound production, the recovery was most prominent during the first 3 months, but slower improvement was evident for many months. Measures of speaking rate, fluency, and prosody changed more gradually throughout the entire period. A qualitative analysis of topics addressed in the monologues was consistent with the quantitative speech recovery and indicated a subjective dynamic relationship between accuracy and rate, an observation that several factors made speech sound production variable, and a persisting need for cognitive effort while speaking. Speech features improved over an extended time, but the recovery trajectories differed, indicating dynamic reorganization of the underlying speech production system. The relationship among speech dimensions should be examined in other cases and in population samples. The combination of quantitative and qualitative analysis methods offers advantages for understanding clinically relevant aspects of recovery.

  4. Task-dependent modulation of the visual sensory thalamus assists visual-speech recognition.

    PubMed

    Díaz, Begoña; Blank, Helen; von Kriegstein, Katharina

    2018-05-14

The cerebral cortex modulates early sensory processing via feedback connections to sensory pathway nuclei. The functions of this top-down modulation for human behavior are poorly understood. Here, we show that top-down modulation of the visual sensory thalamus (the lateral geniculate body, LGN) is involved in visual-speech recognition. In two independent functional magnetic resonance imaging (fMRI) studies, LGN response increased when participants processed fast-varying features of articulatory movements required for visual-speech recognition, as compared to temporally more stable features required for face identification with the same stimulus material. The LGN response during the visual-speech task correlated positively with the visual-speech recognition scores across participants. In addition, the task-dependent modulation was present for speech movements and did not occur for control conditions involving non-speech biological movements. In face-to-face communication, visual speech recognition is used to enhance or even enable understanding what is said. Speech recognition is commonly explained in frameworks focusing on cerebral cortex areas. Our findings suggest that task-dependent modulation at subcortical sensory stages has an important role for communication: Together with similar findings in the auditory modality, the findings imply that task-dependent modulation of the sensory thalami is a general mechanism to optimize speech recognition. Copyright © 2018. Published by Elsevier Inc.

  5. Identification of a pathway for intelligible speech in the left temporal lobe

    PubMed Central

    Scott, Sophie K.; Blank, C. Catrin; Rosen, Stuart; Wise, Richard J. S.

    2017-01-01

It has been proposed that the identification of sounds, including species-specific vocalizations, by primates depends on anterior projections from the primary auditory cortex, an auditory pathway analogous to the ventral route proposed for the visual identification of objects. We have identified a similar route in the human for understanding intelligible speech. Using PET imaging to identify separable neural subsystems within the human auditory cortex, we used a variety of speech and speech-like stimuli with equivalent acoustic complexity but varying intelligibility. We have demonstrated that the left superior temporal sulcus responds to the presence of phonetic information, but its anterior part only responds if the stimulus is also intelligible. This novel observation demonstrates a left anterior temporal pathway for speech comprehension. PMID:11099443

  6. Foreign Subtitles Help but Native-Language Subtitles Harm Foreign Speech Perception

    PubMed Central

    Mitterer, Holger; McQueen, James M.

    2009-01-01

    Understanding foreign speech is difficult, in part because of unusual mappings between sounds and words. It is known that listeners in their native language can use lexical knowledge (about how words ought to sound) to learn how to interpret unusual speech-sounds. We therefore investigated whether subtitles, which provide lexical information, support perceptual learning about foreign speech. Dutch participants, unfamiliar with Scottish and Australian regional accents of English, watched Scottish or Australian English videos with Dutch, English or no subtitles, and then repeated audio fragments of both accents. Repetition of novel fragments was worse after Dutch-subtitle exposure but better after English-subtitle exposure. Native-language subtitles appear to create lexical interference, but foreign-language subtitles assist speech learning by indicating which words (and hence sounds) are being spoken. PMID:19918371

  7. Reported Speech in Conversational Storytelling during Nursing Shift Handover Meetings

    ERIC Educational Resources Information Center

    Bangerter, Adrian; Mayor, Eric; Pekarek Doehler, Simona

    2011-01-01

    Shift handovers in nursing units involve formal transmission of information and informal conversation about non-routine events. Informal conversation often involves telling stories. Direct reported speech (DRS) was studied in handover storytelling in two nursing care units. The study goal is to contribute to a better understanding of conversation…

  8. Kinematics of Disease Progression in Bulbar ALS

    ERIC Educational Resources Information Center

    Yunusova, Yana; Green, Jordan R.; Lindstrom, Mary J.; Ball, Laura J.; Pattee, Gary L.; Zinman, Lorne

    2010-01-01

    The goal of this study was to investigate the deterioration of lip and jaw movements during speech longitudinally in three individuals diagnosed with bulbar amyotrophic lateral sclerosis (ALS). The study was motivated by the need to understand the relationship between physiologic changes in speech movements and clinical measures of speech…

  9. Working with Students Who Are Late-Deafened. PEPNet Tipsheet

    ERIC Educational Resources Information Center

    Clark, Mary

    2010-01-01

    Late-deafness means deafness that happened postlingually, any time after the development of speech and language in a person who has identified with hearing society through schooling, social connections, etc. Students who are late-deafened cannot understand speech without visual aids such as speechreading, sign language, and captioning (although…

  10. Differentiating Speech Delay from Disorder: Does It Matter?

    ERIC Educational Resources Information Center

    Dodd, Barbara

    2011-01-01

    Aim: The cognitive-linguistic abilities of 2 subgroups of children with speech impairment were compared to better understand underlying deficits that might influence effective intervention. Methods: Two groups of 23 children, aged 3;3 to 5;6, performed executive function tasks assessing cognitive flexibility and nonverbal rule abstraction.…

  11. Babbling, Chewing, and Sucking: Oromandibular Coordination at 9 Months

    ERIC Educational Resources Information Center

    Steeve, Roger W.; Moore, Christopher A.; Green, Jordan R.; Reilly, Kevin J.; McMurtrey, Jacki Ruark

    2008-01-01

    Purpose: The ontogeny of mandibular control is important for understanding the general neurophysiologic development for speech and alimentary behaviors. Prior investigations suggest that mandibular control is organized distinctively across speech and nonspeech tasks in 15-month-olds and adults and that, with development, these extant forms of…

  12. Spanish Native-Speaker Perception of Accentedness in Learner Speech

    ERIC Educational Resources Information Center

    Moranski, Kara

    2012-01-01

    Building upon current research in native-speaker (NS) perception of L2 learner phonology (Zielinski, 2008; Derwing & Munro, 2009), the present investigation analyzed multiple dimensions of NS speech perception in order to achieve a more complete understanding of the specific linguistic elements and attitudinal variables that contribute to…

  13. Some Problems in Psycholinguistics.

    ERIC Educational Resources Information Center

    Hadding-Koch, Kerstin

    1968-01-01

    Among the most important questions in psycholinguistics today are the following: By which processes does man organize and understand speech? Which are the smallest linguistic units and rules stored in the memory and used in the production and perception of speech? Are the same mechanisms at work in both cases? Discussed in this paper are…

  14. Jeremiad at Harvard: Solzhenitsyn and "The World Split Apart."

    ERIC Educational Resources Information Center

    Stoda, Mark; Dionisopoulos, George

    2000-01-01

    Contributes to scholarship advancing the understanding of human communication by examining A. Solzhenitsyn's 1978 address "A World Split Apart," and the intense critical reaction that followed. Examines the speech as a Jeremiad. Suggests that even though the speech conforms to the genre's touchstones, it may have been addressed to an…

  15. 14 CFR 67.105 - Ear, nose, throat, and equilibrium.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... medical certificate are: (a) The person shall demonstrate acceptable hearing by at least one of the... both ears, at a distance of 6 feet from the examiner, with the back turned to the examiner. (2) Demonstrate an acceptable understanding of speech as determined by audiometric speech discrimination testing...

  16. 14 CFR 67.205 - Ear, nose, throat, and equilibrium.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... medical certificate are: (a) The person shall demonstrate acceptable hearing by at least one of the... both ears, at a distance of 6 feet from the examiner, with the back turned to the examiner. (2) Demonstrate an acceptable understanding of speech as determined by audiometric speech discrimination testing...

  17. 14 CFR 67.305 - Ear, nose, throat, and equilibrium.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... medical certificate are: (a) The person shall demonstrate acceptable hearing by at least one of the... both ears, at a distance of 6 feet from the examiner, with the back turned to the examiner. (2) Demonstrate an acceptable understanding of speech as determined by audiometric speech discrimination testing...

  18. Speech perception and spoken word recognition: past and present.

    PubMed

Jusczyk, Peter W; Luce, Paul A

    2002-02-01

The scientific study of the perception of spoken language has been an exciting, prolific, and productive area of research for more than 50 years. We have learned much about infants' and adults' remarkable capacities for perceiving and understanding the sounds of their language, as evidenced by our increasingly sophisticated theories of acquisition, process, and representation. We present a selective but, we hope, representative review of the past half century of research on speech perception, paying particular attention to the historical and theoretical contexts within which this research was conducted. Our foci in this review fall on three principal topics: early work on the discrimination and categorization of speech sounds, more recent efforts to understand the processes and representations that subserve spoken word recognition, and research on how infants acquire the capacity to perceive their native language. Our intent is to provide the reader a sense of the progress our field has experienced over the last half century in understanding the human's extraordinary capacity for the perception of spoken language.

  19. Spatio-Temporal Progression of Cortical Activity Related to Continuous Overt and Covert Speech Production in a Reading Task.

    PubMed

    Brumberg, Jonathan S; Krusienski, Dean J; Chakrabarti, Shreya; Gunduz, Aysegul; Brunner, Peter; Ritaccio, Anthony L; Schalk, Gerwin

    2016-01-01

    How the human brain plans, executes, and monitors continuous and fluent speech has remained largely elusive. For example, previous research has defined the cortical locations most important for different aspects of speech function, but has not yet yielded a definition of the temporal progression of involvement of those locations as speech progresses either overtly or covertly. In this paper, we uncovered the spatio-temporal evolution of neuronal population-level activity related to continuous overt speech, and identified those locations that shared activity characteristics across overt and covert speech. Specifically, we asked subjects to repeat continuous sentences aloud or silently while we recorded electrical signals directly from the surface of the brain (electrocorticography (ECoG)). We then determined the relationship between cortical activity and speech output across different areas of cortex and at sub-second timescales. The results highlight a spatio-temporal progression of cortical involvement in the continuous speech process that initiates utterances in frontal-motor areas and ends with the monitoring of auditory feedback in superior temporal gyrus. Direct comparison of cortical activity related to overt versus covert conditions revealed a common network of brain regions involved in speech that may implement orthographic and phonological processing. Our results provide one of the first characterizations of the spatiotemporal electrophysiological representations of the continuous speech process, and also highlight the common neural substrate of overt and covert speech. These results thereby contribute to a refined understanding of speech functions in the human brain.

  1. Recognition of Speech from the Television with Use of a Wireless Technology Designed for Cochlear Implants.

    PubMed

    Duke, Mila Morais; Wolfe, Jace; Schafer, Erin

    2016-05-01

    Cochlear implant (CI) recipients often experience difficulty understanding speech in noise and speech that originates from a distance. Many CI recipients also experience difficulty understanding speech originating from a television. Use of hearing assistance technology (HAT) may improve speech recognition in noise and for signals that originate from more than a few feet from the listener; however, there are no published studies evaluating the potential benefits of a wireless HAT designed to deliver audio signals from a television directly to a CI sound processor. The objective of this study was to compare speech recognition in quiet and in noise of CI recipients with the use of their CI alone and with the use of their CI and a wireless HAT (Cochlear Wireless TV Streamer). A two-way repeated measures design was used to evaluate performance differences obtained in quiet and in competing noise (65 dBA) with the CI sound processor alone and with the sound processor coupled to the Cochlear Wireless TV Streamer. Sixteen users of Cochlear Nucleus 24 Freedom, CI512, and CI422 implants were included in the study. Participants were evaluated in four conditions including use of the sound processor alone and use of the sound processor with the wireless streamer in quiet and in the presence of competing noise at 65 dBA. Speech recognition was evaluated in each condition with two full lists of Computer-Assisted Speech Perception Testing and Training Sentence-Level Test sentences presented from a light-emitting diode television. Speech recognition in noise was significantly better with use of the wireless streamer compared to participants' performance with their CI sound processor alone. There was also a nonsignificant trend toward better performance in quiet with use of the TV Streamer. Performance was significantly poorer when evaluated in noise compared to performance in quiet when the TV Streamer was not used. Use of the Cochlear Wireless TV Streamer designed to stream audio from a television directly to a CI sound processor provides better speech recognition in quiet and in noise when compared to performance obtained with use of the CI sound processor alone. American Academy of Audiology.

  2. Masking Period Patterns and Forward Masking for Speech-Shaped Noise: Age-Related Effects.

    PubMed

    Grose, John H; Menezes, Denise C; Porter, Heather L; Griz, Silvana

    2016-01-01

    The purpose of this study was to assess age-related changes in temporal resolution in listeners with relatively normal audiograms. The hypothesis was that increased susceptibility to nonsimultaneous masking contributes to the hearing difficulties experienced by older listeners in complex fluctuating backgrounds. Participants included younger (n = 11), middle-age (n = 12), and older (n = 11) listeners with relatively normal audiograms. The first phase of the study measured masking period patterns for speech-shaped noise maskers and signals. From these data, temporal window shapes were derived. The second phase measured forward-masking functions and assessed how well the temporal window fits accounted for these data. The masking period patterns demonstrated increased susceptibility to backward masking in the older listeners, compatible with a more symmetric temporal window in this group. The forward-masking functions exhibited an age-related decline in recovery to baseline thresholds, and there was also an increase in the variability of the temporal window fits to these data. This study demonstrated an age-related increase in susceptibility to nonsimultaneous masking, supporting the hypothesis that exacerbated nonsimultaneous masking contributes to age-related difficulties understanding speech in fluctuating noise. Further support for this hypothesis comes from limited speech-in-noise data, suggesting an association between susceptibility to forward masking and speech understanding in modulated noise.
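
    The forward-masking recovery functions described here are commonly summarized by a recovery time constant. The sketch below (Python with SciPy) shows how such a constant might be estimated from masked thresholds at several signal delays; all threshold values and parameters are invented for illustration and the exponential model is a simplification, not the temporal-window fits used in the study.

```python
import numpy as np
from scipy.optimize import curve_fit

def recovery(t_ms, baseline, amount, tau_ms):
    """Exponential recovery of masked threshold after masker offset."""
    return baseline + amount * np.exp(-t_ms / tau_ms)

# Hypothetical forward-masked thresholds (dB SPL) at several signal
# delays; values are invented for illustration only.
delays = np.array([5.0, 10.0, 20.0, 40.0, 80.0])
true_params = (20.0, 35.0, 25.0)  # baseline, masked amount, time constant
rng = np.random.default_rng(2)
thresholds = recovery(delays, *true_params) + rng.normal(0.0, 0.5, delays.size)

# Fit the recovery function to the (noisy) thresholds.
params, _ = curve_fit(recovery, delays, thresholds, p0=(15.0, 30.0, 20.0))
baseline, amount, tau = params
print(f"fitted recovery time constant: {tau:.1f} ms")
```

    A slower (larger) fitted time constant would correspond to the prolonged recovery from forward masking that the study reports in older listeners.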

  3. Speech Perception in Tones and Noise via Cochlear Implants Reveals Influence of Spectral Resolution on Temporal Processing

    PubMed Central

    Kreft, Heather A.

    2014-01-01

    Under normal conditions, human speech is remarkably robust to degradation by noise and other distortions. However, people with hearing loss, including those with cochlear implants, often experience great difficulty in understanding speech in noisy environments. Recent work with normal-hearing listeners has shown that the amplitude fluctuations inherent in noise contribute strongly to the masking of speech. In contrast, this study shows that speech perception via a cochlear implant is unaffected by the inherent temporal fluctuations of noise. This qualitative difference between acoustic and electric auditory perception does not seem to be due to differences in underlying temporal acuity but can instead be explained by the poorer spectral resolution of cochlear implants, relative to the normally functioning ear, which leads to an effective smoothing of the inherent temporal-envelope fluctuations of noise. The outcome suggests an unexpected trade-off between the detrimental effects of poorer spectral resolution and the beneficial effects of a smoother noise temporal envelope. This trade-off provides an explanation for the long-standing puzzle of why strong correlations between speech understanding and spectral resolution have remained elusive. The results also provide a potential explanation for why cochlear-implant users and hearing-impaired listeners exhibit reduced or absent masking release when large and relatively slow temporal fluctuations are introduced in noise maskers. The multitone maskers used here may provide an effective new diagnostic tool for assessing functional hearing loss and reduced spectral resolution. PMID:25315376
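
    The smoothing effect described here, in which poorer spectral resolution averages out the inherent temporal-envelope fluctuations of noise, can be illustrated with a toy simulation (Python/NumPy). The rectified-Gaussian envelope model and the channel counts below are simplifications chosen for illustration, not the stimuli or processing used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)

def envelope_fluctuation(n_channels, n_samples=20000):
    """Relative fluctuation (std/mean) of the summed temporal envelope of
    n_channels independent noise bands. Envelopes are modeled as rectified
    Gaussian noise -- a simplification, not a cochlear model."""
    envelopes = np.abs(rng.standard_normal((n_channels, n_samples)))
    summed = envelopes.sum(axis=0)  # one broad analysis channel pooling all bands
    return summed.std() / summed.mean()

# Poorer spectral resolution ~ each analysis channel pools more independent
# bands, so the inherent envelope fluctuations average out and shrink.
narrow = envelope_fluctuation(1)    # fine spectral resolution
broad = envelope_fluctuation(16)    # coarse resolution (e.g., CI-like)
print(narrow, broad)                # broad < narrow
```

    The pooled envelope fluctuates less (roughly in proportion to one over the square root of the number of pooled bands), mirroring the trade-off the abstract describes between spectral resolution and envelope smoothness.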

  4. Associations between speech features and phenotypic severity in Treacher Collins syndrome

    PubMed Central

    2014-01-01

    Background: Treacher Collins syndrome (TCS, OMIM 154500) is a rare congenital disorder of craniofacial development. Characteristic hypoplastic malformations of the ears, zygomatic arch, mandible and pharynx have been described in detail. However, reports on the impact of these malformations on speech are few. Exploring speech features and investigating if speech function is related to phenotypic severity are essential for optimizing follow-up and treatment. Methods: Articulation, nasal resonance, voice and intelligibility were examined in 19 individuals (5–74 years, median 34 years) divided into three groups comprising children 5–10 years (n = 4), adolescents 11–18 years (n = 4) and adults 29 years and older (n = 11). A speech composite score (0–6) was calculated to reflect the variability of speech deviations. TCS severity scores of phenotypic expression and total scores of Nordic Orofacial Test-Screening (NOT-S) measuring orofacial dysfunction were used in analyses of correlation with speech characteristics (speech composite scores). Results: Children and adolescents presented with significantly higher speech composite scores (median 4, range 1–6) than adults (median 1, range 0–5). Nearly all children and adolescents (6/8) displayed speech deviations of articulation, nasal resonance and voice, while only three adults were identified with multiple speech aberrations. The variability of speech dysfunction in TCS was exhibited by individual combinations of speech deviations in 13/19 participants. The speech composite scores correlated with TCS severity scores and NOT-S total scores. Speech composite scores higher than 4 were associated with cleft palate. The percent of intelligible words in connected speech was significantly lower in children and adolescents (median 77%, range 31–99) than in adults (98%, range 93–100). Intelligibility of speech among the children was markedly inconsistent and clearly affecting the understandability. Conclusions: Multiple speech deviations were identified in children, adolescents and a subgroup of adults with TCS. Only children displayed markedly reduced intelligibility. Speech was significantly correlated with phenotypic severity of TCS and orofacial dysfunction. Follow-up and treatment of speech should still be focused on young patients, but some adults with TCS seem to require continuing speech and language pathology services. PMID:24775909

  5. Associations between speech features and phenotypic severity in Treacher Collins syndrome.

    PubMed

    Asten, Pamela; Akre, Harriet; Persson, Christina

    2014-04-28

    Treacher Collins syndrome (TCS, OMIM 154500) is a rare congenital disorder of craniofacial development. Characteristic hypoplastic malformations of the ears, zygomatic arch, mandible and pharynx have been described in detail. However, reports on the impact of these malformations on speech are few. Exploring speech features and investigating if speech function is related to phenotypic severity are essential for optimizing follow-up and treatment. Articulation, nasal resonance, voice and intelligibility were examined in 19 individuals (5-74 years, median 34 years) divided into three groups comprising children 5-10 years (n = 4), adolescents 11-18 years (n = 4) and adults 29 years and older (n = 11). A speech composite score (0-6) was calculated to reflect the variability of speech deviations. TCS severity scores of phenotypic expression and total scores of Nordic Orofacial Test-Screening (NOT-S) measuring orofacial dysfunction were used in analyses of correlation with speech characteristics (speech composite scores). Children and adolescents presented with significantly higher speech composite scores (median 4, range 1-6) than adults (median 1, range 0-5). Nearly all children and adolescents (6/8) displayed speech deviations of articulation, nasal resonance and voice, while only three adults were identified with multiple speech aberrations. The variability of speech dysfunction in TCS was exhibited by individual combinations of speech deviations in 13/19 participants. The speech composite scores correlated with TCS severity scores and NOT-S total scores. Speech composite scores higher than 4 were associated with cleft palate. The percent of intelligible words in connected speech was significantly lower in children and adolescents (median 77%, range 31-99) than in adults (98%, range 93-100). Intelligibility of speech among the children was markedly inconsistent and clearly affecting the understandability. Multiple speech deviations were identified in children, adolescents and a subgroup of adults with TCS. Only children displayed markedly reduced intelligibility. Speech was significantly correlated with phenotypic severity of TCS and orofacial dysfunction. Follow-up and treatment of speech should still be focused on young patients, but some adults with TCS seem to require continuing speech and language pathology services.

  6. Musician effect on perception of spectro-temporally degraded speech, vocal emotion, and music in young adolescents.

    PubMed

    Başkent, Deniz; Fuller, Christina D; Galvin, John J; Schepel, Like; Gaudrain, Etienne; Free, Rolien H

    2018-05-01

    In adult normal-hearing musicians, perception of music, vocal emotion, and speech in noise has previously been shown to be better than in non-musicians, sometimes even with spectro-temporally degraded stimuli. In this study, melodic contour identification, vocal emotion identification, and speech understanding in noise were measured in young adolescent normal-hearing musicians and non-musicians listening to unprocessed or degraded signals. Different from adults, there was no musician effect for vocal emotion identification or speech in noise. Melodic contour identification with degraded signals was significantly better in musicians, suggesting potential benefits from music training for young cochlear-implant users, who experience similar spectro-temporal signal degradations.

  7. Communication attitude and speech in 10-year-old children with cleft (lip and) palate: an ICF perspective.

    PubMed

    Havstam, Christina; Sandberg, Annika Dahlgren; Lohmander, Anette

    2011-04-01

    Many children born with cleft palate have impaired speech during their pre-school years, but usually the speech difficulties are transient and resolved by later childhood. This study investigated communication attitude with the Swedish version of the Communication Attitude Test (CAT-S) in 54 10-year-olds with cleft (lip and) palate. In addition, environmental factors were assessed via parent questionnaire. These data were compared to speech assessments by experienced listeners, who rated the children's velopharyngeal function, articulation, intelligibility, and general impression of speech at ages 5, 7, and 10 years. The children with clefts scored significantly higher on the CAT-S compared to reference data, indicating a more negative communication attitude at the group level but with large individual variation. All speech variables, except velopharyngeal function at earlier ages, as well as the parent questionnaire scores, correlated significantly with the CAT-S scores. Although there was a relationship between speech and communication attitude, not all children with impaired speech developed negative communication attitudes. The assessment of communication attitude can make an important contribution to our understanding of the communicative situation for children with cleft (lip and) palate and give important indications for intervention.

  8. Auditory Speech Perception Tests in Relation to the Coding Strategy in Cochlear Implant.

    PubMed

    Bazon, Aline Cristine; Mantello, Erika Barioni; Gonçales, Alina Sanches; Isaac, Myriam de Lima; Hyppolito, Miguel Angelo; Reis, Ana Cláudia Mirândola Barbosa

    2016-07-01

    Evaluation of auditory perception in cochlear implant users aims to determine how the acoustic signal is processed, leading to the recognition and understanding of sound. This study aimed to investigate differences in auditory speech perception in individuals with postlingual hearing loss wearing a cochlear implant and using two different speech coding strategies, and to analyze speech perception and handicap perception in relation to the strategy used. This was a prospective, descriptive, cross-sectional cohort study. Ten cochlear implant users were selected and characterized by hearing thresholds, speech perception tests, and the Hearing Handicap Inventory for Adults. There was no significant difference in the variables subject age, age at acquisition of hearing loss, etiology, time of hearing deprivation, time of cochlear implant use, or mean hearing threshold with the cochlear implant when the speech coding strategy was changed. There was no relationship between lack of handicap perception and improvement in speech perception with either speech coding strategy. There was no significant difference between the strategies evaluated, and no relation was observed between them and the variables studied.

  9. Perceptual learning of speech under optimal and adverse conditions.

    PubMed

    Zhang, Xujin; Samuel, Arthur G

    2014-02-01

    Humans have a remarkable ability to understand spoken language despite the large amount of variability in speech. Previous research has shown that listeners can use lexical information to guide their interpretation of atypical sounds in speech (Norris, McQueen, & Cutler, 2003). This kind of lexically induced perceptual learning enables people to adjust to the variations in utterances due to talker-specific characteristics, such as individual identity and dialect. The current study investigated perceptual learning in two optimal conditions: conversational speech (Experiment 1) versus clear speech (Experiment 2), and three adverse conditions: noise (Experiment 3a) versus two cognitive loads (Experiments 4a and 4b). Perceptual learning occurred in the two optimal conditions and in the two cognitive load conditions, but not in the noise condition. Furthermore, perceptual learning occurred only in the first of two sessions for each participant, and only for atypical /s/ sounds and not for atypical /f/ sounds. This pattern of learning and nonlearning reflects a balance between flexibility and stability that the speech system must have to deal with speech variability in the diverse conditions that speech is encountered. PsycINFO Database Record (c) 2014 APA, all rights reserved.

  10. Individual differences in language and working memory affect children's speech recognition in noise.

    PubMed

    McCreery, Ryan W; Spratford, Meredith; Kirby, Benjamin; Brennan, Marc

    2017-05-01

    We examined how cognitive and linguistic skills affect speech recognition in noise for children with normal hearing. Children with better working memory and language abilities were expected to have better speech recognition in noise than peers with poorer skills in these domains. As part of a prospective, cross-sectional study, children with normal hearing completed speech recognition in noise for three types of stimuli: (1) monosyllabic words, (2) syntactically correct but semantically anomalous sentences and (3) semantically and syntactically anomalous word sequences. Measures of vocabulary, syntax and working memory were used to predict individual differences in speech recognition in noise. Participants were ninety-six children with normal hearing between 5 and 12 years of age. Higher working memory was associated with better speech recognition in noise for all three stimulus types. Higher vocabulary abilities were associated with better recognition in noise for sentences and word sequences, but not for words. Working memory and language both influence children's speech recognition in noise, but the relationships vary across types of stimuli. These findings suggest that clinical assessment of speech recognition is likely to reflect underlying cognitive and linguistic abilities, in addition to a child's auditory skills, consistent with the Ease of Language Understanding model.

  11. Fifty years of progress in speech and speaker recognition

    NASA Astrophysics Data System (ADS)

    Furui, Sadaoki

    2004-10-01

    Speech and speaker recognition technology has made very significant progress in the past 50 years. The progress can be summarized by the following changes: (1) from template matching to corpus-based statistical modeling, e.g., HMM and n-grams, (2) from filter bank/spectral resonance to cepstral features (cepstrum + Δcepstrum + ΔΔcepstrum), (3) from heuristic time-normalization to DTW/DP matching, (4) from "distance"-based to likelihood-based methods, (5) from maximum likelihood to discriminative approach, e.g., MCE/GPD and MMI, (6) from isolated word to continuous speech recognition, (7) from small vocabulary to large vocabulary recognition, (8) from context-independent units to context-dependent units for recognition, (9) from clean speech to noisy/telephone speech recognition, (10) from single speaker to speaker-independent/adaptive recognition, (11) from monologue to dialogue/conversation recognition, (12) from read speech to spontaneous speech recognition, (13) from recognition to understanding, (14) from single-modality (audio signal only) to multi-modal (audio/visual) speech recognition, (15) from hardware recognizer to software recognizer, and (16) from no commercial application to many practical commercial applications. Most of these advances have taken place in both the fields of speech recognition and speaker recognition. The majority of technological changes have been directed toward the purpose of increasing robustness of recognition, including many other additional important techniques not noted above.
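
    One of the techniques named in this list, DTW/DP matching, aligns two feature sequences that differ in speaking rate by dynamic programming. The following is an illustrative toy on 1-D sequences (Python/NumPy), not a production recognizer; the example sequences are invented.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-programming DTW distance between two 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j],      # insertion
                                 D[i, j - 1],      # deletion
                                 D[i - 1, j - 1])  # match
    return D[n, m]

# A time-stretched copy of a template aligns at zero cost, while a
# different pattern of the same length does not.
template = [0, 1, 2, 3, 2, 1, 0]
warped   = [0, 0, 1, 1, 2, 3, 3, 2, 1, 0, 0]  # same shape, stretched in time
other    = [3, 2, 1, 0, 1, 2, 3]
print(dtw_distance(template, warped))  # 0.0
print(dtw_distance(template, other))   # larger
```

    This rate-invariance is exactly why DTW served template-matching recognizers before statistical (HMM-based) modeling took over.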

  12. Frontal top-down signals increase coupling of auditory low-frequency oscillations to continuous speech in human listeners.

    PubMed

    Park, Hyojin; Ince, Robin A A; Schyns, Philippe G; Thut, Gregor; Gross, Joachim

    2015-06-15

    Humans show a remarkable ability to understand continuous speech even under adverse listening conditions. This ability critically relies on dynamically updated predictions of incoming sensory information, but exactly how top-down predictions improve speech processing is still unclear. Brain oscillations are a likely mechanism for these top-down predictions [1, 2]. Quasi-rhythmic components in speech are known to entrain low-frequency oscillations in auditory areas [3, 4], and this entrainment increases with intelligibility [5]. We hypothesize that top-down signals from frontal brain areas causally modulate the phase of brain oscillations in auditory cortex. We use magnetoencephalography (MEG) to monitor brain oscillations in 22 participants during continuous speech perception. We characterize prominent spectral components of speech-brain coupling in auditory cortex and use causal connectivity analysis (transfer entropy) to identify the top-down signals driving this coupling more strongly during intelligible speech than during unintelligible speech. We report three main findings. First, frontal and motor cortices significantly modulate the phase of speech-coupled low-frequency oscillations in auditory cortex, and this effect depends on intelligibility of speech. Second, top-down signals are significantly stronger for left auditory cortex than for right auditory cortex. Third, speech-auditory cortex coupling is enhanced as a function of stronger top-down signals. Together, our results suggest that low-frequency brain oscillations play a role in implementing predictive top-down control during continuous speech perception and that top-down control is largely directed at left auditory cortex. This suggests a close relationship between (left-lateralized) speech production areas and the implementation of top-down control in continuous speech perception. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
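
    The causal connectivity measure used here, transfer entropy, quantifies how much knowing a source signal's past reduces uncertainty about a target signal's next value beyond the target's own past. Below is a minimal plug-in estimator for binary sequences (Python/NumPy); the coupled toy process is invented for illustration and this is a sketch of the concept, not the MEG analysis pipeline used in the study.

```python
import math
import numpy as np
from collections import Counter

def transfer_entropy(source, target):
    """TE(source -> target) in bits for binary sequences, via plug-in
    estimates of p(x_{t+1} | x_t) and p(x_{t+1} | x_t, s_t)."""
    x_next, x_now, s_now = target[1:], target[:-1], source[:-1]
    n = len(x_next)
    joint = Counter(zip(x_next, x_now, s_now))
    pair_xs = Counter(zip(x_now, s_now))
    pair_xx = Counter(zip(x_next, x_now))
    marg_x = Counter(x_now)
    te = 0.0
    for (xn, xc, sc), c in joint.items():
        p_joint = c / n                              # p(x_{t+1}, x_t, s_t)
        p_cond_full = c / pair_xs[(xc, sc)]          # p(x_{t+1} | x_t, s_t)
        p_cond_self = pair_xx[(xn, xc)] / marg_x[xc]  # p(x_{t+1} | x_t)
        te += p_joint * math.log2(p_cond_full / p_cond_self)
    return te

# Toy directed coupling: y_{t+1} copies s_t with 10% flips; s is random.
rng = np.random.default_rng(1)
s = rng.integers(0, 2, 5000)
flips = rng.random(5000) < 0.1
y = np.empty(5000, dtype=int)
y[0] = 0
y[1:] = np.where(flips[1:], 1 - s[:-1], s[:-1])

print(transfer_entropy(list(s), list(y)))  # clearly positive (s drives y)
print(transfer_entropy(list(y), list(s)))  # near zero (no coupling back)
```

    The asymmetry of the two estimates is what makes transfer entropy a directed measure, which is why the study can distinguish top-down (frontal-to-auditory) from bottom-up influences.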

  14. Speech outcomes in Parkinson's disease after subthalamic nucleus deep brain stimulation: A systematic review.

    PubMed

    Aldridge, Danielle; Theodoros, Deborah; Angwin, Anthony; Vogel, Adam P

    2016-12-01

    Deep brain stimulation (DBS) of the subthalamic nucleus (STN) is effective in reducing motor symptoms for many individuals with Parkinson's disease (PD). However, STN DBS does not appear to influence speech in the same way, and may result in a variety of negative outcomes for people with PD (PWP). A high degree of inter-individual variability amongst PWP regarding speech outcomes following STN DBS is evident in many studies. Furthermore, speech studies in PWP following STN DBS have employed a wide variety of designs and methodologies, which complicate comparison and interpretation of outcome data amongst studies within this growing body of research. An analysis of published evidence regarding speech outcomes in PWP following STN DBS, according to design and quality, is missing. This systematic review aimed to analyse and coalesce all of the current evidence reported within observational and experimental studies investigating the effects of STN DBS on speech. It will strengthen understanding of the relationship between STN DBS and speech, and inform future research by highlighting methodological limitations of current evidence. Copyright © 2016 Elsevier Ltd. All rights reserved.

  15. Is Listening in Noise Worth It? The Neurobiology of Speech Recognition in Challenging Listening Conditions.

    PubMed

    Eckert, Mark A; Teubner-Rhodes, Susan; Vaden, Kenneth I

    2016-01-01

    This review examines findings from functional neuroimaging studies of speech recognition in noise to provide a neural systems level explanation for the effort and fatigue that can be experienced during speech recognition in challenging listening conditions. Neuroimaging studies of speech recognition consistently demonstrate that challenging listening conditions engage neural systems that are used to monitor and optimize performance across a wide range of tasks. These systems appear to improve speech recognition in younger and older adults, but sustained engagement of these systems also appears to produce an experience of effort and fatigue that may affect the value of communication. When considered in the broader context of the neuroimaging and decision making literature, the speech recognition findings from functional imaging studies indicate that the expected value, or expected level of speech recognition given the difficulty of listening conditions, should be considered when measuring effort and fatigue. The authors propose that the behavioral economics or neuroeconomics of listening can provide a conceptual and experimental framework for understanding effort and fatigue that may have clinical significance.

  17. Perception of Native English Reduced Forms in Adverse Environments by Chinese Undergraduate Students

    ERIC Educational Resources Information Center

    Wong, Simpson W. L.; Tsui, Jenny K. Y.; Chow, Bonnie Wing-Yin; Leung, Vina W. H.; Mok, Peggy; Chung, Kevin Kien-Hoa

    2017-01-01

    Previous research has shown that learners of English-as-a-second-language (ESL) have difficulties in understanding connected speech spoken by native English speakers. Extending from past research limited to quiet listening condition, this study examined the perception of English connected speech presented under five adverse conditions, namely…

  18. Older Adults Expend More Listening Effort than Young Adults Recognizing Speech in Noise

    ERIC Educational Resources Information Center

    Gosselin, Penny Anderson; Gagne, Jean-Pierre

    2011-01-01

    Purpose: Listening in noisy situations is a challenging experience for many older adults. The authors hypothesized that older adults exert more listening effort compared with young adults. Listening effort involves the attention and cognitive resources required to understand speech. The purpose was (a) to quantify the amount of listening effort…

  19. A Computational Model Quantifies the Effect of Anatomical Variability on Velopharyngeal Function

    ERIC Educational Resources Information Center

    Inouye, Joshua M.; Perry, Jamie L.; Lin, Kant Y.; Blemker, Silvia S.

    2015-01-01

    Purpose: This study predicted the effects of velopharyngeal (VP) anatomical parameters on VP function to provide a greater understanding of speech mechanics and aid in the treatment of speech disorders. Method: We created a computational model of the VP mechanism using dimensions obtained from magnetic resonance imaging measurements of 10 healthy…

  20. Surface Electromyography for Speech and Swallowing Systems: Measurement, Analysis, and Interpretation

    ERIC Educational Resources Information Center

    Stepp, Cara E.

    2012-01-01

    Purpose: Applying surface electromyography (sEMG) to the study of voice, speech, and swallowing is becoming increasingly popular. An improved understanding of sEMG and building a consensus as to appropriate methodology will improve future research and clinical applications. Method: An updated review of the theory behind recording sEMG for the…

  1. Learning from Human Cadaveric Prosections: Examining Anxiety in Speech Therapy Students

    ERIC Educational Resources Information Center

    Criado-Álvarez, Juan Jose; González González, Jaime; Romo Barrientos, Carmen; Ubeda-Bañon, Isabel; Saiz-Sanchez, Daniel; Flores-Cuadrado, Alicia; Albertos-Marco, Juan Carlos; Martinez-Marcos, Alino; Mohedano-Moriano, Alicia

    2017-01-01

    Human anatomy education often utilizes the essential practices of cadaver dissection and examination of prosected specimens. However, these exposures to human cadavers and confronting death can be stressful and anxiety-inducing for students. This study aims to understand the attitudes, reactions, fears, and states of anxiety that speech therapy…

  2. Strategies for Coping with Educational and Social Consequences of Chronic Ear Infections in Rural Communities.

    ERIC Educational Resources Information Center

    Pillai, Patrick

    2000-01-01

    Children with chronic ear infections experience a lag time in understanding speech, which inhibits classroom participation and the ability to make friends, and ultimately reduces self-esteem. Difficulty in hearing affects speech and vocabulary development, reading and writing proficiency, and academic performance, and could lead to placement in…

  3. Modulation of Neck Intermuscular Beta Coherence during Voice and Speech Production

    ERIC Educational Resources Information Center

    Stepp, Cara E.; Hillman, Robert E.; Heaton, James T.

    2011-01-01

    Purpose: The purpose of this study was to better understand neck intermuscular beta coherence (15-35 Hz; NIBcoh) in healthy individuals, with respect to modulation by behavioral tasks. Method: Mean NIBcoh was measured using surface electromyography at 2 anterior neck locations in 10 individuals during normal speech, static nonspeech maneuvers,…

  4. Victor of Aveyron: A Reappraisal in Light of More Recent Cases of Feral Speech.

    ERIC Educational Resources Information Center

    Lebrun, Yvan

    1980-01-01

    The language development of children who have experienced malnutrition and varying degrees of speech and sensory deprivation is examined. In language learning, such children attend only when directly addressed, cannot control pitch and intonation and lack an understanding of the relational linguistic material and the sociolinguistic rules of…

  5. Toward an Understanding of Successful Career Placement by Undergraduate Speech Communication Departments.

    ERIC Educational Resources Information Center

    Cahn, Dudley D.

    Noting that placement of graduating speech communication students is an important measure of the success of career programs, and that faculty and department heads who are presently developing, recommending, or supervising career programs may be interested in useful career attitudes and placement activities, a study was conducted to determine what…

  6. Perception of Spectral Contrast by Hearing-Impaired Listeners

    ERIC Educational Resources Information Center

    Dreisbach, Laura E.; Leek, Marjorie R.; Lentz, Jennifer J.

    2005-01-01

    The ability to discriminate the spectral shapes of complex sounds is critical to accurate speech perception. Part of the difficulty experienced by listeners with hearing loss in understanding speech sounds in noise may be related to a smearing of the internal representation of the spectral peaks and valleys because of the loss of sensitivity and…

  7. Responses to Intensity-Shifted Auditory Feedback during Running Speech

    ERIC Educational Resources Information Center

    Patel, Rupal; Reilly, Kevin J.; Archibald, Erin; Cai, Shanqing; Guenther, Frank H.

    2015-01-01

    Purpose: Responses to intensity perturbation during running speech were measured to understand whether prosodic features are controlled in an independent or integrated manner. Method: Nineteen English-speaking healthy adults (age range = 21-41 years) produced 480 sentences in which emphatic stress was placed on either the 1st or 2nd word. One…

  8. Style and Content in the Rhetoric of Early Afro-American Feminists.

    ERIC Educational Resources Information Center

    Campbell, Karlyn Kohrs

    1986-01-01

    Analyzes selected speeches by feminists active in the early Afro-American protest, revealing differences in their rhetoric and that of White feminists of the period. Argues that a simultaneous analysis and synthesis is necessary to understand these differences. Illustrates speeches by Sojourner Truth, Ida B. Wells, and Mary Church Terrell. (JD)

  9. Apraxia of Speech: Concepts and Controversies

    ERIC Educational Resources Information Center

    Ziegler, Wolfram; Aichert, Ingrid; Staiger, Anja

    2012-01-01

    Purpose: This article was written as an editorial to a collection of original articles on apraxia of speech (AOS) in which some of the more recent advancements in the understanding of this syndrome are discussed. It covers controversial issues concerning the theoretical foundations of AOS. Our approach was motivated by a change of perspective on…

  10. Black History Speech

    ERIC Educational Resources Information Center

    Noldon, Carl

    2007-01-01

    The author argues in this speech that one cannot expect students in the school system to know and understand the genius of Black history if the curriculum is Eurocentric, which is a residue of racism. He states that his comments are designed for the enlightenment of those who suffer from a school system that "hypocritically manipulates Black…

  11. Hold Fast Your Dreams: Twenty Commencement Speeches.

    ERIC Educational Resources Information Center

    Boyko, Carrie, Comp.; Colen, Kimberly, Comp.

    This anthology contains 20 commencement addresses delivered by prominent and successful Americans from many different fields of endeavor--all the addresses have in common an understanding of the audience's thoughts and feelings at the important moment of a college graduation. Each speech in the anthology is preceded by a brief biography of the…

  12. Investigating the Psycholinguistic Correlates of Speechreading in Preschool Age Children

    ERIC Educational Resources Information Center

    Davies, Rebecca; Kidd, Evan; Lander, Karen

    2009-01-01

    Background: Previous research has found that newborn infants can match phonetic information in the lips and voice from as young as ten weeks old. There is evidence that access to visual speech is necessary for normal speech development. Although we have an understanding of this early sensitivity, very little research has investigated older…

  13. Free Speech Tensions: Responding to Bias on College and University Campuses

    ERIC Educational Resources Information Center

    Miller, Ryan A.; Guida, Tonia; Smith, Stella; Ferguson, S. Kiersten; Medina, Elizabeth

    2018-01-01

    Despite the increasing development of bias response teams on college and university campuses, little scholarship has examined these teams and, in particular, team leaders' approaches to understanding the role of free speech in responding to bias. Through semi-structured interviews, administrators who served on bias response teams at 19…

  14. Brain Mechanisms Underlying Speech and Language; Conference Proceedings (Princeton, New Jersey, November 9-12, 1965).

    ERIC Educational Resources Information Center

    Darley, Frederic L., Ed.

    The conference proceedings of scientists specializing in language processes and neurophysiological mechanisms are reported to stimulate a cross-over of interest and research in the central brain phenomena (reception, understanding, retention, integration, formulation, and expression) as they relate to speech and language. Eighteen research reports…

  15. Mode of Communication, Perceived Level of Understanding, and Perceived Quality of Life in Youth Who Are Deaf or Hard of Hearing

    PubMed Central

    Kushalnagar, P.; Topolski, T. D.; Schick, B.; Edwards, T. C.; Skalicky, A. M.; Patrick, D. L.

    2011-01-01

    Given the important role of parent–youth communication in adolescent well-being and quality of life, we sought to examine the relationship between specific communication variables and youth perceived quality of life in general and as a deaf or hard-of-hearing (DHH) individual. A convenience sample of 230 youth (mean age = 14.1, standard deviation = 2.2; 24% used sign only, 40% speech only, and 36% sign + speech) was surveyed on communication-related issues, generic and DHH-specific quality of life, and depression symptoms. Higher youth perception of their ability to understand parents’ communication was significantly correlated with perceived quality of life as well as lower reported depressive symptoms and lower perceived stigma. Youth who use speech as their single mode of communication were more likely to report greater stigma associated with being DHH than youth who used both speech and sign. These findings demonstrate the importance of youths’ perceptions of communication with their parents on generic and DHH-specific youth quality of life. PMID:21536686

  16. Talker-specific learning in amnesia: Insight into mechanisms of adaptive speech perception.

    PubMed

    Trude, Alison M; Duff, Melissa C; Brown-Schmidt, Sarah

    2014-05-01

    A hallmark of human speech perception is the ability to comprehend speech quickly and effortlessly despite enormous variability across talkers. However, current theories of speech perception do not make specific claims about the memory mechanisms involved in this process. To examine whether declarative memory is necessary for talker-specific learning, we tested the ability of amnesic patients with severe declarative memory deficits to learn and distinguish the accents of two unfamiliar talkers by monitoring their eye-gaze as they followed spoken instructions. Analyses of the time-course of eye fixations showed that amnesic patients rapidly learned to distinguish these accents and tailored perceptual processes to the voice of each talker. These results demonstrate that declarative memory is not necessary for this ability and point to the involvement of non-declarative memory mechanisms. These results are consistent with findings that other social and accommodative behaviors are preserved in amnesia and contribute to our understanding of the interactions of multiple memory systems in the use and understanding of spoken language. Copyright © 2014 Elsevier Ltd. All rights reserved.

  17. Prosody and Semantics Are Separate but Not Separable Channels in the Perception of Emotional Speech: Test for Rating of Emotions in Speech.

    PubMed

    Ben-David, Boaz M; Multani, Namita; Shakuf, Vered; Rudzicz, Frank; van Lieshout, Pascal H H M

    2016-02-01

    Our aim is to explore the complex interplay of prosody (tone of speech) and semantics (verbal content) in the perception of discrete emotions in speech. We implement a novel tool, the Test for Rating of Emotions in Speech. Eighty native English speakers were presented with spoken sentences made of different combinations of 5 discrete emotions (anger, fear, happiness, sadness, and neutral) presented in prosody and semantics. Listeners were asked to rate the sentence as a whole, integrating both speech channels, or to focus on one channel only (prosody or semantics). We observed supremacy of congruency, failure of selective attention, and prosodic dominance. Supremacy of congruency means that a sentence that presents the same emotion in both speech channels was rated highest; failure of selective attention means that listeners were unable to selectively attend to one channel when instructed; and prosodic dominance means that prosodic information plays a larger role than semantics in processing emotional speech. Emotional prosody and semantics are separate but not separable channels, and it is difficult to perceive one without the influence of the other. Our findings indicate that the Test for Rating of Emotions in Speech can reveal specific aspects in the processing of emotional speech and may in the future prove useful for understanding emotion-processing deficits in individuals with pathologies.

  18. Cued Speech for Enhancing Speech Perception and First Language Development of Children With Cochlear Implants

    PubMed Central

    Leybaert, Jacqueline; LaSasso, Carol J.

    2010-01-01

    Nearly 300 million people worldwide have moderate to profound hearing loss. Hearing impairment, if not adequately managed, has strong socioeconomic and affective impact on individuals. Cochlear implants have become the most effective vehicle for helping profoundly deaf children and adults to understand spoken language, to be sensitive to environmental sounds, and, to some extent, to listen to music. The auditory information delivered by the cochlear implant remains non-optimal for speech perception because it delivers a spectrally degraded signal and lacks some of the fine temporal acoustic structure. In this article, we discuss research revealing the multimodal nature of speech perception in normally-hearing individuals, with important inter-subject variability in the weighting of auditory or visual information. We also discuss how audio-visual training, via Cued Speech, can improve speech perception in cochlear implantees, particularly in noisy contexts. Cued Speech is a system that makes use of visual information from speechreading combined with hand shapes positioned in different places around the face in order to deliver completely unambiguous information about the syllables and the phonemes of spoken language. We support our view that exposure to Cued Speech before or after the implantation could be important in the aural rehabilitation process of cochlear implantees. We describe five lines of research that are converging to support the view that Cued Speech can enhance speech perception in individuals with cochlear implants. PMID:20724357

  19. Auditory cortex activation to natural speech and simulated cochlear implant speech measured with functional near-infrared spectroscopy.

    PubMed

    Pollonini, Luca; Olds, Cristen; Abaya, Homer; Bortfeld, Heather; Beauchamp, Michael S; Oghalai, John S

    2014-03-01

    The primary goal of most cochlear implant procedures is to improve a patient's ability to discriminate speech. To accomplish this, cochlear implants are programmed so as to maximize speech understanding. However, programming a cochlear implant can be an iterative, labor-intensive process that takes place over months. In this study, we sought to determine whether functional near-infrared spectroscopy (fNIRS), a non-invasive neuroimaging method which is safe to use repeatedly and for extended periods of time, can provide an objective measure of whether a subject is hearing normal speech or distorted speech. We used a 140-channel fNIRS system to measure activation within the auditory cortex in 19 normal-hearing subjects while they listened to speech with different levels of intelligibility. Custom software was developed to analyze the data and compute topographic maps from the measured changes in oxyhemoglobin and deoxyhemoglobin concentration. Normal speech reliably evoked the strongest responses within the auditory cortex. Distorted speech produced less region-specific cortical activation. Environmental sounds were used as a control, and they produced the least cortical activation. These data collected using fNIRS are consistent with the fMRI literature and thus demonstrate the feasibility of using this technique to objectively detect differences in cortical responses to speech of different intelligibility. Copyright © 2013 Elsevier B.V. All rights reserved.

  20. An informatics approach to integrating genetic and neurological data in speech and language neuroscience.

    PubMed

    Bohland, Jason W; Myers, Emma M; Kim, Esther

    2014-01-01

    A number of heritable disorders impair the normal development of speech and language processes and occur in large numbers within the general population. While candidate genes and loci have been identified, the gap between genotype and phenotype is vast, limiting current understanding of the biology of normal and disordered processes. This gap exists not only in our scientific knowledge, but also in our research communities, where genetics researchers and speech, language, and cognitive scientists tend to operate independently. Here we describe a web-based, domain-specific, curated database that represents information about genotype-phenotype relations specific to speech and language disorders, as well as neuroimaging results demonstrating focal brain differences in relevant patients versus controls. Bringing these two distinct data types into a common database ( http://neurospeech.org/sldb ) is a first step toward bringing molecular level information into cognitive and computational theories of speech and language function. One bridge between these data types is provided by densely sampled profiles of gene expression in the brain, such as those provided by the Allen Brain Atlases. Here we present results from exploratory analyses of human brain gene expression profiles for genes implicated in speech and language disorders, which are annotated in our database. We then discuss how such datasets can be useful in the development of computational models that bridge levels of analysis, necessary to provide a mechanistic understanding of heritable language disorders. We further describe our general approach to information integration, discuss important caveats and considerations, and offer a specific but speculative example based on genes implicated in stuttering and basal ganglia function in speech motor control.

  1. The contribution of visual information to the perception of speech in noise with and without informative temporal fine structure

    PubMed Central

    Stacey, Paula C.; Kitterick, Pádraig T.; Morris, Saffron D.; Sumner, Christian J.

    2017-01-01

    Understanding what is said in demanding listening situations is assisted greatly by looking at the face of a talker. Previous studies have observed that normal-hearing listeners can benefit from this visual information when a talker's voice is presented in background noise. These benefits have also been observed in quiet listening conditions in cochlear-implant users, whose device does not convey the informative temporal fine structure cues in speech, and when normal-hearing individuals listen to speech processed to remove these informative temporal fine structure cues. The current study (1) characterised the benefits of visual information when listening in background noise; and (2) used sine-wave vocoding to compare the size of the visual benefit when speech is presented with or without informative temporal fine structure. The accuracy with which normal-hearing individuals reported words in spoken sentences was assessed across three experiments. The availability of visual information and informative temporal fine structure cues was varied within and across the experiments. The results showed that visual benefit was observed using open- and closed-set tests of speech perception. The size of the benefit increased when informative temporal fine structure cues were removed. This finding suggests that visual information may play an important role in the ability of cochlear-implant users to understand speech in many everyday situations. Models of audio-visual integration were able to account for the additional benefit of visual information when speech was degraded and suggested that auditory and visual information was being integrated in a similar way in all conditions. The modelling results were consistent with the notion that audio-visual benefit is derived from the optimal combination of auditory and visual sensory cues. PMID:27085797

  2. Evaluation of Speech Recognition of Cochlear Implant Recipients Using Adaptive, Digital Remote Microphone Technology and a Speech Enhancement Sound Processing Algorithm.

    PubMed

    Wolfe, Jace; Morais, Mila; Schafer, Erin; Agrawal, Smita; Koch, Dawn

    2015-05-01

    Cochlear implant recipients often experience difficulty with understanding speech in the presence of noise. Cochlear implant manufacturers have developed sound processing algorithms designed to improve speech recognition in noise, and research has shown these technologies to be effective. Remote microphone technology utilizing adaptive, digital wireless radio transmission has also been shown to provide significant improvement in speech recognition in noise. There are no studies examining the potential improvement in speech recognition in noise when these two technologies are used simultaneously. The goal of this study was to evaluate the potential benefits and limitations associated with the simultaneous use of a sound processing algorithm designed to improve performance in noise (Advanced Bionics ClearVoice) and a remote microphone system that incorporates adaptive, digital wireless radio transmission (Phonak Roger). A two-by-two repeated-measures design was used to examine performance differences obtained without these technologies compared to the use of each technology separately as well as the simultaneous use of both technologies. Eleven Advanced Bionics (AB) cochlear implant recipients, ages 11 to 68 years, participated. AzBio sentence recognition was measured in quiet and in the presence of classroom noise ranging in level from 50 to 80 dBA in 5-dB steps. Performance was evaluated in four conditions: (1) No ClearVoice and no Roger, (2) ClearVoice enabled without the use of Roger, (3) ClearVoice disabled with Roger enabled, and (4) simultaneous use of ClearVoice and Roger. Speech recognition in quiet was better than speech recognition in noise for all conditions. Use of ClearVoice and Roger each provided significant improvement in speech recognition in noise. The best performance in noise was obtained with the simultaneous use of ClearVoice and Roger. ClearVoice and Roger technology each improves speech recognition in noise, particularly when used at the same time. Because ClearVoice does not degrade performance in quiet settings, clinicians should consider recommending ClearVoice for routine, full-time use for AB implant recipients. Roger should be used in all instances in which remote microphone technology may assist the user in understanding speech in the presence of noise. American Academy of Audiology.

  3. The normalities and abnormalities associated with speech in psychometrically-defined schizotypy.

    PubMed

    Cohen, Alex S; Auster, Tracey L; McGovern, Jessica E; MacAulay, Rebecca K

    2014-12-01

    Speech deficits are thought to be an important feature of schizotypy, defined as the personality organization reflecting a putative liability for schizophrenia. There is reason to suspect that these deficits manifest as a function of limited cognitive resources. To evaluate this idea, we examined speech from individuals with psychometrically-defined schizotypy during a low cognitively-demanding task versus a relatively high cognitively-demanding task. A range of objective, computer-based measures of speech was employed, tapping speech production (silence, number and length of pauses, number and length of utterances), speech variability (global and local intonation and emphasis), and speech content (word fillers, idea density). Data for control (n=37) and schizotypy (n=39) groups were examined. Results did not confirm our hypotheses. While the cognitive-load task reduced speech expressivity for subjects as a group for most variables, the schizotypy group was not more pathological in speech characteristics compared to the control group. Interestingly, some aspects of speech in schizotypal versus control subjects were healthier under high cognitive load. Moreover, schizotypal subjects performed better, at a trend level, than controls on the cognitively demanding task. These findings hold important implications for our understanding of the neurocognitive architecture associated with the schizophrenia spectrum. Of particular note concerns the apparent mismatch between self-reported schizotypal traits and objective performance, and the resiliency of speech under cognitive stress in persons with high levels of schizotypy. Copyright © 2014 Elsevier B.V. All rights reserved.

  4. GRIN2A: an aptly named gene for speech dysfunction.

    PubMed

    Turner, Samantha J; Mayes, Angela K; Verhoeven, Andrea; Mandelstam, Simone A; Morgan, Angela T; Scheffer, Ingrid E

    2015-02-10

    To delineate the specific speech deficits in individuals with epilepsy-aphasia syndromes associated with mutations in the glutamate receptor subunit gene GRIN2A. We analyzed the speech phenotype associated with GRIN2A mutations in 11 individuals, aged 16 to 64 years, from 3 families. Standardized clinical speech assessments and perceptual analyses of conversational samples were conducted. Individuals showed a characteristic phenotype of dysarthria and dyspraxia with lifelong impact on speech intelligibility in some. Speech was typified by imprecise articulation (11/11, 100%), impaired pitch (monopitch 10/11, 91%) and prosody (stress errors 7/11, 64%), and hypernasality (7/11, 64%). Oral motor impairments and poor performance on maximum vowel duration (8/11, 73%) and repetition of monosyllables (10/11, 91%) and trisyllables (7/11, 64%) supported conversational speech findings. The speech phenotype was present in one individual who did not have seizures. Distinctive features of dysarthria and dyspraxia are found in individuals with GRIN2A mutations, often in the setting of epilepsy-aphasia syndromes; dysarthria has not been previously recognized in these disorders. Of note, the speech phenotype may occur in the absence of a seizure disorder, reinforcing an important role for GRIN2A in motor speech function. Our findings highlight the need for precise clinical speech assessment and intervention in this group. By understanding the mechanisms involved in GRIN2A disorders, targeted therapy may be designed to improve chronic lifelong deficits in intelligibility. © 2015 American Academy of Neurology.

  5. Speech Comprehension Difficulties in Chronic Tinnitus and Its Relation to Hyperacusis

    PubMed Central

    Vielsmeier, Veronika; Kreuzer, Peter M.; Haubner, Frank; Steffens, Thomas; Semmler, Philipp R. O.; Kleinjung, Tobias; Schlee, Winfried; Langguth, Berthold; Schecklmann, Martin

    2016-01-01

    Objective: Many tinnitus patients complain about difficulties regarding speech comprehension. In spite of the high clinical relevance little is known about underlying mechanisms and predisposing factors. Here, we performed an exploratory investigation in a large sample of tinnitus patients to (1) estimate the prevalence of speech comprehension difficulties among tinnitus patients, to (2) compare subjective reports of speech comprehension difficulties with behavioral measurements in a standardized speech comprehension test and to (3) explore underlying mechanisms by analyzing the relationship between speech comprehension difficulties and peripheral hearing function (pure tone audiogram), as well as with co-morbid hyperacusis as a central auditory processing disorder. Subjects and Methods: Speech comprehension was assessed in 361 tinnitus patients presenting between 07/2012 and 08/2014 at the Interdisciplinary Tinnitus Clinic at the University of Regensburg. The assessment included standard audiological assessments (pure tone audiometry, tinnitus pitch, and loudness matching), the Goettingen sentence test (in quiet) for speech audiometric evaluation, two questions about hyperacusis, and two questions about speech comprehension in quiet and noisy environments (“How would you rate your ability to understand speech?”; “How would you rate your ability to follow a conversation when multiple people are speaking simultaneously?”). Results: Subjectively-reported speech comprehension deficits are frequent among tinnitus patients, especially in noisy environments (cocktail party situation). 74.2% of all investigated patients showed disturbed speech comprehension (indicated by values above 21.5 dB SPL in the Goettingen sentence test). Subjective speech comprehension complaints (both for general and in noisy environment) were correlated with hearing level and with audiologically-assessed speech comprehension ability. In contrast, co-morbid hyperacusis was only correlated with speech comprehension difficulties in noisy environments, but not with speech comprehension difficulties in general. Conclusion: Speech comprehension deficits are frequent among tinnitus patients. Whereas speech comprehension deficits in quiet environments are primarily due to peripheral hearing loss, speech comprehension deficits in noisy environments are related to both peripheral hearing loss and dysfunctional central auditory processing. Disturbed speech comprehension in noisy environments might be modulated by a central inhibitory deficit. In addition, attentional and cognitive aspects may play a role. PMID:28018209

  7. Psychoacoustic cues to emotion in speech prosody and music.

    PubMed

    Coutinho, Eduardo; Dibben, Nicola

    2013-01-01

    There is strong evidence of shared acoustic profiles common to the expression of emotions in music and speech, yet relatively limited understanding of the specific psychoacoustic features involved. This study combined a controlled experiment and computational modelling to investigate the perceptual codes associated with the expression of emotion in the acoustic domain. The empirical stage of the study provided continuous human ratings of emotions perceived in excerpts of film music and natural speech samples. The computational stage created a computer model that retrieves the relevant information from the acoustic stimuli and makes predictions about the emotional expressiveness of speech and music close to the responses of human subjects. We show that a significant part of the listeners' second-by-second reported emotions to music and speech prosody can be predicted from a set of seven psychoacoustic features: loudness, tempo/speech rate, melody/prosody contour, spectral centroid, spectral flux, sharpness, and roughness. The implications of these results are discussed in the context of cross-modal similarities in the communication of emotion in the acoustic domain.
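    Two of the seven psychoacoustic features named in the abstract, spectral centroid and spectral flux, have standard signal-processing definitions. A minimal sketch, assuming a precomputed magnitude spectrogram; the function names are illustrative and not taken from the study's model:

    ```python
    import numpy as np

    def spectral_centroid(mag, freqs):
        """Magnitude-weighted mean frequency of each frame (a 'brightness' cue).
        mag: (n_frames, n_bins) magnitude spectrogram; freqs: (n_bins,) bin centers."""
        return (mag * freqs).sum(axis=1) / np.maximum(mag.sum(axis=1), 1e-12)

    def spectral_flux(mag):
        """Frame-to-frame spectral change: L2 norm of the magnitude difference,
        one value per consecutive frame pair."""
        return np.linalg.norm(np.diff(mag, axis=0), axis=1)
    ```

    On a toy two-frame spectrogram where all energy jumps from a 100 Hz bin to a 200 Hz bin, the centroid moves from 100 to 200 Hz and the single flux value is the norm of that jump.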

  8. The brain’s conversation with itself: neural substrates of dialogic inner speech

    PubMed Central

    Weis, Susanne; McCarthy-Jones, Simon; Moseley, Peter; Smailes, David; Fernyhough, Charles

    2016-01-01

    Inner speech has been implicated in important aspects of normal and atypical cognition, including the development of auditory hallucinations. Studies to date have focused on covert speech elicited by simple word or sentence repetition, while ignoring richer and arguably more psychologically significant varieties of inner speech. This study compared neural activation for inner speech involving conversations (‘dialogic inner speech’) with single-speaker scenarios (‘monologic inner speech’). Inner speech-related activation differences were then compared with activations relating to Theory-of-Mind (ToM) reasoning and visual perspective-taking in a conjunction design. Generation of dialogic (compared with monologic) scenarios was associated with a widespread bilateral network including left and right superior temporal gyri, precuneus, posterior cingulate and left inferior and medial frontal gyri. Activation associated with dialogic scenarios and ToM reasoning overlapped in areas of right posterior temporal cortex previously linked to mental state representation. Implications for understanding verbal cognition in typical and atypical populations are discussed. PMID:26197805

  9. Iconic Gestures Facilitate Discourse Comprehension in Individuals With Superior Immediate Memory for Body Configurations.

    PubMed

    Wu, Ying Choon; Coulson, Seana

    2015-11-01

    To understand a speaker's gestures, people may draw on kinesthetic working memory (KWM), a system for temporarily remembering body movements. The present study explored whether sensitivity to gesture meaning was related to differences in KWM capacity. KWM was evaluated through sequences of novel movements that participants viewed and reproduced with their own bodies. Gesture sensitivity was assessed through a priming paradigm. Participants judged whether multimodal utterances containing congruent, incongruent, or no gestures were related to subsequent picture probes depicting the referents of those utterances. Individuals with low KWM were primarily inhibited by incongruent speech-gesture primes, whereas those with high KWM showed facilitation; that is, they were able to identify picture probes more quickly when preceded by congruent speech and gestures than by speech alone. Group differences were most apparent for discourse with weakly congruent speech and gestures. Overall, speech-gesture congruency effects were positively correlated with KWM abilities, which may help listeners match spatial properties of gestures to concepts evoked by speech. © The Author(s) 2015.

  10. Musical experience strengthens the neural representation of sounds important for communication in middle-aged adults

    PubMed Central

    Parbery-Clark, Alexandra; Anderson, Samira; Hittner, Emily; Kraus, Nina

    2012-01-01

    Older adults frequently complain that while they can hear a person talking, they cannot understand what is being said; this difficulty is exacerbated by background noise. Peripheral hearing loss cannot fully account for this age-related decline in speech-in-noise ability, as declines in central processing also contribute to this problem. Given that musicians have enhanced speech-in-noise perception, we aimed to define the effects of musical experience on subcortical responses to speech and speech-in-noise perception in middle-aged adults. Results reveal that musicians have enhanced neural encoding of speech in quiet and noisy settings. Enhancements include faster neural response timing, higher neural response consistency, more robust encoding of speech harmonics, and greater neural precision. Taken together, we suggest that musical experience provides perceptual benefits in an aging population by strengthening the underlying neural pathways necessary for the accurate representation of important temporal and spectral features of sound. PMID:23189051

  11. The evolution of speech: a comparative review.

    PubMed

    Fitch

    2000-07-01

    The evolution of speech can be studied independently of the evolution of language, with the advantage that most aspects of speech acoustics, physiology and neural control are shared with animals, and thus open to empirical investigation. At least two changes were necessary prerequisites for modern human speech abilities: (1) modification of vocal tract morphology, and (2) development of vocal imitative ability. Despite an extensive literature, attempts to pinpoint the timing of these changes using fossil data have proven inconclusive. However, recent comparative data from nonhuman primates have shed light on the ancestral use of formants (a crucial cue in human speech) to identify individuals and gauge body size. Likewise, comparative analysis of the diverse vertebrates that have evolved vocal imitation (humans, cetaceans, seals and birds) provides several distinct, testable hypotheses about the adaptive function of vocal mimicry. These developments suggest that, for understanding the evolution of speech, comparative analysis of living species provides a viable alternative to fossil data. However, the neural basis for vocal mimicry and for mimesis in general remains unknown.

  12. Processing Mechanisms in Hearing-Impaired Listeners: Evidence from Reaction Times and Sentence Interpretation.

    PubMed

    Carroll, Rebecca; Uslar, Verena; Brand, Thomas; Ruigendijk, Esther

    The authors aimed to determine whether hearing impairment affects sentence comprehension beyond phoneme or word recognition (i.e., on the sentence level), and to distinguish grammatically induced processing difficulties in structurally complex sentences from perceptual difficulties associated with listening to degraded speech. Effects of hearing impairment or speech in noise were expected to reflect hearer-specific speech recognition difficulties. Any additional processing time caused by the sustained perceptual challenges across the sentence may either be independent of or interact with top-down processing mechanisms associated with grammatical sentence structure. Forty-nine participants listened to canonical subject-initial or noncanonical object-initial sentences that were presented either in quiet or in noise. Twenty-four participants had mild-to-moderate hearing impairment and received hearing-loss-specific amplification. Twenty-five participants were age-matched peers with normal hearing status. Reaction times were measured on-line at syntactically critical processing points as well as two control points to capture differences in processing mechanisms. An off-line comprehension task served as an additional indicator of sentence (mis)interpretation, and enforced syntactic processing. The authors found general effects of hearing impairment and speech in noise that negatively affected perceptual processing, and an effect of word order, where complex grammar locally caused processing difficulties for the noncanonical sentence structure. Listeners with hearing impairment were hardly affected by noise at the beginning of the sentence, but were affected markedly toward the end of the sentence, indicating a sustained perceptual effect of speech recognition. Comprehension of sentences with noncanonical word order was negatively affected by degraded signals even after sentence presentation. 
Hearing impairment adds perceptual processing load during sentence processing, but affects grammatical processing beyond the word level to the same degree as in normal hearing, with minor differences in processing mechanisms. The data contribute to our understanding of individual differences in speech perception and language understanding. The authors interpret their results within the Ease of Language Understanding model.

  13. A Model of Auditory-Cognitive Processing and Relevance to Clinical Applicability.

    PubMed

    Edwards, Brent

    2016-01-01

    Hearing loss and cognitive function interact in both a bottom-up and top-down relationship. Listening effort is tied to these interactions, and models have been developed to explain their relationship. The Ease of Language Understanding model in particular has gained considerable attention in its explanation of the effect of signal distortion on speech understanding. Signal distortion can also affect auditory scene analysis ability, however, resulting in a distorted auditory scene that can affect cognitive function, listening effort, and the allocation of cognitive resources. These effects are explained through an addition to the Ease of Language Understanding model. This model can be generalized to apply to all sounds, not only speech, representing the increased effort required for auditory environmental awareness and other nonspeech auditory tasks. While the authors have measures of speech understanding and cognitive load to quantify these interactions, they are lacking measures of the effect of hearing aid technology on auditory scene analysis ability and how effort and attention varies with the quality of an auditory scene. Additionally, the clinical relevance of hearing aid technology on cognitive function and the application of cognitive measures in hearing aid fittings will be limited until effectiveness is demonstrated in real-world situations.

  14. Content analysis of the professional journal of the College of Speech Therapists II: coming of age and growing maturity, 1946-65.

    PubMed

    Stansfield, Jois; Armstrong, Linda

    2016-07-01

    Following a content analysis of the first 10 years of the UK professional journal Speech, this study was conducted to survey the published work of the speech (and language) therapy profession in the 20 years following the unification of two separate professional bodies into the College of Speech Therapists, and to better understand the development of the profession in order to support an online history of the speech and language therapy profession in the UK. The 40 issues of the professional journal of the College of Speech Therapists published between 1946 and 1965 (Speech and later Speech Pathology and Therapy) were examined using content analysis and the content compared with that of the same journal as it appeared from 1935 to the end of the Second World War (1945). Many aspects of the journal and its authored papers were retained from the earlier years, for example, the range of authors' professions, their location mainly in the UK, their number of contributions and the length of papers. Changes and developments included the balance of original to republished papers, the description and discussion of new professional issues, and an extended range of client groups/disorders. The journal and its articles reflect the growing maturity of the newly unified profession of speech therapy and give an indication both of the expanding depth of knowledge available to speech therapists and of the rapidly increasing breadth of their work over this period. © 2016 Royal College of Speech and Language Therapists.

  15. The eye as a window to the listening brain: neural correlates of pupil size as a measure of cognitive listening load.

    PubMed

    Zekveld, Adriana A; Heslenfeld, Dirk J; Johnsrude, Ingrid S; Versfeld, Niek J; Kramer, Sophia E

    2014-11-01

    An important aspect of hearing is the degree to which listeners have to deploy effort to understand speech. One promising measure of listening effort is task-evoked pupil dilation. Here, we use functional magnetic resonance imaging (fMRI) to identify the neural correlates of pupil dilation during comprehension of degraded spoken sentences in 17 normal-hearing listeners. Subjects listened to sentences degraded in three different ways: the target female speech was masked by fluctuating noise, by speech from a single male speaker, or the target speech was noise-vocoded. The degree of degradation was individually adapted such that 50% or 84% of the sentences were intelligible. Control conditions included clear speech in quiet, and silent trials. The peak pupil dilation was larger for the 50% compared to the 84% intelligibility condition, and largest for speech masked by the single-talker masker, followed by speech masked by fluctuating noise, and smallest for noise-vocoded speech. Activation in the bilateral superior temporal gyrus (STG) showed the same pattern, with most extensive activation for speech masked by the single-talker masker. Larger peak pupil dilation was associated with more activation in the bilateral STG, bilateral ventral and dorsal anterior cingulate cortex and several frontal brain areas. A subset of the temporal region sensitive to pupil dilation was also sensitive to speech intelligibility and degradation type. These results show that pupil dilation during speech perception in challenging conditions reflects both auditory and cognitive processes that are recruited to cope with degraded speech and the need to segregate target speech from interfering sounds. Copyright © 2014 Elsevier Inc. All rights reserved.

  16. Improving speech perception in noise for children with cochlear implants.

    PubMed

    Gifford, René H; Olund, Amy P; Dejong, Melissa

    2011-10-01

    Current cochlear implant recipients are achieving increasingly higher levels of speech recognition; however, the presence of background noise continues to significantly degrade speech understanding for even the best performers. Newer generation Nucleus cochlear implant sound processors can be programmed with SmartSound strategies that have been shown to improve speech understanding in noise for adult cochlear implant recipients. The applicability of these strategies for use in children, however, is neither fully understood nor widely accepted. To assess speech perception for pediatric cochlear implant recipients in the presence of a realistic restaurant simulation generated by an eight-loudspeaker (R-SPACE™) array in order to determine whether Nucleus sound processor SmartSound strategies yield improved sentence recognition in noise for children who learn language through the implant. Single subject, repeated measures design. Twenty-two experimental subjects with cochlear implants (mean age 11.1 yr) and 25 control subjects with normal hearing (mean age 9.6 yr) participated in this prospective study. Speech reception thresholds (SRT) in semidiffuse restaurant noise originating from an eight-loudspeaker array were assessed with the experimental subjects' everyday program incorporating Adaptive Dynamic Range Optimization (ADRO) as well as with the addition of Autosensitivity control (ASC). Adaptive SRTs with the Hearing In Noise Test (HINT) sentences were obtained for all 22 experimental subjects, and performance, in percent correct, was assessed at a fixed +6 dB SNR (signal-to-noise ratio) for a six-subject subset. Statistical analysis using a repeated-measures analysis of variance (ANOVA) evaluated the effects of the SmartSound setting on the SRT in noise.
The primary findings mirrored those reported previously with adult cochlear implant recipients in that the addition of ASC to ADRO significantly improved speech recognition in noise for pediatric cochlear implant recipients. The mean degree of improvement in the SRT with the addition of ASC to ADRO was 3.5 dB for a mean SRT of 10.9 dB SNR. Thus, despite the fact that these children have acquired auditory/oral speech and language through the use of their cochlear implant(s) equipped with ADRO, the addition of ASC significantly improved their ability to recognize speech in high levels of diffuse background noise. The mean SRT for the control subjects with normal hearing was 0.0 dB SNR. Given that the mean SRT for the experimental group was 10.9 dB SNR, despite the improvements in performance observed with the addition of ASC, cochlear implants still do not completely overcome the speech perception deficit encountered in noisy environments accompanying the diagnosis of severe-to-profound hearing loss. SmartSound strategies currently available in latest generation Nucleus cochlear implant sound processors are able to significantly improve speech understanding in a realistic, semidiffuse noise for pediatric cochlear implant recipients. Despite the reluctance of pediatric audiologists to utilize SmartSound settings for regular use, the results of the current study support the addition of ASC to ADRO for everyday listening environments to improve speech perception in a child's typical everyday program. American Academy of Audiology.

  17. Understanding environmental sounds in sentence context.

    PubMed

    Uddin, Sophia; Heald, Shannon L M; Van Hedger, Stephen C; Klos, Serena; Nusbaum, Howard C

    2018-03-01

    There is debate about how individuals use context to successfully predict and recognize words. One view argues that context supports neural predictions that make use of the speech motor system, whereas other views argue for a sensory or conceptual level of prediction. While environmental sounds can convey clear referential meaning, they are not linguistic signals, and are thus neither produced with the vocal tract nor typically encountered in sentence context. We compared the effect of spoken sentence context on recognition and comprehension of spoken words versus nonspeech, environmental sounds. In Experiment 1, sentence context decreased the amount of signal needed for recognition of spoken words and environmental sounds in similar fashion. In Experiment 2, listeners judged sentence meaning in both high and low contextually constraining sentence frames, when the final word was present or replaced with a matching environmental sound. Results showed that sentence constraint affected decision time similarly for speech and nonspeech, such that high constraint sentences (i.e., frame plus completion) were processed faster than low constraint sentences for speech and nonspeech. Linguistic context facilitates the recognition and understanding of nonspeech sounds in much the same way as for spoken words. This argues against a simple form of a speech-motor explanation of predictive coding in spoken language understanding, and suggests support for conceptual-level predictions. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. V2S: Voice to Sign Language Translation System for Malaysian Deaf People

    NASA Astrophysics Data System (ADS)

    Mean Foong, Oi; Low, Tang Jung; La, Wai Wan

    The process of learning and understanding sign language can be cumbersome, and this paper therefore proposes a voice (English language) to sign language translation system using speech and image processing techniques. Speech processing, which includes speech recognition, is the study of recognizing the words being spoken regardless of who the speaker is. This project uses template-based recognition as its main approach: the V2S system is first trained with speech patterns based on a generic spectral parameter set, and these parameter sets are stored as templates in a database. The system performs recognition by matching the parameter set of the input speech against the stored templates, and finally displays the sign language in video format. Empirical results show that the system has an 80.3% recognition rate.
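    Template matching of the kind described, comparing an input utterance's sequence of spectral parameter vectors against stored word templates, is classically done with dynamic time warping to absorb differences in speaking rate. A hedged sketch under that assumption, not the actual V2S implementation; all names are illustrative:

    ```python
    import numpy as np

    def dtw_distance(a, b):
        """Dynamic time warping distance between two feature sequences,
        where each row is one frame's spectral parameter vector."""
        n, m = len(a), len(b)
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = np.linalg.norm(a[i - 1] - b[j - 1])  # frame-level distance
                # extend the cheapest of the three allowed warping-path moves
                cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
        return cost[n, m]

    def recognize(features, templates):
        """Return the word label whose stored template is closest to the input."""
        return min(templates, key=lambda word: dtw_distance(features, templates[word]))
    ```

    The recognized word would then index into the database of sign-language video clips for playback.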

  19. 'That doesn't translate': the role of evidence-based practice in disempowering speech pathologists in acute aphasia management.

    PubMed

    Foster, Abby; Worrall, Linda; Rose, Miranda; O'Halloran, Robyn

    2015-07-01

    An evidence-practice gap has been identified in current acute aphasia management practice, with the provision of services to people with aphasia in the acute hospital widely considered in the literature to be inconsistent with best-practice recommendations. The reasons for this evidence-practice gap are unclear; however, speech pathologists practising in this setting have articulated a sense of dissonance regarding their limited service provision to this population. A clearer understanding of why this evidence-practice gap exists is essential in order to support and promote evidence-based approaches to the care of people with aphasia in acute care settings. To provide an understanding of speech pathologists' conceptualization of evidence-based practice for acute post-stroke aphasia, and its implementation. This study adopted a phenomenological approach, underpinned by a social constructivist paradigm. In-depth interviews were conducted with 14 Australian speech pathologists, recruited using a purposive sampling technique. An inductive thematic analysis of the data was undertaken. A single, overarching theme emerged from the data. Speech pathologists demonstrated a sense of disempowerment as a result of their relationship with evidence-based practice for acute aphasia management. Three subthemes contributed to this theme. The first described a restricted conceptualization of evidence-based practice. The second revealed speech pathologists' strained relationships with the research literature. The third elucidated a sense of professional unease over their perceived inability to enact evidence-based clinical recommendations, despite their desire to do so. Speech pathologists identified a current knowledge-practice gap in their management of aphasia in acute hospital settings. Speech pathologists place significant emphasis on the research evidence; however, their engagement with the research is limited, in part because it is perceived to lack clinical utility. 
A sense of professional dissonance arises from the conflict between a desire to provide best practice and the perceived barriers to implementing evidence-based recommendations clinically, resulting in evidence-based practice becoming a disempowering concept for some. © 2015 Royal College of Speech and Language Therapists.

  20. Cochlear implants: a remarkable past and a brilliant future

    PubMed Central

    Wilson, Blake S.; Dorman, Michael F.

    2013-01-01

    The aims of this paper are to (i) provide a brief history of cochlear implants; (ii) present a status report on the current state of implant engineering and the levels of speech understanding enabled by that engineering; (iii) describe limitations of current signal processing strategies and (iv) suggest new directions for research. With current technology the “average” implant patient, when listening to predictable conversations in quiet, is able to communicate with relative ease. However, in an environment typical of a workplace the average patient has a great deal of difficulty. Patients who are “above average” in terms of speech understanding, can achieve 100% correct scores on the most difficult tests of speech understanding in quiet but also have significant difficulty when signals are presented in noise. The major factors in these outcomes appear to be (i) a loss of low-frequency, fine structure information possibly due to the envelope extraction algorithms common to cochlear implant signal processing; (ii) a limitation in the number of effective channels of stimulation due to overlap in electric fields from electrodes, and (iii) central processing deficits, especially for patients with poor speech understanding. Two recent developments, bilateral implants and combined electric and acoustic stimulation, have promise to remediate some of the difficulties experienced by patients in noise and to reinstate low-frequency fine structure information. If other possibilities are realized, e.g., electrodes that emit drugs to inhibit cell death following trauma and to induce the growth of neurites toward electrodes, then the future is very bright indeed. PMID:18616994
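    The envelope-extraction limitation mentioned above refers to strategies that keep each channel's slow amplitude contour and discard the temporal fine structure. An illustrative one-channel sketch (a deliberate simplification, not an actual sound-processor implementation; the smoothing window length is an assumption):

    ```python
    import numpy as np

    def channel_envelope(x, fs, win_ms=8.0):
        """Extract the amplitude envelope of one band-passed channel:
        half-wave rectify, then smooth with a moving-average low-pass window.
        The carrier's rapid oscillation (fine structure) is discarded."""
        rect = np.maximum(x, 0.0)                      # half-wave rectification
        n = max(1, int(fs * win_ms / 1000.0))          # window length in samples
        kernel = np.ones(n) / n
        return np.convolve(rect, kernel, mode="same")  # crude low-pass smoothing
    ```

    For a 1 kHz sinusoid the smoothed output hovers near the mean of the rectified carrier (about 1/π of its peak) rather than tracking the oscillation itself, which is exactly the information loss the paper discusses.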

  1. Children Discover the Spectral Skeletons in Their Native Language before the Amplitude Envelopes

    ERIC Educational Resources Information Center

    Nittrouer, Susan; Lowenstein, Joanna H.; Packer, Robert R.

    2009-01-01

    Much of speech perception research has focused on brief spectro-temporal properties in the signal, but some studies have shown that adults can recover linguistic form when those properties are absent. In this experiment, 7-year-old English-speaking children demonstrated adultlike abilities to understand speech when only sine waves (SWs)…

  2. The Affordance of Speech Recognition Technology for EFL Learning in an Elementary School Setting

    ERIC Educational Resources Information Center

    Liaw, Meei-Ling

    2014-01-01

    This study examined the use of speech recognition (SR) technology to support a group of elementary school children's learning of English as a foreign language (EFL). SR technology has been used in various language learning contexts. Its application to EFL teaching and learning is still relatively recent, but a solid understanding of its…

  3. Understanding Why Speech-Language Pathologists Rarely Pursue a PhD in Communication Sciences and Disorders

    ERIC Educational Resources Information Center

    Myotte, Theodore; Hutchins, Tiffany L.; Cannizzaro, Michael S.; Belin, Gayle

    2011-01-01

    Masters-level speech-language pathologists in communication sciences and disorders (n = 122) completed a survey soliciting their reasons for not pursuing doctoral study. Factor analysis revealed a four-factor solution including one reflecting a lack of interest in doctoral study (Factor 2) and one reflecting practical financial concerns (Factor…

  4. Educational Audiologists: Their Access, Benefit, and Collaborative Assistance to Speech-Language Pathologists in Schools

    ERIC Educational Resources Information Center

    Richburg, Cynthia McCormick; Knickelbein, Becky A.

    2011-01-01

    Purpose: The main goals of this study were to determine if school-based speech-language pathologists (SLPs) have access to the services of an audiologist and if those SLPs felt they obtained benefit from the audiologist's services. Additional goals included gathering information about SLPs' (a) understanding of basic audiological concepts typical…

  5. FY 1992-1993 RDT&E Descriptive Summaries: DARPA

    DTIC Science & Technology

    1991-02-01

    combining natural language and user workflow model information. * Determine effectiveness of auditory models as preprocessors for robust speech...for indexing and retrieving design knowledge. * Evaluate ability of message understanding systems to extract crisis-situation data from news wires...energy effects, underwater vehicles, neutrino detection, speech, tailored nuclear weapons, hypervelocity, nanosecond timing, and MAD/RPV. FY 1991 Planned

  6. Detecting and Understanding the Impact of Cognitive and Interpersonal Conflict in Computer Supported Collaborative Learning Environments

    ERIC Educational Resources Information Center

    Prata, David Nadler; Baker, Ryan S. J. d.; Costa, Evandro d. B.; Rose, Carolyn P.; Cui, Yue; de Carvalho, Adriana M. J. B.

    2009-01-01

    This paper presents a model which can automatically detect a variety of student speech acts as students collaborate within a computer supported collaborative learning environment. In addition, an analysis is presented which gives substantial insight as to how students' learning is associated with students' speech acts, knowledge that will…

  7. The Effect of Temporal Gap Identification on Speech Perception by Users of Cochlear Implants

    ERIC Educational Resources Information Center

    Sagi, Elad; Kaiser, Adam R.; Meyer, Ted A.; Svirsky, Mario A.

    2009-01-01

    Purpose: This study examined the ability of listeners using cochlear implants (CIs) and listeners with normal hearing (NH) to identify silent gaps of different duration and the relation of this ability to speech understanding in CI users. Method: Sixteen NH adults and 11 postlingually deafened adults with CIs identified synthetic vowel-like…

  8. Temporal Envelope Changes of Compression and Speech Rate: Combined Effects on Recognition for Older Adults

    ERIC Educational Resources Information Center

    Jenstad, Lorienne M.; Souza, Pamela E.

    2007-01-01

    Purpose: When understanding speech in complex listening situations, older adults with hearing loss face the double challenge of cochlear hearing loss and deficits of the aging auditory system. Wide-dynamic range compression (WDRC) is used in hearing aids as remediation for the loss of audibility associated with hearing loss. WDRC processing has…

  9. Crisis Speeches Delivered during World War II: A Historical and Rhetorical Perspective

    ERIC Educational Resources Information Center

    Ramos, Tomas E.

    2010-01-01

    Rhetorical analyses of speeches made by United States presidents and world leaders abound, particularly studies about addresses to nations in times of crisis. These are important because what presidents say amidst uncertainty and chaos defines their leadership in the eyes of the public. But with new forms of crisis rhetoric, our understanding of…

  10. Language Policy, Tacit Knowledge, and Institutional Learning: The Case of the Swiss Public Service Broadcaster SRG SSR

    ERIC Educational Resources Information Center

    Perrin, Daniel

    2011-01-01

    "Promoting public understanding" is what the programming mandate asks the Swiss public broadcasting company SRG SSR to do. From a sociolinguistic perspective, this means linking speech communities with other speech communities, both between and within the German-, French-, Italian-, and Romansh-speaking parts of Switzerland. In the…

  11. Well-Being and Resilience in Children with Speech and Language Disorders

    ERIC Educational Resources Information Center

    Lyons, Rena; Roulstone, Sue

    2018-01-01

    Purpose: Children with speech and language disorders are at risk in relation to psychological and social well-being. The aim of this study was to understand the experiences of these children from their own perspectives focusing on risks to their well-being and protective indicators that may promote resilience. Method: Eleven 9- to 12-year-old…

  12. Understanding Susan Sontag's Critique of Communism and the Democratic Left: Confession? Conversion? Conundrum?

    ERIC Educational Resources Information Center

    Page, Judy Lynn

    Provoking violent controversy, Susan Sontag's speech, "The Lesson of Poland," is an example of subversive rhetoric. Delivered at a February 6, 1982, show of support for the recently oppressed Polish people, Sontag's speech, like other modernist writing, did not seek a consensus with the audience, but challenged its whole scheme of…

  13. Psychometric Characteristics of Single-Word Tests of Children's Speech Sound Production

    ERIC Educational Resources Information Center

    Flipsen, Peter, Jr.; Ogiela, Diane A.

    2015-01-01

Purpose: Our understanding of test construction has improved since the now-classic review by McCauley and Swisher (1984). The current review article examines the psychometric characteristics of current single-word tests of speech sound production in an attempt to determine whether our tests have improved since then. It also provides a resource…

  14. How Can Comorbidity with Attention-Deficit/Hyperactivity Disorder Aid Understanding of Language and Speech Disorders?

    ERIC Educational Resources Information Center

    Tomblin, J. Bruce; Mueller, Kathyrn L.

    2012-01-01

    This article provides a background for the topic of comorbidity of attention-deficit/hyperactivity disorder and spoken and written language and speech disorders that extends through this issue of "Topics in Language Disorders." Comorbidity is common within developmental disorders and may be explained by many possible reasons. Some of these can be…

  15. Experiences of Student Speech-Language Pathology Clinicians in the Initial Clinical Practicum: A Phenomenological Study

    ERIC Educational Resources Information Center

    Nelson, Lori A.

    2011-01-01

    Speech-language pathology literature is limited in describing the clinical practicum process from the student perspective. Much of the supervision literature in this field focuses on quantitative research and/or the point of view of the supervisor. Understanding the student experience serves to enhance the quality of clinical supervision. Of…

  16. Lexical Profiles of Comprehensible Second Language Speech: The Role of Appropriateness, Fluency, Variation, Sophistication, Abstractness, and Sense Relations

    ERIC Educational Resources Information Center

    Saito, Kazuya; Webb, Stuart; Trofimovich, Pavel; Isaacs, Talia

    2016-01-01

    This study examined contributions of lexical factors to native-speaking raters' assessments of comprehensibility (ease of understanding) of second language (L2) speech. Extemporaneous oral narratives elicited from 40 French speakers of L2 English were transcribed and evaluated for comprehensibility by 10 raters. Subsequently, the samples were…

  17. On the Use of the Distortion-Sensitivity Approach in Examining the Role of Linguistic Abilities in Speech Understanding in Noise

    ERIC Educational Resources Information Center

    Goverts, S. Theo; Huysmans, Elke; Kramer, Sophia E.; de Groot, Annette M. B.; Houtgast, Tammo

    2011-01-01

    Purpose: Researchers have used the distortion-sensitivity approach in the psychoacoustical domain to investigate the role of auditory processing abilities in speech perception in noise (van Schijndel, Houtgast, & Festen, 2001; Goverts & Houtgast, 2010). In this study, the authors examined the potential applicability of the…

  18. Anger among Allies: Audre Lorde's 1981 Keynote Admonishing the National Women's Studies Association

    ERIC Educational Resources Information Center

    Olson, Lester C.

    2011-01-01

    This essay argues that Audre Lorde's 1981 keynote speech, "The Uses of Anger: Women Responding to Racism," has much to contribute to communication scholars' understanding of human biases and rhetorical artistry. The significance of Lorde's subject is one reason for devoting critical attention to her speech, because, in contemporary public life in…

  19. Separating the Problem and the Person: Insights from Narrative Therapy with People Who Stutter

    ERIC Educational Resources Information Center

    Ryan, Fiona; O'Dwyer, Mary; Leahy, Margaret M.

    2015-01-01

    Stuttering is a complex disorder of speech that encompasses motor speech and emotional and cognitive factors. The use of narrative therapy is described here, focusing on the stories that clients tell about the problems associated with stuttering that they have encountered in their lives. Narrative therapy uses these stories to understand, analyze,…

  20. Design and development of an AAC app based on a speech-to-symbol technology.

    PubMed

    Radici, Elena; Bonacina, Stefano; De Leo, Gianluca

    2016-08-01

The purpose of this paper is to present the design and the development of an Augmentative and Alternative Communication app that uses a speech-to-symbol technology to model language, i.e., to recognize speech and display the text or graphic content related to it. Our app is intended to be adopted by communication partners who want to engage in interventions focused on improving communication skills. Our app has the goal of translating simple spoken sentences into a set of symbols that are understandable by children with complex communication needs. We moderated a focus group among six AAC communication partners. Then, we developed a prototype. We are currently beginning to test our app in an AAC Centre in Milan, Italy.
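The core speech-to-symbol step can be sketched as a lookup from recognized words to displayable symbols. The vocabulary and file names below are hypothetical; a real AAC app would map onto a curated symbol set rather than this toy dictionary.

```python
# Hypothetical word-to-symbol vocabulary (file names are placeholders).
SYMBOL_VOCAB = {
    "eat": "symbol_eat.png",
    "drink": "symbol_drink.png",
    "more": "symbol_more.png",
}

def sentence_to_symbols(recognized_text):
    """Translate a recognized utterance into the symbols the app can display.

    Words without a matching symbol fall back to plain text, so the
    message is never silently dropped.
    """
    out = []
    for word in recognized_text.lower().split():
        out.append(SYMBOL_VOCAB.get(word, word))
    return out

print(sentence_to_symbols("Eat more"))  # ['symbol_eat.png', 'symbol_more.png']
```

The fallback-to-text behavior matters in practice: the abstract notes the app displays "the text or graphic content related to it", so out-of-vocabulary words still reach the child in some form.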

  1. Positron Emission Tomography Imaging Reveals Auditory and Frontal Cortical Regions Involved with Speech Perception and Loudness Adaptation.

    PubMed

    Berding, Georg; Wilke, Florian; Rode, Thilo; Haense, Cathleen; Joseph, Gert; Meyer, Geerd J; Mamach, Martin; Lenarz, Minoo; Geworski, Lilli; Bengel, Frank M; Lenarz, Thomas; Lim, Hubert H

    2015-01-01

    Considerable progress has been made in the treatment of hearing loss with auditory implants. However, there are still many implanted patients that experience hearing deficiencies, such as limited speech understanding or vanishing perception with continuous stimulation (i.e., abnormal loudness adaptation). The present study aims to identify specific patterns of cerebral cortex activity involved with such deficiencies. We performed O-15-water positron emission tomography (PET) in patients implanted with electrodes within the cochlea, brainstem, or midbrain to investigate the pattern of cortical activation in response to speech or continuous multi-tone stimuli directly inputted into the implant processor that then delivered electrical patterns through those electrodes. Statistical parametric mapping was performed on a single subject basis. Better speech understanding was correlated with a larger extent of bilateral auditory cortex activation. In contrast to speech, the continuous multi-tone stimulus elicited mainly unilateral auditory cortical activity in which greater loudness adaptation corresponded to weaker activation and even deactivation. Interestingly, greater loudness adaptation was correlated with stronger activity within the ventral prefrontal cortex, which could be up-regulated to suppress the irrelevant or aberrant signals into the auditory cortex. The ability to detect these specific cortical patterns and differences across patients and stimuli demonstrates the potential for using PET to diagnose auditory function or dysfunction in implant patients, which in turn could guide the development of appropriate stimulation strategies for improving hearing rehabilitation. Beyond hearing restoration, our study also reveals a potential role of the frontal cortex in suppressing irrelevant or aberrant activity within the auditory cortex, and thus may be relevant for understanding and treating tinnitus.

  2. Positron Emission Tomography Imaging Reveals Auditory and Frontal Cortical Regions Involved with Speech Perception and Loudness Adaptation

    PubMed Central

    Berding, Georg; Wilke, Florian; Rode, Thilo; Haense, Cathleen; Joseph, Gert; Meyer, Geerd J.; Mamach, Martin; Lenarz, Minoo; Geworski, Lilli; Bengel, Frank M.; Lenarz, Thomas; Lim, Hubert H.

    2015-01-01

    Considerable progress has been made in the treatment of hearing loss with auditory implants. However, there are still many implanted patients that experience hearing deficiencies, such as limited speech understanding or vanishing perception with continuous stimulation (i.e., abnormal loudness adaptation). The present study aims to identify specific patterns of cerebral cortex activity involved with such deficiencies. We performed O-15-water positron emission tomography (PET) in patients implanted with electrodes within the cochlea, brainstem, or midbrain to investigate the pattern of cortical activation in response to speech or continuous multi-tone stimuli directly inputted into the implant processor that then delivered electrical patterns through those electrodes. Statistical parametric mapping was performed on a single subject basis. Better speech understanding was correlated with a larger extent of bilateral auditory cortex activation. In contrast to speech, the continuous multi-tone stimulus elicited mainly unilateral auditory cortical activity in which greater loudness adaptation corresponded to weaker activation and even deactivation. Interestingly, greater loudness adaptation was correlated with stronger activity within the ventral prefrontal cortex, which could be up-regulated to suppress the irrelevant or aberrant signals into the auditory cortex. The ability to detect these specific cortical patterns and differences across patients and stimuli demonstrates the potential for using PET to diagnose auditory function or dysfunction in implant patients, which in turn could guide the development of appropriate stimulation strategies for improving hearing rehabilitation. Beyond hearing restoration, our study also reveals a potential role of the frontal cortex in suppressing irrelevant or aberrant activity within the auditory cortex, and thus may be relevant for understanding and treating tinnitus. PMID:26046763

  3. Language learning impairments: integrating basic science, technology, and remediation.

    PubMed

    Tallal, P; Merzenich, M M; Miller, S; Jenkins, W

    1998-11-01

One of the fundamental goals of the modern field of neuroscience is to understand how neuronal activity gives rise to higher cortical function. However, to bridge the gap between neurobiology and behavior, we must understand higher cortical functions at the behavioral level at least as well as we have come to understand neurobiological processes at the cellular and molecular levels. This is certainly the case in the study of speech processing, where critical studies of behavioral dysfunction have provided key insights into the basic neurobiological mechanisms relevant to speech perception and production. Much of this progress derives from a detailed analysis of the sensory, perceptual, cognitive, and motor abilities of children who fail to acquire speech, language, and reading skills normally within the context of otherwise normal development. Current research now shows that a dysfunction in normal phonological processing, which is critical to the development of oral and written language, may derive, at least in part, from difficulties in perceiving and producing basic sensory-motor information in rapid succession--within tens of ms (see Tallal et al. 1993a for a review). There is now substantial evidence supporting the hypothesis that basic temporal integration processes play a fundamental role in establishing neural representations for the units of speech (phonemes), which must be segmented from the (continuous) speech stream and combined to form words, in order for the normal development of oral and written language to proceed. Results from magnetic resonance imaging (MRI) and positron emission tomography (PET) studies, as well as studies of behavioral performance in normal and language impaired children and adults, will be reviewed to support the view that the integration of rapidly changing successive acoustic events plays a primary role in phonological development and disorders. Finally, remediation studies based on this research, coupled with neuroplasticity research, will be presented.

  4. Using statistical deformable models to reconstruct vocal tract shape from magnetic resonance images.

    PubMed

    Vasconcelos, M J M; Rua Ventura, S M; Freitas, D R S; Tavares, J M R S

    2010-10-01

    The mechanisms involved in speech production are complex and have thus been subject to growing attention by the scientific community. It has been demonstrated that magnetic resonance imaging (MRI) is a powerful means in the understanding of the morphology of the vocal tract. Over the last few years, statistical deformable models have been successfully used to identify and characterize bones and organs in medical images and point distribution models (PDMs) have gained particular relevance. In this work, the suitability of these models has been studied to characterize and further reconstruct the shape of the vocal tract in the articulation of Portuguese European (EP) speech sounds, one of the most spoken languages worldwide, with the aid of MR images. Therefore, a PDM has been built from a set of MR images acquired during the artificially sustained articulation of 25 EP speech sounds. Following this, the capacity of this statistical model to characterize the shape deformation of the vocal tract during the production of sounds was analysed. Next, the model was used to reconstruct five EP oral vowels and the EP fricative consonants. As far as a study on speech production is concerned, this study is considered to be the first approach to characterize and reconstruct the vocal tract shape from MR images by using PDMs. In addition, the findings achieved permit one to conclude that this modelling technique compels an enhanced understanding of the dynamic speech events involved in sustained articulations based on MRI, which are of particular interest for speech rehabilitation and simulation.
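A point distribution model represents each shape as a mean landmark configuration plus a weighted sum of principal modes of variation. The sketch below illustrates that idea on synthetic 2-D landmarks with a one-mode, power-iteration PCA; the data and the single-mode restriction are simplifications for illustration, not the study's actual landmark sets or tooling.

```python
import random

def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def top_mode(centered, iters=200):
    """Leading principal component of centered data via power iteration."""
    d = len(centered[0])
    v = [1.0] * d
    for _ in range(iters):
        # Apply the (scaled) covariance to v:  C v  =  X^T (X v).
        xv = [sum(row[j] * v[j] for j in range(d)) for row in centered]
        v = [sum(centered[i][j] * xv[i] for i in range(len(centered)))
             for j in range(d)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    return v

def build_pdm(shapes):
    """Point distribution model: mean landmark shape plus its main mode."""
    mean = mean_vector(shapes)
    centered = [[x - m for x, m in zip(row, mean)] for row in shapes]
    return mean, top_mode(centered)

def reconstruct(mean, mode, coeff):
    """Shape = mean + coeff * mode (a one-mode PDM reconstruction)."""
    return [m + coeff * p for m, p in zip(mean, mode)]

# Synthetic "outlines": 20 noisy variants of 4 flattened (x, y) landmarks.
random.seed(1)
base = [0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0]
shapes = [[x + random.gauss(0, 0.05) for x in base] for _ in range(20)]
mean, mode = build_pdm(shapes)
print(reconstruct(mean, mode, 0.0))  # with coeff 0 this is just the mean shape
```

Varying the coefficient sweeps the reconstructed outline along the dominant deformation direction, which is what lets a PDM both characterize and regenerate vocal tract shapes from a small number of parameters.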

  5. Research in speech communication.

    PubMed

    Flanagan, J

    1995-10-24

    Advances in digital speech processing are now supporting application and deployment of a variety of speech technologies for human/machine communication. In fact, new businesses are rapidly forming about these technologies. But these capabilities are of little use unless society can afford them. Happily, explosive advances in microelectronics over the past two decades have assured affordable access to this sophistication as well as to the underlying computing technology. The research challenges in speech processing remain in the traditionally identified areas of recognition, synthesis, and coding. These three areas have typically been addressed individually, often with significant isolation among the efforts. But they are all facets of the same fundamental issue--how to represent and quantify the information in the speech signal. This implies deeper understanding of the physics of speech production, the constraints that the conventions of language impose, and the mechanism for information processing in the auditory system. In ongoing research, therefore, we seek more accurate models of speech generation, better computational formulations of language, and realistic perceptual guides for speech processing--along with ways to coalesce the fundamental issues of recognition, synthesis, and coding. Successful solution will yield the long-sought dictation machine, high-quality synthesis from text, and the ultimate in low bit-rate transmission of speech. It will also open the door to language-translating telephony, where the synthetic foreign translation can be in the voice of the originating talker.

  6. Internal and external attention in speech anxiety.

    PubMed

    Deiters, Désirée D; Stevens, Stephan; Hermann, Christiane; Gerlach, Alexander L

    2013-06-01

Cognitive models of social phobia propose that socially anxious individuals engage in heightened self-focused attention. Evidence for this assumption was provided by dot probe and feedback tasks measuring attention and reactions to internal cues. However, it is unclear whether similar patterns of attentional processing can be revealed while participants actually engage in a social situation. The current study used a novel paradigm, simultaneously measuring attention to internal and external stimuli in anticipation of and during a speech task. Participants with speech anxiety and non-anxious controls were asked to press a button in response to external or internal probes, while giving a speech on a controversial topic in front of an audience. The external probe consisted of a LED attached to the head of one spectator and the internal probe was a light vibration, which ostensibly signaled changes in participants' pulse or skin conductance. The results indicate that during speech anticipation, high speech anxious participants responded significantly faster to internal probes than low speech anxious participants, while during the speech no differences were revealed between internal and external probes. Generalization of our results is restricted to speech anxious individuals. Our results provide support for the pivotal role of self-focused attention in anticipatory social anxiety. Furthermore, they provide a new framework for understanding interaction effects of internal and external attention in anticipation of and during actual social situations.

  7. Individual differences in selective attention predict speech identification at a cocktail party.

    PubMed

    Oberfeld, Daniel; Klöckner-Nowotny, Felicitas

    2016-08-31

Listeners with normal hearing show considerable individual differences in speech understanding when competing speakers are present, as in a crowded restaurant. Here, we show that one source of this variance is individual differences in the ability to focus selective attention on a target stimulus in the presence of distractors. In 50 young normal-hearing listeners, the performance in tasks measuring auditory and visual selective attention was associated with sentence identification in the presence of spatially separated competing speakers. Together, the measures of selective attention explained a similar proportion of variance as the binaural sensitivity for the acoustic temporal fine structure. Working memory span, age, and audiometric thresholds showed no significant association with speech understanding. These results suggest that a reduced ability to focus attention on a target is one reason why some listeners with normal hearing sensitivity have difficulty communicating in situations with background noise.

  8. Spoken language achieves robustness and evolvability by exploiting degeneracy and neutrality.

    PubMed

    Winter, Bodo

    2014-10-01

As with biological systems, spoken languages are strikingly robust against perturbations. This paper shows that languages achieve robustness in a way that is highly similar to many biological systems. For example, speech sounds are encoded via multiple acoustically diverse, temporally distributed and functionally redundant cues, characteristics that bear similarities to what biologists call "degeneracy". Speech is furthermore adequately characterized by neutrality, with many different tongue configurations leading to similar acoustic outputs, and different acoustic variants understood as the same by recipients. This highlights the presence of a large neutral network of acoustic neighbors for every speech sound. Such neutrality ensures that a steady backdrop of variation can be maintained without impeding communication, assuring that there is "fodder" for subsequent evolution. Thus, studying linguistic robustness is not only important for understanding how linguistic systems maintain their functioning upon the background of noise, but also for understanding the preconditions for language evolution.

  9. Behavioral and Neural Discrimination of Speech Sounds After Moderate or Intense Noise Exposure in Rats

    PubMed Central

    Reed, Amanda C.; Centanni, Tracy M.; Borland, Michael S.; Matney, Chanel J.; Engineer, Crystal T.; Kilgard, Michael P.

    2015-01-01

    Objectives Hearing loss is a commonly experienced disability in a variety of populations including veterans and the elderly and can often cause significant impairment in the ability to understand spoken language. In this study, we tested the hypothesis that neural and behavioral responses to speech will be differentially impaired in an animal model after two forms of hearing loss. Design Sixteen female Sprague–Dawley rats were exposed to one of two types of broadband noise which was either moderate or intense. In nine of these rats, auditory cortex recordings were taken 4 weeks after noise exposure (NE). The other seven were pretrained on a speech sound discrimination task prior to NE and were then tested on the same task after hearing loss. Results Following intense NE, rats had few neural responses to speech stimuli. These rats were able to detect speech sounds but were no longer able to discriminate between speech sounds. Following moderate NE, rats had reorganized cortical maps and altered neural responses to speech stimuli but were still able to accurately discriminate between similar speech sounds during behavioral testing. Conclusions These results suggest that rats are able to adjust to the neural changes after moderate NE and discriminate speech sounds, but they are not able to recover behavioral abilities after intense NE. Animal models could help clarify the adaptive and pathological neural changes that contribute to speech processing in hearing-impaired populations and could be used to test potential behavioral and pharmacological therapies. PMID:25072238

  10. GRIN2A

    PubMed Central

    Turner, Samantha J.; Mayes, Angela K.; Verhoeven, Andrea; Mandelstam, Simone A.; Morgan, Angela T.

    2015-01-01

    Objective: To delineate the specific speech deficits in individuals with epilepsy-aphasia syndromes associated with mutations in the glutamate receptor subunit gene GRIN2A. Methods: We analyzed the speech phenotype associated with GRIN2A mutations in 11 individuals, aged 16 to 64 years, from 3 families. Standardized clinical speech assessments and perceptual analyses of conversational samples were conducted. Results: Individuals showed a characteristic phenotype of dysarthria and dyspraxia with lifelong impact on speech intelligibility in some. Speech was typified by imprecise articulation (11/11, 100%), impaired pitch (monopitch 10/11, 91%) and prosody (stress errors 7/11, 64%), and hypernasality (7/11, 64%). Oral motor impairments and poor performance on maximum vowel duration (8/11, 73%) and repetition of monosyllables (10/11, 91%) and trisyllables (7/11, 64%) supported conversational speech findings. The speech phenotype was present in one individual who did not have seizures. Conclusions: Distinctive features of dysarthria and dyspraxia are found in individuals with GRIN2A mutations, often in the setting of epilepsy-aphasia syndromes; dysarthria has not been previously recognized in these disorders. Of note, the speech phenotype may occur in the absence of a seizure disorder, reinforcing an important role for GRIN2A in motor speech function. Our findings highlight the need for precise clinical speech assessment and intervention in this group. By understanding the mechanisms involved in GRIN2A disorders, targeted therapy may be designed to improve chronic lifelong deficits in intelligibility. PMID:25596506

  11. Masking Period Patterns & Forward Masking for Speech-Shaped Noise: Age-related effects

    PubMed Central

    Grose, John H.; Menezes, Denise C.; Porter, Heather L.; Griz, Silvana

    2015-01-01

    Objective The purpose of this study was to assess age-related changes in temporal resolution in listeners with relatively normal audiograms. The hypothesis was that increased susceptibility to non-simultaneous masking contributes to the hearing difficulties experienced by older listeners in complex fluctuating backgrounds. Design Participants included younger (n = 11), middle-aged (n = 12), and older (n = 11) listeners with relatively normal audiograms. The first phase of the study measured masking period patterns for speech-shaped noise maskers and signals. From these data, temporal window shapes were derived. The second phase measured forward-masking functions, and assessed how well the temporal window fits accounted for these data. Results The masking period patterns demonstrated increased susceptibility to backward masking in the older listeners, compatible with a more symmetric temporal window in this group. The forward-masking functions exhibited an age-related decline in recovery to baseline thresholds, and there was also an increase in the variability of the temporal window fits to these data. Conclusions This study demonstrated an age-related increase in susceptibility to non-simultaneous masking, supporting the hypothesis that exacerbated non-simultaneous masking contributes to age-related difficulties understanding speech in fluctuating noise. Further support for this hypothesis comes from limited speech-in-noise data suggesting an association between susceptibility to forward masking and speech understanding in modulated noise. PMID:26230495

  12. Frontal and temporal contributions to understanding the iconic co-speech gestures that accompany speech.

    PubMed

    Dick, Anthony Steven; Mok, Eva H; Raja Beharelle, Anjali; Goldin-Meadow, Susan; Small, Steven L

    2014-03-01

In everyday conversation, listeners often rely on a speaker's gestures to clarify any ambiguities in the verbal message. Using fMRI during naturalistic story comprehension, we examined which brain regions in the listener are sensitive to speakers' iconic gestures. We focused on iconic gestures that contribute information not found in the speaker's talk, compared with those that convey information redundant with the speaker's talk. We found that three regions--left inferior frontal gyrus triangular (IFGTr) and opercular (IFGOp) portions, and left posterior middle temporal gyrus (MTGp)--responded more strongly when gestures added information to nonspecific language, compared with when they conveyed the same information in more specific language; in other words, when gesture disambiguated speech as opposed to reinforced it. An increased BOLD response was not found in these regions when the nonspecific language was produced without gesture, suggesting that IFGTr, IFGOp, and MTGp are involved in integrating semantic information across gesture and speech. In addition, we found that activity in the posterior superior temporal sulcus (STSp), previously thought to be involved in gesture-speech integration, was not sensitive to the gesture-speech relation. Together, these findings clarify the neurobiology of gesture-speech integration and contribute to an emerging picture of how listeners glean meaning from gestures that accompany speech.

  13. Brain-to-text: decoding spoken phrases from phone representations in the brain.

    PubMed

    Herff, Christian; Heger, Dominic; de Pesters, Adriana; Telaar, Dominic; Brunner, Peter; Schalk, Gerwin; Schultz, Tanja

    2015-01-01

It has long been speculated whether communication between humans and machines based on natural speech related cortical activity is possible. Over the past decade, studies have suggested that it is feasible to recognize isolated aspects of speech from neural signals, such as auditory features, phones or one of a few isolated words. However, until now it remained an unsolved challenge to decode continuously spoken speech from the neural substrate associated with speech and language processing. Here, we show for the first time that continuously spoken speech can be decoded into the expressed words from intracranial electrocorticographic (ECoG) recordings. Specifically, we implemented a system, which we call Brain-To-Text that models single phones, employs techniques from automatic speech recognition (ASR), and thereby transforms brain activity while speaking into the corresponding textual representation. Our results demonstrate that our system can achieve word error rates as low as 25% and phone error rates below 50%. Additionally, our approach contributes to the current understanding of the neural basis of continuous speech production by identifying those cortical regions that hold substantial information about individual phones. In conclusion, the Brain-To-Text system described in this paper represents an important step toward human-machine communication based on imagined speech.
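The phone-level decoding idea can be illustrated with a toy frame-collapsing decoder: classify each neural-feature frame as its most likely phone, then merge repeated consecutive winners into a phone sequence. The phone set, scores, and merging rule below are simplified stand-ins, not the authors' actual Brain-To-Text pipeline (which uses full ASR-style statistical models).

```python
def frames_to_phones(frame_scores, phones):
    """Collapse per-frame phone likelihoods into a phone sequence.

    frame_scores: one row of scores per frame (higher = more likely);
    repeated consecutive winners are merged, mimicking the simplest
    frame-to-phone collapse used in ASR-style decoders.
    """
    decoded = []
    for scores in frame_scores:
        best = phones[max(range(len(scores)), key=lambda i: scores[i])]
        if not decoded or decoded[-1] != best:
            decoded.append(best)
    return decoded

phones = ["h", "eh", "l", "ow"]
scores = [
    [0.9, 0.1, 0.0, 0.0],  # "h"
    [0.8, 0.2, 0.0, 0.0],  # "h" (merged with previous frame)
    [0.1, 0.7, 0.1, 0.1],  # "eh"
    [0.0, 0.1, 0.8, 0.1],  # "l"
    [0.0, 0.0, 0.7, 0.3],  # "l" (merged)
    [0.0, 0.0, 0.2, 0.8],  # "ow"
]
print(frames_to_phones(scores, phones))  # ['h', 'eh', 'l', 'ow']
```

A real decoder would additionally weight these frame scores with phone-duration and language models before committing to a sequence; the collapse step above is only the final, simplest piece.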

  14. Brain-to-text: decoding spoken phrases from phone representations in the brain

    PubMed Central

    Herff, Christian; Heger, Dominic; de Pesters, Adriana; Telaar, Dominic; Brunner, Peter; Schalk, Gerwin; Schultz, Tanja

    2015-01-01

It has long been speculated whether communication between humans and machines based on natural speech related cortical activity is possible. Over the past decade, studies have suggested that it is feasible to recognize isolated aspects of speech from neural signals, such as auditory features, phones or one of a few isolated words. However, until now it remained an unsolved challenge to decode continuously spoken speech from the neural substrate associated with speech and language processing. Here, we show for the first time that continuously spoken speech can be decoded into the expressed words from intracranial electrocorticographic (ECoG) recordings. Specifically, we implemented a system, which we call Brain-To-Text that models single phones, employs techniques from automatic speech recognition (ASR), and thereby transforms brain activity while speaking into the corresponding textual representation. Our results demonstrate that our system can achieve word error rates as low as 25% and phone error rates below 50%. Additionally, our approach contributes to the current understanding of the neural basis of continuous speech production by identifying those cortical regions that hold substantial information about individual phones. In conclusion, the Brain-To-Text system described in this paper represents an important step toward human-machine communication based on imagined speech. PMID:26124702

  15. Eyes and ears: Using eye tracking and pupillometry to understand challenges to speech recognition.

    PubMed

    Van Engen, Kristin J; McLaughlin, Drew J

    2018-05-04

Although human speech recognition is often experienced as relatively effortless, a number of common challenges can render the task more difficult. Such challenges may originate in talkers (e.g., unfamiliar accents, varying speech styles), the environment (e.g. noise), or in listeners themselves (e.g., hearing loss, aging, different native language backgrounds). Each of these challenges can reduce the intelligibility of spoken language, but even when intelligibility remains high, they can place greater processing demands on listeners. Noisy conditions, for example, can lead to poorer recall for speech, even when it has been correctly understood. Speech intelligibility measures, memory tasks, and subjective reports of listener difficulty all provide critical information about the effects of such challenges on speech recognition. Eye tracking and pupillometry complement these methods by providing objective physiological measures of online cognitive processing during listening. Eye tracking records the moment-to-moment direction of listeners' visual attention, which is closely time-locked to unfolding speech signals, and pupillometry measures the moment-to-moment size of listeners' pupils, which dilate in response to increased cognitive load. In this paper, we review the uses of these two methods for studying challenges to speech recognition.
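Because pupil dilation indexes cognitive load relative to a resting level, pupillometry analyses typically express each trial as change from a pre-stimulus baseline. A minimal sketch, assuming a simple fixed-length baseline window; the trace values are toy data, not from the review.

```python
def baseline_corrected_dilation(pupil_trace, baseline_samples):
    """Express a pupil-size time series as change from pre-stimulus baseline.

    pupil_trace: sequence of pupil diameters; the first `baseline_samples`
    values form the pre-stimulus baseline window, whose mean is subtracted
    from every sample so trials can be compared across listeners.
    """
    baseline = sum(pupil_trace[:baseline_samples]) / baseline_samples
    return [x - baseline for x in pupil_trace]

# Toy trace: stable 3.0 mm baseline, then dilation under listening load.
trace = [3.0, 3.0, 3.0, 3.2, 3.4, 3.5]
print(baseline_corrected_dilation(trace, 3))
```

Peak or mean dilation of the corrected trace then serves as the per-trial measure of processing effort that such studies relate to listening conditions.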

  16. Visual speech perception in foveal and extrafoveal vision: further implications for divisions in hemispheric projections.

    PubMed

    Jordan, Timothy R; Sheen, Mercedes; Abedipour, Lily; Paterson, Kevin B

    2014-01-01

    When observing a talking face, it has often been argued that visual speech to the left and right of fixation may produce differences in performance due to divided projections to the two cerebral hemispheres. However, while it seems likely that such a division in hemispheric projections exists for areas away from fixation, the nature and existence of a functional division in visual speech perception at the foveal midline remains to be determined. We investigated this issue by presenting visual speech in matched hemiface displays to the left and right of a central fixation point, either exactly abutting the foveal midline or else located away from the midline in extrafoveal vision. The location of displays relative to the foveal midline was controlled precisely using an automated, gaze-contingent eye-tracking procedure. Visual speech perception showed a clear right hemifield advantage when presented in extrafoveal locations but no hemifield advantage (left or right) when presented abutting the foveal midline. Thus, while visual speech observed in extrafoveal vision appears to benefit from unilateral projections to left-hemisphere processes, no evidence was obtained to indicate that a functional division exists when visual speech is observed around the point of fixation. Implications of these findings for understanding visual speech perception and the nature of functional divisions in hemispheric projection are discussed.

  17. Using auditory-visual speech to probe the basis of noise-impaired consonant-vowel perception in dyslexia and auditory neuropathy

    NASA Astrophysics Data System (ADS)

    Ramirez, Joshua; Mann, Virginia

    2005-08-01

    Both dyslexics and auditory neuropathy (AN) subjects show inferior consonant-vowel (CV) perception in noise, relative to controls. To better understand these impairments, natural acoustic speech stimuli that were masked in speech-shaped noise at various intensities were presented to dyslexic, AN, and control subjects either in isolation or accompanied by visual articulatory cues. AN subjects were expected to benefit from the pairing of visual articulatory cues and auditory CV stimuli, provided that their speech perception impairment reflects a relatively peripheral auditory disorder. Assuming that dyslexia reflects a general impairment of speech processing rather than a disorder of audition, dyslexics were not expected to similarly benefit from an introduction of visual articulatory cues. The results revealed an increased effect of noise masking on the perception of isolated acoustic stimuli by both dyslexic and AN subjects. More importantly, dyslexics showed less effective use of visual articulatory cues in identifying masked speech stimuli and lower visual baseline performance relative to AN subjects and controls. Last, a significant positive correlation was found between reading ability and the ameliorating effect of visual articulatory cues on speech perception in noise. These results suggest that some reading impairments may stem from a central deficit of speech processing.

  18. Electrophysiological Correlates of Semantic Dissimilarity Reflect the Comprehension of Natural, Narrative Speech.

    PubMed

    Broderick, Michael P; Anderson, Andrew J; Di Liberto, Giovanni M; Crosse, Michael J; Lalor, Edmund C

    2018-03-05

    People routinely hear and understand speech at rates of 120-200 words per minute [1, 2]. Thus, speech comprehension must involve rapid, online neural mechanisms that process words' meanings in an approximately time-locked fashion. However, electrophysiological evidence for such time-locked processing has been lacking for continuous speech. Although valuable insights into semantic processing have been provided by the "N400 component" of the event-related potential [3-6], this literature has been dominated by paradigms using incongruous words within specially constructed sentences, with less emphasis on natural, narrative speech comprehension. Building on the discovery that cortical activity "tracks" the dynamics of running speech [7-9] and psycholinguistic work demonstrating [10-12] and modeling [13-15] how context impacts on word processing, we describe a new approach for deriving an electrophysiological correlate of natural speech comprehension. We used a computational model [16] to quantify the meaning carried by words based on how semantically dissimilar they were to their preceding context and then regressed this measure against electroencephalographic (EEG) data recorded from subjects as they listened to narrative speech. This produced a prominent negativity at a time lag of 200-600 ms on centro-parietal EEG channels, characteristics common to the N400. Applying this approach to EEG datasets involving time-reversed speech, cocktail party attention, and audiovisual speech-in-noise demonstrated that this response was very sensitive to whether or not subjects understood the speech they heard. These findings demonstrate that, when successfully comprehending natural speech, the human brain responds to the contextual semantic content of each word in a relatively time-locked fashion. Copyright © 2018 Elsevier Ltd. All rights reserved.

  19. Speech perception as an active cognitive process

    PubMed Central

    Heald, Shannon L. M.; Nusbaum, Howard C.

    2014-01-01

    One view of speech perception is that acoustic signals are transformed into representations for pattern matching to determine linguistic structure. This process can be taken as a statistical pattern-matching problem, assuming relatively stable linguistic categories are characterized by neural representations related to auditory properties of speech that can be compared to speech input. This kind of pattern matching can be termed a passive process, which implies rigidity of processing with few demands on cognitive processing. An alternative view is that speech recognition, even in early stages, is an active process in which speech analysis is attentionally guided. Note that this does not mean consciously guided, but that information-contingent changes in early auditory encoding can occur as a function of context and experience. Active processing assumes that attention, plasticity, and listening goals are important in considering how listeners cope with adverse circumstances that impair hearing, such as masking noise in the environment or hearing loss. Although theories of speech perception have begun to incorporate some active processing, they seldom treat early speech encoding as plastic and attentionally guided. Recent research has suggested that speech perception is the product of both feedforward and feedback interactions between a number of brain regions that include descending projections perhaps as far downstream as the cochlea. It is important to understand how the ambiguity of the speech signal and constraints of context dynamically determine cognitive resources recruited during perception, including focused attention, learning, and working memory. Theories of speech perception need to go beyond the current corticocentric approach in order to account for the intrinsic dynamics of the auditory encoding of speech. In doing so, this may provide new insights into ways in which hearing disorders and loss may be treated either through augmentation or therapy.
PMID:24672438

  20. Robust relationship between reading span and speech recognition in noise

    PubMed Central

    Souza, Pamela; Arehart, Kathryn

    2015-01-01

    Objective: Working memory refers to a cognitive system that manages information processing and temporary storage. Recent work has demonstrated that individual differences in working memory capacity measured using a reading span task are related to ability to recognize speech in noise. In this project, we investigated whether the specific implementation of the reading span task influenced the strength of the relationship between working memory capacity and speech recognition. Design: The relationship between speech recognition and working memory capacity was examined for two different working memory tests that varied in approach, using a within-subject design. Data consisted of audiometric results along with the two different working memory tests; one speech-in-noise test; and a reading comprehension test. Study sample: The test group included 94 older adults with varying hearing loss and 30 younger adults with normal hearing. Results: Listeners with poorer working memory capacity had more difficulty understanding speech in noise after accounting for age and degree of hearing loss. That relationship did not differ significantly between the two different implementations of reading span. Conclusions: Our findings suggest that different implementations of a verbal reading span task do not affect the strength of the relationship between working memory capacity and speech recognition. PMID:25975360

  1. Robust relationship between reading span and speech recognition in noise.

    PubMed

    Souza, Pamela; Arehart, Kathryn

    2015-01-01

    Working memory refers to a cognitive system that manages information processing and temporary storage. Recent work has demonstrated that individual differences in working memory capacity measured using a reading span task are related to ability to recognize speech in noise. In this project, we investigated whether the specific implementation of the reading span task influenced the strength of the relationship between working memory capacity and speech recognition. The relationship between speech recognition and working memory capacity was examined for two different working memory tests that varied in approach, using a within-subject design. Data consisted of audiometric results along with the two different working memory tests; one speech-in-noise test; and a reading comprehension test. The test group included 94 older adults with varying hearing loss and 30 younger adults with normal hearing. Listeners with poorer working memory capacity had more difficulty understanding speech in noise after accounting for age and degree of hearing loss. That relationship did not differ significantly between the two different implementations of reading span. Our findings suggest that different implementations of a verbal reading span task do not affect the strength of the relationship between working memory capacity and speech recognition.

  2. Vocal Features of Song and Speech: Insights from Schoenberg's Pierrot Lunaire.

    PubMed

    Merrill, Julia; Larrouy-Maestri, Pauline

    2017-01-01

    Similarities and differences between speech and song are often examined. However, the perceptual definition of these two types of vocalization is challenging. Indeed, the prototypical characteristics of speech or song support top-down processes, which influence listeners' perception of acoustic information. In order to examine vocal features associated with speaking and singing, we propose an innovative approach designed to facilitate bottom-up mechanisms in perceiving vocalizations by using material situated between speech and song: speechsong. Twenty-five participants were asked to evaluate 20 performances of a speechsong composition by Arnold Schoenberg, "Pierrot lunaire" op. 21 from 1912, on 20 features of vocal-articulatory expression. Raters provided reliable judgments concerning the vocal features used by the performers and did not show strong appeal or specific expectations in reference to Schoenberg's piece. By examining the relationship between the vocal features and the impression of song or speech, the results confirm the importance of pitch (height, contour, range) but also point to the relevance of register, timbre, tension, and faucal distance. Besides highlighting vocal features associated with speech and song, this study supports the relevance of the present approach of focusing on a theoretical middle category in order to better understand vocal expression in song and speech.

  3. Speech and swallowing disorders in Parkinson disease.

    PubMed

    Sapir, Shimon; Ramig, Lorraine; Fox, Cynthia

    2008-06-01

    To review recent research and clinical studies pertaining to the nature, diagnosis, and treatment of speech and swallowing disorders in Parkinson disease. Although some studies indicate improvement in voice and speech with dopamine therapy and deep brain stimulation of the subthalamic nucleus, others show minimal or adverse effects. Repetitive transcranial magnetic stimulation of the mouth motor cortex and injection of collagen in the vocal folds have preliminary data supporting improvement in phonation in people with Parkinson disease. Treatments focusing on vocal loudness, specifically LSVT LOUD (Lee Silverman Voice Treatment), have been effective for the treatment of speech disorders in Parkinson disease. Changes in brain activity due to LSVT LOUD provide preliminary evidence for neural plasticity. Computer-based technology makes the Lee Silverman Voice Treatment available to a large number of users. A rat model for studying neuropharmacologic effects on vocalization in Parkinson disease has been developed. New diagnostic methods of speech and swallowing are also available as the result of recent studies. Speech rehabilitation with the LSVT LOUD is highly efficacious and scientifically tested. There is a need for more studies to improve understanding, diagnosis, prevention, and treatment of speech and swallowing disorders in Parkinson disease.

  4. Automatic lip reading by using multimodal visual features

    NASA Astrophysics Data System (ADS)

    Takahashi, Shohei; Ohya, Jun

    2013-12-01

    Speech recognition has been studied for a long time, but it does not work well in noisy places such as cars or trains. In addition, people who are hearing-impaired or have difficulty hearing cannot benefit from audio-based speech recognition. To recognize speech automatically, visual information is also important: people understand speech not only from audio information but also from visual information, such as temporal changes in lip shape. A vision-based speech recognition method could work well in noisy places and could also be useful for people with hearing disabilities. In this paper, we propose an automatic lip-reading method that recognizes speech from multimodal visual information alone, without using any audio information. First, the ASM (Active Shape Model) is used to track and detect the face and lips in a video sequence. Second, the shape, optical flow, and spatial frequencies of the lip features are extracted from the lip region detected by the ASM. Next, the extracted multimodal features are ordered chronologically, and a Support Vector Machine is trained to learn and classify the spoken words. Experiments on classifying several words show promising results for the proposed method.

  5. Accent, intelligibility, and comprehensibility in the perception of foreign-accented Lombard speech

    NASA Astrophysics Data System (ADS)

    Li, Chi-Nin

    2003-10-01

    Speech produced in noise (Lombard speech) has been reported to be more intelligible than speech produced in quiet (normal speech). This study examined the perception of non-native Lombard speech in terms of intelligibility, comprehensibility, and degree of foreign accent. Twelve Cantonese speakers and a comparison group of English speakers read simple true and false English statements in quiet and in 70 dB of masking noise. Lombard and normal utterances were mixed with noise at a constant signal-to-noise ratio, and presented along with noise-free stimuli to eight new English listeners who provided transcription scores, comprehensibility ratings, and accent ratings. Analyses showed that, as expected, utterances presented in noise were less well perceived than were noise-free sentences, and that the Cantonese speakers' productions were more accented, but less intelligible and less comprehensible than those of the English speakers. For both groups of speakers, the Lombard sentences were correctly transcribed more often than their normal utterances in noisy conditions. However, the Cantonese-accented Lombard sentences were not rated as easier to understand than was the normal speech in all conditions. The assigned accent ratings were similar throughout all listening conditions. Implications of these findings will be discussed.
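Mixing utterances with noise at a constant signal-to-noise ratio, as in the listening test above, amounts to scaling the noise so that the RMS-level difference between speech and noise equals the target SNR in dB. A generic sketch follows; the study's exact calibration procedure is not described, so the function and toy waveforms here are illustrative:

```python
import math

# Scale a noise signal so that speech is mixed with it at a target SNR:
# SNR_dB = 20 * log10(rms(speech) / rms(scaled_noise)).

def rms(x):
    return math.sqrt(sum(s * s for s in x) / len(x))

def scale_noise_for_snr(speech, noise, snr_db):
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20))
    return [gain * s for s in noise]

# Toy waveforms: speech rms = 0.5, noise rms = 1.0.
speech = [0.5, -0.5, 0.5, -0.5]
noise = [1.0, -1.0, 1.0, -1.0]
scaled = scale_noise_for_snr(speech, noise, 0.0)  # 0 dB SNR: equal rms
```

The mixture is then simply the sample-wise sum of the speech and the scaled noise.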

  6. A Binaural Cochlear Implant Sound Coding Strategy Inspired by the Contralateral Medial Olivocochlear Reflex

    PubMed Central

    Eustaquio-Martín, Almudena; Stohl, Joshua S.; Wolford, Robert D.; Schatzer, Reinhold; Wilson, Blake S.

    2016-01-01

    Objectives: In natural hearing, cochlear mechanical compression is dynamically adjusted via the efferent medial olivocochlear reflex (MOCR). These adjustments probably help understanding speech in noisy environments and are not available to the users of current cochlear implants (CIs). The aims of the present study are to: (1) present a binaural CI sound processing strategy inspired by the control of cochlear compression provided by the contralateral MOCR in natural hearing; and (2) assess the benefits of the new strategy for understanding speech presented in competition with steady noise with a speech-like spectrum in various spatial configurations of the speech and noise sources. Design: Pairs of CI sound processors (one per ear) were constructed to mimic or not mimic the effects of the contralateral MOCR on compression. For the nonmimicking condition (standard strategy or STD), the two processors in a pair functioned similarly to standard clinical processors (i.e., with fixed back-end compression and independently of each other). When configured to mimic the effects of the MOCR (MOC strategy), the two processors communicated with each other and the amount of back-end compression in a given frequency channel of each processor in the pair decreased/increased dynamically (so that output levels dropped/increased) with increases/decreases in the output energy from the corresponding frequency channel in the contralateral processor. Speech reception thresholds in speech-shaped noise were measured for 3 bilateral CI users and 2 single-sided deaf unilateral CI users. Thresholds were compared for the STD and MOC strategies in unilateral and bilateral listening conditions and for three spatial configurations of the speech and noise sources in simulated free-field conditions: speech and noise sources colocated in front of the listener, speech on the left ear with noise in front of the listener, and speech on the left ear with noise on the right ear. 
In both bilateral and unilateral listening, the electrical stimulus delivered to the test ear(s) was always calculated as if the listeners were wearing bilateral processors. Results: In both unilateral and bilateral listening conditions, mean speech reception thresholds were comparable with the two strategies for colocated speech and noise sources, but were at least 2 dB lower (better) with the MOC than with the STD strategy for spatially separated speech and noise sources. In unilateral listening conditions, mean thresholds improved with increasing the spatial separation between the speech and noise sources regardless of the strategy but the improvement was significantly greater with the MOC strategy. In bilateral listening conditions, thresholds improved significantly with increasing the speech-noise spatial separation only with the MOC strategy. Conclusions: The MOC strategy (1) significantly improved the intelligibility of speech presented in competition with a spatially separated noise source, both in unilateral and bilateral listening conditions; (2) produced significant spatial release from masking in bilateral listening conditions, something that did not occur with fixed compression; and (3) enhanced spatial release from masking in unilateral listening conditions. The MOC strategy as implemented here, or a modified version of it, may be usefully applied in CIs and in hearing aids. PMID:26862711

  7. A Binaural Cochlear Implant Sound Coding Strategy Inspired by the Contralateral Medial Olivocochlear Reflex.

    PubMed

    Lopez-Poveda, Enrique A; Eustaquio-Martín, Almudena; Stohl, Joshua S; Wolford, Robert D; Schatzer, Reinhold; Wilson, Blake S

    2016-01-01

    In natural hearing, cochlear mechanical compression is dynamically adjusted via the efferent medial olivocochlear reflex (MOCR). These adjustments probably help understanding speech in noisy environments and are not available to the users of current cochlear implants (CIs). The aims of the present study are to: (1) present a binaural CI sound processing strategy inspired by the control of cochlear compression provided by the contralateral MOCR in natural hearing; and (2) assess the benefits of the new strategy for understanding speech presented in competition with steady noise with a speech-like spectrum in various spatial configurations of the speech and noise sources. Pairs of CI sound processors (one per ear) were constructed to mimic or not mimic the effects of the contralateral MOCR on compression. For the nonmimicking condition (standard strategy or STD), the two processors in a pair functioned similarly to standard clinical processors (i.e., with fixed back-end compression and independently of each other). When configured to mimic the effects of the MOCR (MOC strategy), the two processors communicated with each other and the amount of back-end compression in a given frequency channel of each processor in the pair decreased/increased dynamically (so that output levels dropped/increased) with increases/decreases in the output energy from the corresponding frequency channel in the contralateral processor. Speech reception thresholds in speech-shaped noise were measured for 3 bilateral CI users and 2 single-sided deaf unilateral CI users. Thresholds were compared for the STD and MOC strategies in unilateral and bilateral listening conditions and for three spatial configurations of the speech and noise sources in simulated free-field conditions: speech and noise sources colocated in front of the listener, speech on the left ear with noise in front of the listener, and speech on the left ear with noise on the right ear. 
In both bilateral and unilateral listening, the electrical stimulus delivered to the test ear(s) was always calculated as if the listeners were wearing bilateral processors. In both unilateral and bilateral listening conditions, mean speech reception thresholds were comparable with the two strategies for colocated speech and noise sources, but were at least 2 dB lower (better) with the MOC than with the STD strategy for spatially separated speech and noise sources. In unilateral listening conditions, mean thresholds improved with increasing the spatial separation between the speech and noise sources regardless of the strategy but the improvement was significantly greater with the MOC strategy. In bilateral listening conditions, thresholds improved significantly with increasing the speech-noise spatial separation only with the MOC strategy. The MOC strategy (1) significantly improved the intelligibility of speech presented in competition with a spatially separated noise source, both in unilateral and bilateral listening conditions; (2) produced significant spatial release from masking in bilateral listening conditions, something that did not occur with fixed compression; and (3) enhanced spatial release from masking in unilateral listening conditions. The MOC strategy as implemented here, or a modified version of it, may be usefully applied in CIs and in hearing aids.
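The contralateral coupling described in this record can be caricatured as a per-channel gain rule: the more output energy the matching channel on the opposite ear's processor produces, the more the local channel's output is attenuated, enlarging interaural level differences for spatially separated sources. The threshold, slope, and maximum reduction below are illustrative assumptions, not the published algorithm:

```python
# Toy MOC-inspired gain rule: attenuation (in dB) applied to one processor's
# frequency channel grows with the output energy of the contralateral
# processor's matching channel. All parameter values are hypothetical.

def moc_gain(contra_energy_db, max_reduction_db=10.0,
             threshold_db=40.0, slope=0.5):
    # No effect until the contralateral channel exceeds the threshold;
    # then reduce gain linearly, capped at max_reduction_db.
    excess = max(0.0, contra_energy_db - threshold_db)
    return -min(max_reduction_db, slope * excess)

# Quiet, moderate, and loud contralateral channel outputs.
for contra in (30.0, 50.0, 80.0):
    print(contra, moc_gain(contra))
```

With fixed compression (the STD strategy) this gain term would simply be zero regardless of the contralateral signal.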

  8. The effect of instantaneous input dynamic range setting on the speech perception of children with the nucleus 24 implant.

    PubMed

    Davidson, Lisa S; Skinner, Margaret W; Holstad, Beth A; Fears, Beverly T; Richter, Marie K; Matusofsky, Margaret; Brenner, Christine; Holden, Timothy; Birath, Amy; Kettel, Jerrica L; Scollie, Susan

    2009-06-01

    The purpose of this study was to examine the effects of a wider instantaneous input dynamic range (IIDR) setting on speech perception and comfort in quiet and noise for children wearing the Nucleus 24 implant system and the Freedom speech processor. In addition, children's ability to understand soft and conversational level speech in relation to aided sound-field thresholds was examined. Thirty children (age, 7 to 17 years) with the Nucleus 24 cochlear implant system and the Freedom speech processor with two different IIDR settings (30 versus 40 dB) were tested on the Consonant Nucleus Consonant (CNC) word test at 50 and 60 dB SPL, the Bamford-Kowal-Bench Speech in Noise Test, and a loudness rating task for four-talker speech noise. Aided thresholds for frequency-modulated tones, narrowband noise, and recorded Ling sounds were obtained with the two IIDRs and examined in relation to CNC scores at 50 dB SPL. Speech Intelligibility Indices were calculated using the long-term average speech spectrum of the CNC words at 50 dB SPL measured at each test site and aided thresholds. Group mean CNC scores at 50 dB SPL with the 40 IIDR were significantly higher (p < 0.001) than with the 30 IIDR. Group mean CNC scores at 60 dB SPL, loudness ratings, and the signal to noise ratios-50 for Bamford-Kowal-Bench Speech in Noise Test were not significantly different for the two IIDRs. Significantly improved aided thresholds at 250 to 6000 Hz as well as higher Speech Intelligibility Indices afforded improved audibility for speech presented at soft levels (50 dB SPL). These results indicate that an increased IIDR provides improved word recognition for soft levels of speech without compromising comfort of higher levels of speech sounds or sentence recognition in noise.
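The Speech Intelligibility Index calculation mentioned in the record weights each frequency band's importance by the audibility of speech in that band, given the listener's aided thresholds. A much-simplified sketch follows; it mirrors the general ANSI S3.5 scheme (speech peaks about 15 dB above the long-term average level, 30 dB dynamic range), but the band levels and importance weights here are illustrative, not the study's values:

```python
# Simplified Speech Intelligibility Index (SII): each band contributes its
# importance weight scaled by how much of the speech dynamic range is
# audible above the aided threshold. Illustrative parameter values only.

def sii(speech_levels_db, thresholds_db, importance):
    total = 0.0
    for level, thresh, weight in zip(speech_levels_db, thresholds_db, importance):
        # Fraction of the 30 dB speech range (peaking ~15 dB above the
        # average level) that lies above the listener's threshold.
        audibility = max(0.0, min(1.0, (level + 15.0 - thresh) / 30.0))
        total += weight * audibility
    return total

# Three toy bands: fully audible, partly audible, and inaudible.
print(sii([50.0, 50.0, 50.0], [35.0, 50.0, 70.0], [0.4, 0.4, 0.2]))
```

Lower aided thresholds raise the audibility terms, which is why the wider IIDR's improved thresholds yielded higher indices for soft speech.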

  9. Rasch Analysis of Word Identification and Magnitude Estimation Scaling Responses in Measuring Naive Listeners' Judgments of Speech Intelligibility of Children with Severe-to-Profound Hearing Impairments

    ERIC Educational Resources Information Center

    Beltyukova, Svetlana A.; Stone, Gregory M.; Ellis, Lee W.

    2008-01-01

    Purpose: Speech intelligibility research typically relies on traditional evidence of reliability and validity. This investigation used Rasch analysis to enhance understanding of the functioning and meaning of scores obtained with 2 commonly used procedures: word identification (WI) and magnitude estimation scaling (MES). Method: Narrative samples…

  10. The Neurobiology of Speech Perception and Production-Can Functional Imaging Tell Us Anything We Did Not Already Know?

    ERIC Educational Resources Information Center

    Scott, Sophie K.

    2012-01-01

    Our understanding of the neurobiological basis for human speech production and perception has benefited from insights from psychology, neuropsychology and neurology. In this overview, I outline some of the ways that functional imaging has added to this knowledge and argue that, as a neuroanatomical tool, functional imaging has led to some…

  11. L2 Learners' Assessments of Accentedness, Fluency, and Comprehensibility of Native and Nonnative German Speech

    ERIC Educational Resources Information Center

    O'Brien, Mary Grantham

    2014-01-01

    In early stages of classroom language learning, many adult second language (L2) learners communicate primarily with one another, yet we know little about which speech stream characteristics learners tune into or the extent to which they understand this lingua franca communication. In the current study, 25 native English speakers learning German as…

  12. Speech Understanding in Noise in Elderly Adults: The Effect of Inhibitory Control and Syntactic Complexity

    ERIC Educational Resources Information Center

    van Knijff, Eline C.; Coene, Martine; Govaerts, Paul J.

    2018-01-01

    Background: Previous research has suggested that speech perception in elderly adults is influenced not only by age-related hearing loss or presbycusis but also by declines in cognitive abilities, by background noise and by the syntactic complexity of the message. Aims: To gain further insight into the influence of these cognitive as well as…

  13. The Impact of the Picture Exchange Communication System on Requesting and Speech Development in Preschoolers with Autism Spectrum Disorders and Similar Characteristics

    ERIC Educational Resources Information Center

    Ganz, Jennifer B.; Simpson, Richard L.; Corbin-Newsome, Jawanda

    2008-01-01

    By definition children with autism spectrum disorders (ASD) experience difficulty understanding and using language. Accordingly, visual and picture-based strategies such as the Picture Exchange Communication System (PECS) show promise in ameliorating speech and language deficits. This study reports the results of a multiple baseline across…

  14. Team-Based Learning in a Capstone Course in Speech-Language Pathology: Learning Outcomes and Student Perceptions

    ERIC Educational Resources Information Center

    Wallace, Sarah E.

    2015-01-01

    Team-based learning (TBL), although found to increase student engagement and higher-level thinking, has not been examined in the field of speech-language pathology. The purpose of this study was to examine the effect of integrating TBL into a capstone course in evidence-based practice (EBP). The researcher evaluated 27 students' understanding of…

  15. Accuracy of Consonant-Vowel Syllables in Young Cochlear Implant Recipients and Hearing Children in the Single-Word Period

    ERIC Educational Resources Information Center

    Warner-Czyz, Andrea D.; Davis, Barbara L.; MacNeilage, Peter F.

    2010-01-01

    Purpose: Attaining speech accuracy requires that children perceive and attach meanings to vocal output on the basis of production system capacities. Because auditory perception underlies speech accuracy, profiles for children with hearing loss (HL) differ from those of children with normal hearing (NH). Method: To understand the impact of auditory…

  16. Understanding the New Black Poetry: Black Speech and Black Music as Poetic References.

    ERIC Educational Resources Information Center

    Henderson, Stephen

    Oral tradition, both rural and urban, forms an infrastructure for this anthology, which presents selections of black poetry with an emphasis on the poetry of the sixties. Based on the thesis that the new black poetry's main referents are black speech and black music, the anthology includes examples from the oral tradition of folk sermon,…

  17. Cochlear implantation with hearing preservation yields significant benefit for speech recognition in complex listening environments.

    PubMed

    Gifford, René H; Dorman, Michael F; Skarzynski, Henryk; Lorens, Artur; Polak, Marek; Driscoll, Colin L W; Roland, Peter; Buchman, Craig A

    2013-01-01

    The aim of this study was to assess the benefit of having preserved acoustic hearing in the implanted ear for speech recognition in complex listening environments. The present study included a within-subjects, repeated-measures design including 21 English-speaking and 17 Polish-speaking cochlear implant (CI) recipients with preserved acoustic hearing in the implanted ear. The patients were implanted with electrodes that varied in insertion depth from 10 to 31 mm. Mean preoperative low-frequency thresholds (average of 125, 250, and 500 Hz) in the implanted ear were 39.3 and 23.4 dB HL for the English- and Polish-speaking participants, respectively. In one condition, speech perception was assessed in an eight-loudspeaker environment in which the speech signals were presented from one loudspeaker and restaurant noise was presented from all loudspeakers. In another condition, the signals were presented in a simulation of a reverberant environment with a reverberation time of 0.6 sec. The response measures included speech reception thresholds (SRTs) and percent correct sentence understanding for two test conditions: CI plus low-frequency hearing in the contralateral ear (bimodal condition) and CI plus low-frequency hearing in both ears (best-aided condition). A subset of six English-speaking listeners was also assessed on measures of interaural time difference thresholds for a 250-Hz signal. Small, but significant, improvements in performance (1.7-2.1 dB and 6-10 percentage points) were found for the best-aided condition versus the bimodal condition. Postoperative thresholds in the implanted ear were correlated with the degree of electric and acoustic stimulation (EAS) benefit for speech recognition in diffuse noise. There was no reliable relationship between audiometric thresholds in the implanted ear, or threshold elevation after surgery, and improvement in speech understanding in reverberation. 
There was a significant correlation between interaural time difference threshold at 250 Hz and EAS-related benefit for the adaptive speech reception threshold. The findings of this study suggest that (1) preserved low-frequency hearing improves speech understanding for CI recipients, (2) testing in complex listening environments, in which binaural timing cues differ for signal and noise, may best demonstrate the value of having two ears with low-frequency acoustic hearing, and (3) preservation of binaural timing cues, although poorer than observed for individuals with normal hearing, is possible after unilateral cochlear implantation with hearing preservation and is associated with EAS benefit. The results of this study demonstrate significant communicative benefit for hearing preservation in the implanted ear and provide support for the expansion of CI criteria to include individuals with low-frequency thresholds in even the normal to near-normal range.
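The adaptive speech reception threshold reported above is typically measured with a simple up-down staircase: the signal-to-noise ratio is made harder after a correct response and easier after an incorrect one, so the track converges on the 50%-correct point. A minimal sketch with a simulated listener (the psychometric function, step size, and reversal counts are illustrative assumptions, not the study's procedure):

```python
import math
import random

def srt_staircase(p_correct_at, start_snr=10.0, step=2.0, n_reversals=12, rng=None):
    """1-up/1-down staircase: lower the SNR after a correct response, raise it
    after an incorrect one; the track oscillates around the 50%-correct SNR."""
    rng = rng or random.Random(0)
    snr, direction = start_snr, None
    reversal_snrs = []
    while len(reversal_snrs) < n_reversals:
        correct = rng.random() < p_correct_at(snr)
        new_direction = -1 if correct else +1
        if direction is not None and new_direction != direction:
            reversal_snrs.append(snr)          # a reversal of track direction
        direction = new_direction
        snr += new_direction * step
    # SRT estimate: mean SNR at the last several reversals
    tail = reversal_snrs[-8:]
    return sum(tail) / len(tail)

# Simulated listener: logistic psychometric function with a true SRT of -4 dB
true_srt = -4.0
listener = lambda snr: 1.0 / (1.0 + math.exp(-(snr - true_srt) / 1.5))
estimated_srt = srt_staircase(listener)
```

The estimate hovers near the simulated listener's true SRT; averaging more reversals trades test time for precision.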

  18. Experience-Related Structural Changes of Degenerated Occipital White Matter in Late-Blind Humans – A Diffusion Tensor Imaging Study

    PubMed Central

    Dietrich, Susanne; Hertrich, Ingo; Kumar, Vinod; Ackermann, Hermann

    2015-01-01

    Late-blind humans can learn to understand speech at ultra-fast syllable rates (ca. 20 syllables/s), a capability associated with hemodynamic activation of the central-visual system. Thus, the observed functional cross-modal recruitment of occipital cortex might facilitate ultra-fast speech processing in these individuals. To further elucidate the structural prerequisites of this skill, diffusion tensor imaging (DTI) was conducted in late-blind subjects differing in their capability of understanding ultra-fast speech. Fractional anisotropy (FA) was determined as a quantitative measure of the directionality of water diffusion, indicating fiber tract characteristics that might be influenced by blindness as well as the acquired perceptual skills. Analysis of the diffusion images revealed reduced FA in late-blind individuals relative to sighted controls at the level of the optic radiations at either side and the right-hemisphere dorsal thalamus (pulvinar). Moreover, late-blind subjects showed significant positive correlations between FA and the capacity of ultra-fast speech comprehension within right-hemisphere optic radiation and thalamus. Thus, experience-related structural alterations occurred in late-blind individuals within visual pathways that, presumably, are linked to higher order frontal language areas. PMID:25830371
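Fractional anisotropy, the quantitative measure used in this record, is computed from the three eigenvalues of the diffusion tensor; it is 0 for perfectly isotropic diffusion and approaches 1 as diffusion becomes strongly directional. A minimal reference implementation of the standard formula (eigenvalue magnitudes in the example are illustrative, not the study's data):

```python
import math

def fractional_anisotropy(l1, l2, l3):
    """FA from the diffusion tensor's eigenvalues:
    FA = sqrt(3/2) * ||lambda - mean|| / ||lambda||."""
    mean = (l1 + l2 + l3) / 3.0
    num = math.sqrt((l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2)
    den = math.sqrt(l1 ** 2 + l2 ** 2 + l3 ** 2)
    return math.sqrt(1.5) * num / den if den else 0.0

fa_isotropic = fractional_anisotropy(1e-3, 1e-3, 1e-3)      # equal diffusion -> 0
fa_fiber = fractional_anisotropy(1.7e-3, 0.2e-3, 0.2e-3)    # fiber-like -> near 1
```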

  19. Text as a Supplement to Speech in Young and Older Adults

    PubMed Central

    Krull, Vidya; Humes, Larry E.

    2015-01-01

Objective The purpose of this experiment was to quantify the contribution of visual text to auditory speech recognition in background noise. Specifically, we tested the hypothesis that partially accurate visual text from an automatic speech recognizer could be used successfully to supplement speech understanding in difficult listening conditions in older adults, with normal or impaired hearing. Our working hypotheses were based on what is known regarding audiovisual speech perception in the elderly from the speechreading literature. We hypothesized that: 1) combining auditory and visual text information would result in improved recognition accuracy compared to auditory or visual text information alone; 2) benefit from supplementing speech with visual text (auditory and visual enhancement) would be greater in young adults than in older adults; and 3) individual differences in performance on perceptual measures would be associated with cognitive abilities. Design Fifteen young adults with normal hearing, fifteen older adults with normal hearing, and fifteen older adults with hearing loss participated in this study. All participants completed sentence recognition tasks in auditory-only, text-only, and combined auditory-text conditions. The auditory sentence stimuli were spectrally shaped to restore audibility for the older participants with impaired hearing. All participants also completed various cognitive measures, including measures of working memory, processing speed, verbal comprehension, perceptual and cognitive speed, processing efficiency, inhibition, and the ability to form wholes from parts. Group effects were examined for each of the perceptual and cognitive measures. Audiovisual benefit was calculated relative to performance on the auditory-only and visual-text-only conditions. Finally, the relationship between the perceptual measures and the other independent measures was examined using principal-component factor analyses, followed by regression analyses. 
Results Both young and older adults performed similarly on nine out of ten perceptual measures (auditory, visual, and combined measures). Combining degraded speech with partially correct text from an automatic speech recognizer improved the understanding of speech in both young and older adults, relative to both auditory- and text-only performance. In all subjects, cognition emerged as a key predictor for a general speech-text integration ability. Conclusions These results suggest that neither age nor hearing loss affected the ability of subjects to benefit from text when used to support speech, after ensuring audibility through spectral shaping. These results also suggest that the benefit obtained by supplementing auditory input with partially accurate text is modulated by cognitive ability, specifically lexical and verbal skills. PMID:26458131

  20. Auditory and Cognitive Factors Associated with Speech-in-Noise Complaints following Mild Traumatic Brain Injury.

    PubMed

    Hoover, Eric C; Souza, Pamela E; Gallun, Frederick J

    2017-04-01

Auditory complaints following mild traumatic brain injury (MTBI) are common, but few studies have addressed the role of auditory temporal processing in speech recognition complaints. In this study, deficits in understanding speech in a background of speech noise following MTBI were evaluated with the goal of comparing the relative contributions of auditory and nonauditory factors. A matched-groups design was used in which a group of listeners with a history of MTBI was compared to a group matched in age and pure-tone thresholds, as well as a control group of young listeners with normal hearing (YNH). Of the 33 listeners who participated in the study, 13 were included in the MTBI group (mean age = 46.7 yr), 11 in the Matched group (mean age = 49 yr), and 9 in the YNH group (mean age = 20.8 yr). Speech-in-noise deficits were evaluated using subjective measures as well as monaural word (Words-in-Noise test) and sentence (Quick Speech-in-Noise test) tasks, and a binaural spatial release task. Performance on these measures was compared to psychophysical tasks that evaluate monaural and binaural temporal fine structure and spectral resolution. Cognitive measures of attention, processing speed, and working memory were evaluated as possible causes of differences between the MTBI and Matched groups that might contribute to speech-in-noise perception deficits. A high proportion of listeners in the MTBI group reported difficulty understanding speech in noise (84%) compared to the Matched group (9.1%), and listeners who reported difficulty were more likely to have abnormal results on objective measures of speech in noise. No significant group differences were found between the MTBI and Matched listeners on any of the measures reported, but the number of abnormal tests differed across groups. 
Regression analysis revealed that a combination of auditory and auditory processing factors contributed to monaural speech-in-noise scores, but the benefit of spatial separation was related to a combination of working memory and peripheral auditory factors across all listeners in the study. The results of this study are consistent with previous findings that a subset of listeners with MTBI has objective auditory deficits. Speech-in-noise performance was related to a combination of auditory and nonauditory factors, confirming the important role of audiology in MTBI rehabilitation. Further research is needed to evaluate the prevalence and causal relationship of auditory deficits following MTBI. American Academy of Audiology

  1. Anticipatory Posturing of the Vocal Tract Reveals Dissociation of Speech Movement Plans from Linguistic Units

    PubMed Central

    Tilsen, Sam; Spincemaille, Pascal; Xu, Bo; Doerschuk, Peter; Luh, Wen-Ming; Feldman, Elana; Wang, Yi

    2016-01-01

Models of speech production typically assume that control over the timing of speech movements is governed by the selection of higher-level linguistic units, such as segments or syllables. This study used real-time magnetic resonance imaging of the vocal tract to investigate the anticipatory movements speakers make prior to producing a vocal response. Two factors were varied: preparation (whether or not speakers had foreknowledge of the target response) and pre-response constraint (whether or not speakers were required to maintain a specific vocal tract posture prior to the response). In prepared responses, many speakers were observed to produce pre-response anticipatory movements with a variety of articulators, showing that speech movements can be readily dissociated from higher-level linguistic units. Substantial variation was observed across speakers with regard to the articulators used for anticipatory posturing and the contexts in which anticipatory movements occurred. The findings of this study have important consequences for models of speech production and for our understanding of the normal range of variation in anticipatory speech behaviors. PMID:26760511

  2. Anticipatory Posturing of the Vocal Tract Reveals Dissociation of Speech Movement Plans from Linguistic Units.

    PubMed

    Tilsen, Sam; Spincemaille, Pascal; Xu, Bo; Doerschuk, Peter; Luh, Wen-Ming; Feldman, Elana; Wang, Yi

    2016-01-01

Models of speech production typically assume that control over the timing of speech movements is governed by the selection of higher-level linguistic units, such as segments or syllables. This study used real-time magnetic resonance imaging of the vocal tract to investigate the anticipatory movements speakers make prior to producing a vocal response. Two factors were varied: preparation (whether or not speakers had foreknowledge of the target response) and pre-response constraint (whether or not speakers were required to maintain a specific vocal tract posture prior to the response). In prepared responses, many speakers were observed to produce pre-response anticipatory movements with a variety of articulators, showing that speech movements can be readily dissociated from higher-level linguistic units. Substantial variation was observed across speakers with regard to the articulators used for anticipatory posturing and the contexts in which anticipatory movements occurred. The findings of this study have important consequences for models of speech production and for our understanding of the normal range of variation in anticipatory speech behaviors.

  3. [Effect of speech estimation on social anxiety].

    PubMed

    Shirotsuki, Kentaro; Sasagawa, Satoko; Nomura, Shinobu

    2009-02-01

This study investigates the effect of speech estimation on social anxiety to further our understanding of this characteristic of Social Anxiety Disorder (SAD). In the first study, we developed the Speech Estimation Scale (SES) to assess negative estimation before giving a speech, which has been reported to be the most fearful social situation in SAD. Undergraduate students (n = 306) completed a set of questionnaires, which consisted of the Short Fear of Negative Evaluation Scale (SFNE), the Social Interaction Anxiety Scale (SIAS), the Social Phobia Scale (SPS), and the SES. Exploratory factor analysis showed an adequate one-factor structure with eight items. Further analysis indicated that the SES had good reliability and validity. In the second study, undergraduate students (n = 315) completed the SFNE, SIAS, SPS, SES, and the Self-reported Depression Scale (SDS). The results of path analysis showed that fear of negative evaluation from others (FNE) predicted social anxiety, and speech estimation mediated the relationship between FNE and social anxiety. These results suggest that speech estimation might maintain SAD symptoms and could be used as a specific target for cognitive intervention in SAD.

  4. Audiovisual integration in children listening to spectrally degraded speech.

    PubMed

    Maidment, David W; Kang, Hi Jee; Stewart, Hannah J; Amitay, Sygal

    2015-02-01

    The study explored whether visual information improves speech identification in typically developing children with normal hearing when the auditory signal is spectrally degraded. Children (n=69) and adults (n=15) were presented with noise-vocoded sentences from the Children's Co-ordinate Response Measure (Rosen, 2011) in auditory-only or audiovisual conditions. The number of bands was adaptively varied to modulate the degradation of the auditory signal, with the number of bands required for approximately 79% correct identification calculated as the threshold. The youngest children (4- to 5-year-olds) did not benefit from accompanying visual information, in comparison to 6- to 11-year-old children and adults. Audiovisual gain also increased with age in the child sample. The current data suggest that children younger than 6 years of age do not fully utilize visual speech cues to enhance speech perception when the auditory signal is degraded. This evidence not only has implications for understanding the development of speech perception skills in children with normal hearing but may also inform the development of new treatment and intervention strategies that aim to remediate speech perception difficulties in pediatric cochlear implant users.
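Noise vocoding of the kind used in this study divides the speech spectrum into N band-pass channels, extracts each channel's amplitude envelope, and uses it to modulate band-limited noise; fewer bands means heavier spectral degradation. A minimal NumPy/SciPy sketch on a synthetic amplitude-modulated tone (band edges, filter order, and the test signal are illustrative assumptions, not the study's stimuli):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(x, fs, n_bands, f_lo=100.0, f_hi=5000.0, seed=0):
    """Replace spectral detail with band-limited noise shaped by per-band envelopes."""
    rng = np.random.default_rng(seed)
    # Logarithmically spaced band edges across the analysis range
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)
    noise = rng.standard_normal(len(x))
    out = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)
        env = np.abs(hilbert(band))        # amplitude envelope of the speech band
        carrier = sosfiltfilt(sos, noise)  # noise restricted to the same band
        out += env * carrier
    return out

fs = 16000
t = np.arange(fs) / fs
# Stand-in for a speech token: a 300 Hz carrier with a slow amplitude modulation
speech_like = np.sin(2 * np.pi * 300 * t) * (1 + 0.5 * np.sin(2 * np.pi * 4 * t))
vocoded = noise_vocode(speech_like, fs, n_bands=8)
```

Adaptively varying `n_bands`, as in the study, then amounts to re-synthesizing the stimulus with more or fewer channels from trial to trial.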

  5. Binaural hearing with electrical stimulation.

    PubMed

    Kan, Alan; Litovsky, Ruth Y

    2015-04-01

Bilateral cochlear implantation is becoming a standard of care in many clinics. While much benefit has been shown through bilateral implantation, patients who have bilateral cochlear implants (CIs) still do not perform as well as normal hearing listeners in sound localization and understanding speech in noisy environments. This difference in performance can arise from a number of different factors, including the areas of hardware and engineering, surgical precision and pathology of the auditory system in deaf persons. While surgical precision and individual pathology are factors that are beyond careful control, improvements can be made in the areas of clinical practice and the engineering of binaural speech processors. These improvements should be grounded in a good understanding of the sensitivities of bilateral CI patients to the acoustic binaural cues that are important to normal hearing listeners for sound localization and speech in noise understanding. To this end, we review the current state-of-the-art in the understanding of the sensitivities of bilateral CI patients to binaural cues in electric hearing, and highlight the important issues and challenges as they relate to clinical practice and the development of new binaural processing strategies. This article is part of a Special Issue. Copyright © 2014 Elsevier B.V. All rights reserved.

  6. A meme's eye view of speech-language pathology.

    PubMed

    Kamhi, Alan G

    2004-04-01

In this article, the reason why certain terms, labels, and ideas prevail, whereas others fail to gain acceptance, will be considered. Borrowing the concept of "meme" from the study of the evolution of ideas makes it clear why language-based and phonological disorders have less widespread appeal than, for example, auditory processing and sensory integration disorders. Discussion will also center on why most speech-language pathologists refer to themselves as speech therapists or speech pathologists, and why it is more desirable to have dyslexia than to have a reading disability. In a meme's eye view, science and logic do not always win out because selection favors ideas (memes) that are easy to understand, remember, and copy. An unfortunate consequence of these selection forces is that successful memes typically provide superficially plausible answers for complex questions.

  7. Examining the Echolalia Literature: Where Do Speech-Language Pathologists Stand?

    PubMed

    Stiegler, Lillian N

    2015-11-01

    Echolalia is a common element in the communication of individuals with autism spectrum disorders. Recent contributions to the literature reflect significant disagreement regarding how echolalia should be defined, understood, and managed. The purpose of this review article is to give speech-language pathologists and others a comprehensive view of the available perspectives on echolalia. Published literature from the disciplines of behavioral intervention, linguistics, and speech-language intervention is discussed. Special areas of focus include operational definitions, rationales associated with various approaches, specific procedures used to treat or study echolalic behavior, and reported conclusions. Dissimilarities in the definition and understanding of echolalia have led to vastly different approaches to management. Evidence-based practice protocols are available to guide speech-language interventionists in their work with individuals with autism spectrum disorders.

  8. A dynamic auditory-cognitive system supports speech-in-noise perception in older adults

    PubMed Central

    Anderson, Samira; White-Schwoch, Travis; Parbery-Clark, Alexandra; Kraus, Nina

    2013-01-01

    Understanding speech in noise is one of the most complex activities encountered in everyday life, relying on peripheral hearing, central auditory processing, and cognition. These abilities decline with age, and so older adults are often frustrated by a reduced ability to communicate effectively in noisy environments. Many studies have examined these factors independently; in the last decade, however, the idea of the auditory-cognitive system has emerged, recognizing the need to consider the processing of complex sounds in the context of dynamic neural circuits. Here, we use structural equation modeling to evaluate interacting contributions of peripheral hearing, central processing, cognitive ability, and life experiences to understanding speech in noise. We recruited 120 older adults (ages 55 to 79) and evaluated their peripheral hearing status, cognitive skills, and central processing. We also collected demographic measures of life experiences, such as physical activity, intellectual engagement, and musical training. In our model, central processing and cognitive function predicted a significant proportion of variance in the ability to understand speech in noise. To a lesser extent, life experience predicted hearing-in-noise ability through modulation of brainstem function. Peripheral hearing levels did not significantly contribute to the model. Previous musical experience modulated the relative contributions of cognitive ability and lifestyle factors to hearing in noise. Our models demonstrate the complex interactions required to hear in noise and the importance of targeting cognitive function, lifestyle, and central auditory processing in the management of individuals who are having difficulty hearing in noise. PMID:23541911

  9. Nasalance and nasality at experimental velopharyngeal openings in palatal prosthesis: a case study

    PubMed Central

    LIMA-GREGIO, Aveliny Mantovan; MARINO, Viviane Cristina de Castro; PEGORARO-KROOK, Maria Inês; BARBOSA, Plinio Almeida; AFERRI, Homero Carneiro; DUTKA, Jeniffer de Cassia Rillo

    2011-01-01

The use of prosthetic devices for correction of velopharyngeal insufficiency (VPI) is an alternative treatment for patients with conditions that preclude surgery and for those individuals with a hypofunctional velopharynx (HV) with a poor prognosis for the surgical repair of VPI. Understanding the role and measuring the outcome of prosthetic treatment of velopharyngeal dysfunction requires the use of tools that allow for documenting pre- and post-treatment outcomes. Experimental openings in speech bulbs have been used for simulating VPI in studies documenting changes in aerodynamic, acoustic and kinematic aspects of speech associated with the use of palatal prosthetic devices. The use of nasometry to document changes in speech associated with experimental openings in speech bulbs, however, has not been described in the literature. Objective This single-subject study investigated nasalance and nasality in the presence of experimental openings drilled through the speech bulb of a patient with HV. Material and Methods Nasometric recordings of the word "pato" were obtained under 4 velopharyngeal conditions: no opening (control condition), no speech bulb, speech bulb with a 20 mm2 opening, and speech bulb with a 30 mm2 opening. Five speech-language pathologists performed auditory-perceptual ratings while the subject read an oral passage under all conditions. Results The Kruskal-Wallis test showed a significant difference among conditions (p=0.0002), with the Scheffé post hoc test indicating a difference from the no-opening condition. Conclusion The changes in nasalance observed after drilling holes of known sizes in a speech bulb suggest that nasometry reflects changes in the transfer of sound energy related to different sizes of velopharyngeal opening. PMID:22230996
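The Kruskal-Wallis test used in this record is a nonparametric one-way analysis of variance on ranks, appropriate for comparing nasalance across the four velopharyngeal conditions without assuming normally distributed scores. A minimal SciPy sketch (the nasalance values are fabricated for illustration only, not the study's data):

```python
from scipy.stats import kruskal

# Hypothetical nasalance scores (%) per velopharyngeal condition --
# illustrative numbers only, not the study's measurements.
no_opening   = [12, 14, 11, 13, 12]
no_bulb      = [55, 58, 60, 54, 57]
opening_20mm = [35, 38, 33, 36, 34]
opening_30mm = [45, 47, 44, 46, 48]

h_stat, p_value = kruskal(no_opening, no_bulb, opening_20mm, opening_30mm)
# A small p-value indicates that at least one condition differs in its
# distribution of nasalance scores; post hoc pairwise tests (as the
# study did with Scheffé's method) then locate which conditions differ.
```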

  10. [Analysis of the speech discrimination scores of patients with congenital unilateral microtia and external auditory canal atresia in noise].

    PubMed

    Zhang, Y; Li, D D; Chen, X W

    2017-06-20

Objective: A case-control study comparing the speech discrimination of patients with unilateral microtia and external auditory canal atresia to that of normal-hearing subjects in quiet and noisy environments, in order to characterize speech recognition in patients with unilateral external auditory canal atresia and provide a scientific basis for early clinical intervention. Method: Twenty patients with unilateral congenital microtia malformation combined with external auditory canal atresia were compared with 20 age-matched normal subjects as a control group. All subjects were tested with Mandarin speech audiometry material to obtain speech discrimination scores (SDS) in quiet and in noise in the sound field. Result: There was no significant difference in speech discrimination scores between the two groups under the quiet condition. There was a statistically significant difference when the speech signal was presented to the affected side and noise to the normal side (single syllables, double syllables, and sentences; S/N=0 and S/N=-10) (P<0.05). There was no significant difference in speech discrimination scores when the speech signal was presented to the normal side and noise to the affected side. With signal and noise on the same side, there was a statistically significant difference for single-syllable word recognition (S/N=0 and S/N=-5) (P<0.05), whereas double-syllable words and sentences showed no statistically significant difference (P>0.05). Conclusion: The speech discrimination scores of patients with unilateral congenital microtia malformation and external auditory canal atresia in noise are lower than those of normal subjects. Copyright© by the Editorial Department of Journal of Clinical Otorhinolaryngology Head and Neck Surgery.
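The S/N conditions in this record (e.g. S/N = 0, −5, −10 dB) are produced by scaling the noise relative to the speech so that their power ratio hits the target before mixing. A minimal sketch, assuming an RMS-power definition of SNR (the signals here are synthetic stand-ins):

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so that 10*log10(P_speech / P_noise) equals snr_db, then mix."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    target_p_noise = p_speech / (10.0 ** (snr_db / 10.0))
    scaled_noise = noise * np.sqrt(target_p_noise / p_noise)
    return speech + scaled_noise, scaled_noise

fs = 16000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 440 * t)                 # stand-in for a speech token
noise = np.random.default_rng(0).standard_normal(fs)  # stand-in for masking noise
mixture, scaled = mix_at_snr(speech, noise, snr_db=-10)
achieved_snr = 10 * np.log10(np.mean(speech ** 2) / np.mean(scaled ** 2))
```

By construction the achieved SNR matches the requested value exactly, so each test condition differs only in the noise gain.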

  11. When infants talk, infants listen: pre-babbling infants prefer listening to speech with infant vocal properties.

    PubMed

    Masapollo, Matthew; Polka, Linda; Ménard, Lucie

    2016-03-01

    To learn to produce speech, infants must effectively monitor and assess their own speech output. Yet very little is known about how infants perceive speech produced by an infant, which has higher voice pitch and formant frequencies compared to adult or child speech. Here, we tested whether pre-babbling infants (at 4-6 months) prefer listening to vowel sounds with infant vocal properties over vowel sounds with adult vocal properties. A listening preference favoring infant vowels may derive from their higher voice pitch, which has been shown to attract infant attention in infant-directed speech (IDS). In addition, infants' nascent articulatory abilities may induce a bias favoring infant speech given that 4- to 6-month-olds are beginning to produce vowel sounds. We created infant and adult /i/ ('ee') vowels using a production-based synthesizer that simulates the act of speaking in talkers at different ages and then tested infants across four experiments using a sequential preferential listening task. The findings provide the first evidence that infants preferentially attend to vowel sounds with infant voice pitch and/or formants over vowel sounds with no infant-like vocal properties, supporting the view that infants' production abilities influence how they process infant speech. The findings with respect to voice pitch also reveal parallels between IDS and infant speech, raising new questions about the role of this speech register in infant development. Research exploring the underpinnings and impact of this perceptual bias can expand our understanding of infant language development. © 2015 John Wiley & Sons Ltd.

  12. Relatively effortless listening promotes understanding and recall of medical instructions in older adults

    PubMed Central

    DiDonato, Roberta M.; Surprenant, Aimée M.

    2015-01-01

Communication success under adverse conditions requires efficient and effective recruitment of both bottom-up (sensori-perceptual) and top-down (cognitive-linguistic) resources to decode the intended auditory-verbal message. Employing these limited-capacity resources has been shown to vary across the lifespan, with evidence indicating that younger adults out-perform older adults for both comprehension and memory of the message. This study examined how sources of interference arising from the speaker (message spoken with conversational vs. clear speech technique), the listener (hearing-listening and cognitive-linguistic factors), and the environment (in competing speech babble noise vs. quiet) interact and influence learning and memory performance, using more ecologically valid methods than have been used previously. The results suggest that when older adults listened to complex medical prescription instructions with “clear speech” (presented at audible levels through insertion earphones), their learning efficiency and immediate and delayed memory performance improved relative to their performance when they listened to a normal conversational speech rate (presented at audible levels in the sound field). This better learning and memory performance for clear-speech listening was maintained even in the presence of speech babble noise. The largest learning-practice effect appeared on second-trial performance with conversational speech when the clear-speech listening condition came first, which suggests greater experience-dependent perceptual learning of, or adaptation to, the speaker's speech and voice pattern in clear speech. This suggests that experience-dependent perceptual learning plays a role in facilitating the language processing and comprehension of a message and subsequent memory encoding. PMID:26106353

  13. Attention Is Required for Knowledge-Based Sequential Grouping: Insights from the Integration of Syllables into Words.

    PubMed

    Ding, Nai; Pan, Xunyi; Luo, Cheng; Su, Naifei; Zhang, Wen; Zhang, Jianfeng

    2018-01-31

    How the brain groups sequential sensory events into chunks is a fundamental question in cognitive neuroscience. This study investigates whether top-down attention or specific tasks are required for the brain to apply lexical knowledge to group syllables into words. Neural responses tracking the syllabic and word rhythms of a rhythmic speech sequence were concurrently monitored using electroencephalography (EEG). The participants performed different tasks, attending to either the rhythmic speech sequence or a distractor, which was another speech stream or a nonlinguistic auditory/visual stimulus. Attention to speech, but not a lexical-meaning-related task, was required for reliable neural tracking of words, even when the distractor was a nonlinguistic stimulus presented cross-modally. Neural tracking of syllables, however, was reliably observed in all tested conditions. These results strongly suggest that neural encoding of individual auditory events (i.e., syllables) is automatic, while knowledge-based construction of temporal chunks (i.e., words) crucially relies on top-down attention. SIGNIFICANCE STATEMENT Why we cannot understand speech when not paying attention is an old question in psychology and cognitive neuroscience. Speech processing is a complex process that involves multiple stages, e.g., hearing and analyzing the speech sound, recognizing words, and combining words into phrases and sentences. The current study investigates which speech-processing stage is blocked when we do not listen carefully. We show that the brain can reliably encode syllables, basic units of speech sounds, even when we do not pay attention. Nevertheless, when distracted, the brain cannot group syllables into multisyllabic words, which are basic units for speech meaning. Therefore, the process of converting speech sound into meaning crucially relies on attention. Copyright © 2018 the authors 0270-6474/18/381178-11$15.00/0.
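Concurrent neural tracking of syllable and word rhythms, as monitored here with EEG, is usually assessed by frequency tagging: with syllables presented at, say, 4 Hz, two-syllable words recur at 2 Hz, and the response spectrum is inspected for peaks at exactly those frequencies. A toy sketch on a synthetic "response" signal (the rates, amplitudes, and noise level are illustrative assumptions, not the study's parameters):

```python
import numpy as np

fs, dur = 200, 50                        # 200 Hz sampling, 50 s of synthetic "EEG"
t = np.arange(fs * dur) / fs
rng = np.random.default_rng(1)
# Synthetic response: syllable-rate (4 Hz) and word-rate (2 Hz) components in noise
response = (np.sin(2 * np.pi * 4 * t)
            + 0.5 * np.sin(2 * np.pi * 2 * t)
            + rng.standard_normal(len(t)))

spectrum = np.abs(np.fft.rfft(response)) / len(t)
freqs = np.fft.rfftfreq(len(t), 1.0 / fs)

def peak_ratio(f_target, half_width=1.0):
    """Amplitude at the target frequency relative to the mean of nearby bins."""
    target = spectrum[np.argmin(np.abs(freqs - f_target))]
    nearby = spectrum[(np.abs(freqs - f_target) > 0.1)
                      & (np.abs(freqs - f_target) < half_width)]
    return target / nearby.mean()

syllable_peak = peak_ratio(4.0)   # tracking of individual syllables
word_peak = peak_ratio(2.0)       # tracking of two-syllable words
```

In the study's logic, a word-rate peak that survives only when speech is attended, while the syllable-rate peak persists regardless, is the signature of attention-dependent chunking.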

  14. Methods for eliciting, annotating, and analyzing databases for child speech development.

    PubMed

    Beckman, Mary E; Plummer, Andrew R; Munson, Benjamin; Reidy, Patrick F

    2017-09-01

    Methods from automatic speech recognition (ASR), such as segmentation and forced alignment, have facilitated the rapid annotation and analysis of very large adult speech databases and databases of caregiver-infant interaction, enabling advances in speech science that were unimaginable just a few decades ago. This paper centers on two main problems that must be addressed in order to have analogous resources for developing and exploiting databases of young children's speech. The first problem is to understand and appreciate the differences between adult and child speech that cause ASR models developed for adult speech to fail when applied to child speech. These differences include the fact that children's vocal tracts are smaller than those of adult males and also changing rapidly in size and shape over the course of development, leading to between-talker variability across age groups that dwarfs the between-talker differences between adult men and women. Moreover, children do not achieve fully adult-like speech motor control until they are young adults, and their vocabularies and phonological proficiency are developing as well, leading to considerably more within-talker variability as well as more between-talker variability. The second problem then is to determine what annotation schemas and analysis techniques can most usefully capture relevant aspects of this variability. Indeed, standard acoustic characterizations applied to child speech reveal that adult-centered annotation schemas fail to capture phenomena such as the emergence of covert contrasts in children's developing phonological systems, while also revealing children's nonuniform progression toward community speech norms as they acquire the phonological systems of their native languages. 
Both problems point to the need for more basic research into the growth and development of the articulatory system (as well as of the lexicon and phonological system) that is oriented explicitly toward the construction of age-appropriate computational models.

  15. Speech and pause characteristics associated with voluntary rate reduction in Parkinson's disease and Multiple Sclerosis.

    PubMed

    Tjaden, Kris; Wilding, Greg

    2011-01-01

    The primary purpose of this study was to investigate how speakers with Parkinson's disease (PD) and Multiple Sclerosis (MS) accomplish voluntary reductions in speech rate. A group of talkers with no history of neurological disease was included for comparison. This study was motivated by the idea that knowledge of how speakers with dysarthria voluntarily accomplish a reduced speech rate would contribute toward a descriptive model of speaking rate change in dysarthria. Such a model has the potential to assist in identifying rate control strategies to receive focus in clinical treatment programs and would also advance understanding of global speech timing in dysarthria. All speakers read a passage in Habitual and Slow conditions. Speech rate, articulation rate, pause duration, and pause frequency were measured. All speaker groups adjusted articulation time as well as pause time to reduce overall speech rate. Group differences in how voluntary rate reduction was accomplished were primarily ones of quantity or degree. Overall, a slower-than-normal rate was associated with a reduced articulation rate, shorter speech runs that included fewer syllables, and longer, more frequent pauses. Taken together, these results suggest that existing skills or strategies used by patients should be emphasized in dysarthria training programs focusing on rate reduction. Results further suggest that a model of voluntary speech rate reduction based on neurologically normal speech shows promise as being applicable to mild to moderate dysarthria. The reader will be able to: (1) describe the importance of studying voluntary adjustments in speech rate in dysarthria, (2) discuss how speakers with Parkinson's disease and Multiple Sclerosis adjust articulation time and pause time to slow speech rate. Copyright © 2011 Elsevier Inc. All rights reserved.
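    The distinction the study rests on is that overall speech rate includes pause time while articulation rate excludes it, so a talker can slow down by pausing more without articulating more slowly. A minimal sketch of these measures, using hypothetical syllable counts and pause durations rather than the study's data:

    ```python
    def rate_measures(n_syllables, total_dur_s, pause_durs_s):
        """Speech rate includes pause time; articulation rate excludes it."""
        pause_time = sum(pause_durs_s)
        articulation_time = total_dur_s - pause_time
        return {
            "speech_rate": n_syllables / total_dur_s,              # syl/s overall
            "articulation_rate": n_syllables / articulation_time,  # syl/s while talking
            "pause_frequency": len(pause_durs_s),
            "mean_pause_dur": pause_time / len(pause_durs_s) if pause_durs_s else 0.0,
        }

    # Hypothetical passage reading: 120 syllables in 40 s with five pauses.
    habitual = rate_measures(120, 40.0, [0.5, 0.6, 0.4, 0.5, 0.5])
    # speech_rate = 3.0 syl/s; articulation_rate = 3.2 syl/s
    ```

    Slowing by adding or lengthening pauses alone lowers speech rate while leaving articulation rate unchanged; the study's finding was that all groups also reduced articulation rate itself.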

  16. Research in speech communication.

    PubMed Central

    Flanagan, J

    1995-01-01

    Advances in digital speech processing are now supporting application and deployment of a variety of speech technologies for human/machine communication. In fact, new businesses are rapidly forming around these technologies. But these capabilities are of little use unless society can afford them. Happily, explosive advances in microelectronics over the past two decades have assured affordable access to this sophistication as well as to the underlying computing technology. The research challenges in speech processing remain in the traditionally identified areas of recognition, synthesis, and coding. These three areas have typically been addressed individually, often with significant isolation among the efforts. But they are all facets of the same fundamental issue--how to represent and quantify the information in the speech signal. This implies deeper understanding of the physics of speech production, the constraints that the conventions of language impose, and the mechanism for information processing in the auditory system. In ongoing research, therefore, we seek more accurate models of speech generation, better computational formulations of language, and realistic perceptual guides for speech processing--along with ways to coalesce the fundamental issues of recognition, synthesis, and coding. Successful solution will yield the long-sought dictation machine, high-quality synthesis from text, and the ultimate in low bit-rate transmission of speech. It will also open the door to language-translating telephony, where the synthetic foreign translation can be in the voice of the originating talker. PMID:7479806

  17. Improving Speech Perception in Noise with Current Focusing in Cochlear Implant Users

    PubMed Central

    Srinivasan, Arthi G.; Padilla, Monica; Shannon, Robert V.; Landsberger, David M.

    2013-01-01

    Cochlear implant (CI) users typically have excellent speech recognition in quiet but struggle with understanding speech in noise. It is thought that broad current spread from stimulating electrodes causes adjacent electrodes to activate overlapping populations of neurons, which results in interactions across adjacent channels. Current focusing has been studied as a way to reduce spread of excitation and, therefore, reduce channel interactions. In particular, partial tripolar stimulation has been shown to reduce spread of excitation relative to monopolar stimulation. However, the crucial question is whether this benefit translates to improvements in speech perception. In this study, we compared speech perception in noise with experimental monopolar and partial tripolar speech processing strategies. The two strategies were matched in terms of number of active electrodes, microphone, filterbanks, stimulation rate, and loudness (although both strategies used a lower stimulation rate than typical clinical strategies). The results of this study showed a significant improvement in speech perception in noise with partial tripolar stimulation. All subjects benefited from the current focused speech processing strategy. There was a mean improvement in speech recognition threshold of 2.7 dB in a digits-in-noise task and a mean improvement of 3 dB in a sentences-in-noise task with partial tripolar stimulation relative to monopolar stimulation. Although the experimental monopolar strategy was worse than the clinical strategy, presumably due to different microphones, frequency allocations, and stimulation rates, the experimental partial tripolar strategy, which had the same changes, showed no acute deficit relative to the clinical strategy. PMID:23467170
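    In the partial tripolar configuration discussed above, a fraction (often called sigma) of the current delivered on the active electrode is returned through the two flanking electrodes, with the remainder returning through the distant extracochlear ground. The abstract gives no implementation details; the sketch below only illustrates that charge-balanced bookkeeping, with hypothetical current values.

    ```python
    def partial_tripolar(i_main_ua, sigma):
        """Current (µA) on the main and flanking electrodes for partial
        tripolar stimulation with compensation coefficient sigma in [0, 1].
        sigma = 0 reduces to monopolar mode (all current returns through the
        far ground); sigma = 1 is full tripolar (all current returns through
        the flankers, giving the most focused field)."""
        flank = -sigma * i_main_ua / 2.0       # each flanker carries half the return
        ground = -(1.0 - sigma) * i_main_ua    # remainder exits via the far ground
        return {"main": i_main_ua, "flank_each": flank, "extracochlear": ground}

    cfg = partial_tripolar(500.0, 0.75)
    # Charge balance: main + 2 * flank_each + extracochlear == 0
    ```

    The trade-off motivating "partial" rather than full tripolar stimulation is that more focusing requires higher main-electrode current to reach the same loudness.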

  18. Promoting consistent use of the communication function classification system (CFCS).

    PubMed

    Cunningham, Barbara Jane; Rosenbaum, Peter; Hidecker, Mary Jo Cooley

    2016-01-01

    We developed a Knowledge Translation (KT) intervention to standardize the way speech-language pathologists working in Ontario Canada's Preschool Speech and Language Program (PSLP) used the Communication Function Classification System (CFCS). This tool was being used as part of a provincial program evaluation, and standardizing its use was critical for establishing reliability and validity within the provincial dataset. Two theoretical foundations - Diffusion of Innovations and the Communication Persuasion Matrix - were used to develop and disseminate the intervention to standardize use of the CFCS among a cohort of speech-language pathologists. A descriptive pre-test/post-test study was used to evaluate the intervention. Fifty-two participants completed an electronic pre-test survey, reviewed intervention materials online, and then immediately completed an electronic post-test survey. The intervention improved clinicians' understanding of how the CFCS should be used, their intentions to use the tool in the standardized way, and their abilities to make correct classifications using the tool. Findings from this work will be shared with representatives of the Ontario PSLP, and the intervention may be disseminated to all speech-language pathologists working in the program. This study can be used as a model for developing and disseminating KT interventions for clinicians in paediatric rehabilitation. The Communication Function Classification System (CFCS) is a new tool that allows speech-language pathologists to classify children's skills into five meaningful levels of function. There is uncertainty and inconsistent practice in the field about the methods for using this tool. This study combined two theoretical frameworks to develop an intervention to standardize use of the CFCS among a cohort of speech-language pathologists. The intervention effectively increased clinicians' understanding of the methods for using the CFCS, ability to make correct classifications, and intention to use the tool in the standardized way in the future.

  19. Network Modeling for Functional Magnetic Resonance Imaging (fMRI) Signals during Ultra-Fast Speech Comprehension in Late-Blind Listeners

    PubMed Central

    Dietrich, Susanne; Hertrich, Ingo; Ackermann, Hermann

    2015-01-01

    In many functional magnetic resonance imaging (fMRI) studies, blind humans were found to show cross-modal reorganization engaging the visual system in non-visual tasks. For example, blind people can manage to understand (synthetic) spoken language at very high speaking rates up to ca. 20 syllables/s (syl/s). FMRI data showed that hemodynamic activation within right-hemispheric primary visual cortex (V1), bilateral pulvinar (Pv), and left-hemispheric supplementary motor area (pre-SMA) covaried with their capability of ultra-fast speech (16 syllables/s) comprehension. It has been suggested that right V1 plays an important role with respect to the perception of ultra-fast speech features, particularly the detection of syllable onsets. Furthermore, left pre-SMA seems to be an interface between these syllabic representations and the frontal speech processing and working memory network. So far, little is known about the networks linking V1 to Pv, auditory cortex (A1), and (mesio-) frontal areas. Dynamic causal modeling (DCM) was applied to investigate (i) the input structure from A1 and Pv toward right V1 and (ii) output from right V1 and A1 to left pre-SMA. As concerns the input, Pv was significantly connected to V1, in addition to A1, in blind participants, but not in sighted controls. Regarding the output, V1 was significantly connected to pre-SMA in blind individuals, and the strength of V1-SMA connectivity correlated with the performance of ultra-fast speech comprehension. By contrast, in sighted controls, who did not understand ultra-fast speech, pre-SMA received input from neither A1 nor V1. Taken together, right V1 might facilitate the “parsing” of the ultra-fast speech stream in blind subjects by receiving subcortical auditory input via the Pv (= secondary visual pathway) and transmitting this information toward contralateral pre-SMA. PMID:26148062
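    Dynamic causal modeling estimates directed coupling by fitting a neural state equation, in the bilinear case dx/dt = (A + Σ_j u_j B_j)x + Cu, where A holds endogenous connections and C the driving inputs. The toy simulation below is not the study's fitted model; the region ordering, connection strengths, and input are hypothetical, chosen only to mimic the reported blind-group pathway A1 → Pv/V1 → pre-SMA (the B modulation terms are omitted for brevity).

    ```python
    import numpy as np

    # Hypothetical region order: 0 = A1, 1 = Pv, 2 = V1, 3 = pre-SMA.
    A = np.array([[-1.0,  0.0,  0.0,  0.0],   # self-decay on the diagonal
                  [ 0.4, -1.0,  0.0,  0.0],   # A1 -> Pv
                  [ 0.3,  0.5, -1.0,  0.0],   # A1 -> V1 and Pv -> V1
                  [ 0.0,  0.0,  0.6, -1.0]])  # V1 -> pre-SMA
    C = np.array([1.0, 0.0, 0.0, 0.0])        # auditory input drives A1 only

    def simulate(x0, u, dt=0.01, steps=500):
        """Euler-integrate dx/dt = A @ x + C * u under a constant input u."""
        x = np.array(x0, dtype=float)
        for _ in range(steps):
            x = x + dt * (A @ x + C * u)
        return x

    x = simulate([0.0, 0.0, 0.0, 0.0], u=1.0)
    # Activity propagates A1 -> Pv/V1 -> pre-SMA and settles near -A⁻¹C·u.
    ```

    In actual DCM, the A, B, and C parameters are estimated from the fMRI time series via a hemodynamic forward model and Bayesian inversion; the study's inference concerned which of these connections were significantly non-zero in each group.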

Top