danish speech intelligibility: Topics by Science.gov

Sample records for danish speech intelligibility

Acceptable noise level (ANL) with Danish and non-semantic speech materials in adult hearing-aid users.

PubMed

Olsen, Steen Østergaard; Lantz, Johannes; Nielsen, Lars Holme; Brännström, K Jonas

2012-09-01

The acceptable noise level (ANL) test is used for quantification of the amount of background noise subjects accept when listening to speech. This study investigates Danish hearing-aid users' ANL performance using Danish and non-semantic speech signals, the repeatability of ANL, and the association between ANL and outcome of the international outcome inventory for hearing aids (IOI-HA). ANL was measured in three conditions in both ears at two test sessions. Subjects completed the IOI-HA and the ANL questionnaire. Sixty-three Danish hearing-aid users; fifty-seven subjects were full time users and 6 were part time/non users of hearing aids according to the ANL questionnaire. ANLs were similar to results with American English speech material. The coefficient of repeatability (CR) was 6.5-8.8 dB. IOI-HA scores were not associated to ANL. Danish and non-semantic ANL versions yield results similar to the American English version. The magnitude of the CR indicates that ANL with Danish and non-semantic speech materials is not suitable for prediction of individual patterns of future hearing-aid use or evaluation of individual benefit from hearing-aid features. The ANL with Danish and non-semantic speech materials is not related to IOI-HA outcome.
Chinese speech intelligibility and its relationship with the speech transmission index for children in elementary school classrooms.

PubMed

Peng, Jianxin; Yan, Nanjie; Wang, Dan

2015-01-01

The present study investigated Chinese speech intelligibility in 28 classrooms from nine different elementary schools in Guangzhou, China. The subjective Chinese speech intelligibility in the classrooms was evaluated with children in grades 2, 4, and 6 (7 to 12 years old). Acoustical measurements were also performed in these classrooms. Subjective Chinese speech intelligibility scores and objective speech intelligibility parameters, such as speech transmission index (STI), were obtained at each listening position for all tests. The relationship between subjective Chinese speech intelligibility scores and STI was revealed and analyzed. The effects of age on Chinese speech intelligibility scores were compared. Results indicate high correlations between subjective Chinese speech intelligibility scores and STI for grades 2, 4, and 6 children. Chinese speech intelligibility scores increase with increase of age under the same STI condition. The differences in scores among different age groups decrease as STI increases. To achieve 95% Chinese speech intelligibility scores, the STIs required for grades 2, 4, and 6 children are 0.75, 0.69, and 0.63, respectively.
Relationship between speech motor control and speech intelligibility in children with speech sound disorders.

PubMed

Namasivayam, Aravind Kumar; Pukonen, Margit; Goshulak, Debra; Yu, Vickie Y; Kadis, Darren S; Kroll, Robert; Pang, Elizabeth W; De Nil, Luc F

2013-01-01

The current study was undertaken to investigate the impact of speech motor issues on the speech intelligibility of children with moderate to severe speech sound disorders (SSD) within the context of the PROMPT intervention approach. The word-level Children's Speech Intelligibility Measure (CSIM), the sentence-level Beginner's Intelligibility Test (BIT) and tests of speech motor control and articulation proficiency were administered to 12 children (3:11 to 6:7 years) before and after PROMPT therapy. PROMPT treatment was provided for 45 min twice a week for 8 weeks. Twenty-four naïve adult listeners aged 22-46 years judged the intelligibility of the words and sentences. For CSIM, each time a recorded word was played to the listeners they were asked to look at a list of 12 words (multiple-choice format) and circle the word while for BIT sentences, the listeners were asked to write down everything they heard. Words correctly circled (CSIM) or transcribed (BIT) were averaged across three naïve judges to calculate percentage speech intelligibility. Speech intelligibility at both the word and sentence level was significantly correlated with speech motor control, but not articulatory proficiency. Further, the severity of speech motor planning and sequencing issues may potentially be a limiting factor in connected speech intelligibility and highlights the need to target these issues early and directly in treatment. The reader will be able to: (1) outline the advantages and disadvantages of using word- and sentence-level speech intelligibility tests; (2) describe the impact of speech motor control and articulatory proficiency on speech intelligibility; and (3) describe how speech motor control and speech intelligibility data may provide critical information to aid treatment planning. Copyright © 2013 Elsevier Inc. All rights reserved.
Female voice communications in high level aircraft cockpit noises--part II: vocoder and automatic speech recognition systems.

PubMed

Nixon, C; Anderson, T; Morris, L; McCavitt, A; McKinley, R; Yeager, D; McDaniel, M

1998-11-01

The intelligibility of female and male speech is equivalent under most ordinary living conditions. However, due to small differences between their acoustic speech signals, called speech spectra, one can be more or less intelligible than the other in certain situations such as high levels of noise. Anecdotal information, supported by some empirical observations, suggests that some of the high intensity noise spectra of military aircraft cockpits may degrade the intelligibility of female speech more than that of male speech. In an applied research study, the intelligibility of female and male speech was measured in several high level aircraft cockpit noise conditions experienced in military aviation. In Part I, (Nixon CW, et al. Aviat Space Environ Med 1998; 69:675-83) female speech intelligibility measured in the spectra and levels of aircraft cockpit noises and with noise-canceling microphones was lower than that of the male speech in all conditions. However, the differences were small and only those at some of the highest noise levels were significant. Although speech intelligibility of both genders was acceptable during normal cruise noises, improvements are required in most of the highest levels of noise created during maximum aircraft operating conditions. These results are discussed in a Part I technical report. This Part II report examines the intelligibility in the same aircraft cockpit noises of vocoded female and male speech and the accuracy with which female and male speech in some of the cockpit noises were understood by automatic speech recognition systems. The intelligibility of vocoded female speech was generally the same as that of vocoded male speech. No significant differences were measured between the recognition accuracy of male and female speech by the automatic speech recognition systems. The intelligibility of female and male speech was equivalent for these conditions.
The interlanguage speech intelligibility benefit for native speakers of Mandarin: Production and perception of English word-final voicing contrasts

PubMed Central

Hayes-Harb, Rachel; Smith, Bruce L.; Bent, Tessa; Bradlow, Ann R.

2009-01-01

This study investigated the intelligibility of native and Mandarin-accented English speech for native English and native Mandarin listeners. The word-final voicing contrast was considered (as in minimal pairs such as `cub' and `cup') in a forced-choice word identification task. For these particular talkers and listeners, there was evidence of an interlanguage speech intelligibility benefit for listeners (i.e., native Mandarin listeners were more accurate than native English listeners at identifying Mandarin-accented English words). However, there was no evidence of an interlanguage speech intelligibility benefit for talkers (i.e., native Mandarin listeners did not find Mandarin-accented English speech more intelligible than native English speech). When listener and talker phonological proficiency (operationalized as accentedness) was taken into account, it was found that the interlanguage speech intelligibility benefit for listeners held only for the low phonological proficiency listeners and low phonological proficiency speech. The intelligibility data were also considered in relation to various temporal-acoustic properties of native English and Mandarin-accented English speech in effort to better understand the properties of speech that may contribute to the interlanguage speech intelligibility benefit. PMID:19606271
Effects of interior aircraft noise on speech intelligibility and annoyance

NASA Technical Reports Server (NTRS)

Pearsons, K. S.; Bennett, R. L.

1977-01-01

Recordings of the aircraft ambiance from ten different types of aircraft were used in conjunction with four distinct speech interference tests as stimuli to determine the effects of interior aircraft background levels and speech intelligibility on perceived annoyance in 36 subjects. Both speech intelligibility and background level significantly affected judged annoyance. However, the interaction between the two variables showed that above an 85 db background level the speech intelligibility results had a minimal effect on annoyance ratings. Below this level, people rated the background as less annoying if there was adequate speech intelligibility.
Predicting speech intelligibility with a multiple speech subsystems approach in children with cerebral palsy.

PubMed

Lee, Jimin; Hustad, Katherine C; Weismer, Gary

2014-10-01

Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Nine acoustic variables reflecting different subsystems, and speech intelligibility, were measured in 22 children with CP. These children included 13 with a clinical diagnosis of dysarthria (speech motor impairment [SMI] group) and 9 judged to be free of dysarthria (no SMI [NSMI] group). Data from children with CP were compared to data from age-matched typically developing children. Multiple acoustic variables reflecting the articulatory subsystem were different in the SMI group, compared to the NSMI and typically developing groups. A significant speech intelligibility prediction model was obtained with all variables entered into the model (adjusted R2 = .801). The articulatory subsystem showed the most substantial independent contribution (58%) to speech intelligibility. Incremental R2 analyses revealed that any single variable explained less than 9% of speech intelligibility variability. Children in the SMI group had articulatory subsystem problems as indexed by acoustic measures. As in the adult literature, the articulatory subsystem makes the primary contribution to speech intelligibility variance in dysarthria, with minimal or no contribution from other systems.
Investigation of in-vehicle speech intelligibility metrics for normal hearing and hearing impaired listeners

NASA Astrophysics Data System (ADS)

Samardzic, Nikolina

The effectiveness of in-vehicle speech communication can be a good indicator of the perception of the overall vehicle quality and customer satisfaction. Currently available speech intelligibility metrics do not account in their procedures for essential parameters needed for a complete and accurate evaluation of in-vehicle speech intelligibility. These include the directivity and the distance of the talker with respect to the listener, binaural listening, hearing profile of the listener, vocal effort, and multisensory hearing. In the first part of this research the effectiveness of in-vehicle application of these metrics is investigated in a series of studies to reveal their shortcomings, including a wide range of scores resulting from each of the metrics for a given measurement configuration and vehicle operating condition. In addition, the nature of a possible correlation between the scores obtained from each metric is unknown. The metrics and the subjective perception of speech intelligibility using, for example, the same speech material have not been compared in literature. As a result, in the second part of this research, an alternative method for speech intelligibility evaluation is proposed for use in the automotive industry by utilizing a virtual reality driving environment for ultimately setting targets, including the associated statistical variability, for future in-vehicle speech intelligibility evaluation. The Speech Intelligibility Index (SII) was evaluated at the sentence Speech Receptions Threshold (sSRT) for various listening situations and hearing profiles using acoustic perception jury testing and a variety of talker and listener configurations and background noise. In addition, the effect of individual sources and transfer paths of sound in an operating vehicle to the vehicle interior sound, specifically their effect on speech intelligibility was quantified, in the framework of the newly developed speech intelligibility evaluation method. Lastly, as an example of the significance of speech intelligibility evaluation in the context of an applicable listening environment, as indicated in this research, it was found that the jury test participants required on average an approximate 3 dB increase in sound pressure level of speech material while driving and listening compared to when just listening, for an equivalent speech intelligibility performance and the same listening task.
Predicting Speech Intelligibility with A Multiple Speech Subsystems Approach in Children with Cerebral Palsy

PubMed Central

Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

2014-01-01

Purpose Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystem approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method Nine acoustic variables reflecting different subsystems, and speech intelligibility, were measured in 22 children with CP. These children included 13 with a clinical diagnosis of dysarthria (SMI), and nine judged to be free of dysarthria (NSMI). Data from children with CP were compared to data from age-matched typically developing children (TD). Results Multiple acoustic variables reflecting the articulatory subsystem were different in the SMI group, compared to the NSMI and TD groups. A significant speech intelligibility prediction model was obtained with all variables entered into the model (Adjusted R-squared = .801). The articulatory subsystem showed the most substantial independent contribution (58%) to speech intelligibility. Incremental R-squared analyses revealed that any single variable explained less than 9% of speech intelligibility variability. Conclusions Children in the SMI group have articulatory subsystem problems as indexed by acoustic measures. As in the adult literature, the articulatory subsystem makes the primary contribution to speech intelligibility variance in dysarthria, with minimal or no contribution from other systems. PMID:24824584
Evaluation of the importance of time-frequency contributions to speech intelligibility in noise

PubMed Central

Yu, Chengzhu; Wójcicki, Kamil K.; Loizou, Philipos C.; Hansen, John H. L.; Johnson, Michael T.

2014-01-01

Recent studies on binary masking techniques make the assumption that each time-frequency (T-F) unit contributes an equal amount to the overall intelligibility of speech. The present study demonstrated that the importance of each T-F unit to speech intelligibility varies in accordance with speech content. Specifically, T-F units are categorized into two classes, speech-present T-F units and speech-absent T-F units. Results indicate that the importance of each speech-present T-F unit to speech intelligibility is highly related to the loudness of its target component, while the importance of each speech-absent T-F unit varies according to the loudness of its masker component. Two types of mask errors are also considered, which include miss and false alarm errors. Consistent with previous work, false alarm errors are shown to be more harmful to speech intelligibility than miss errors when the mixture signal-to-noise ratio (SNR) is below 0 dB. However, the relative importance between the two types of error is conditioned on the SNR level of the input speech signal. Based on these observations, a mask-based objective measure, the loudness weighted hit-false, is proposed for predicting speech intelligibility. The proposed objective measure shows significantly higher correlation with intelligibility compared to two existing mask-based objective measures. PMID:24815280
The Effect of Noise on Relationships Between Speech Intelligibility and Self-Reported Communication Measures in Tracheoesophageal Speakers.

PubMed

Eadie, Tanya L; Otero, Devon Sawin; Bolt, Susan; Kapsner-Smith, Mara; Sullivan, Jessica R

2016-08-01

The purpose of this study was to examine how sentence intelligibility relates to self-reported communication in tracheoesophageal speakers when speech intelligibility is measured in quiet and noise. Twenty-four tracheoesophageal speakers who were at least 1 year postlaryngectomy provided audio recordings of 5 sentences from the Sentence Intelligibility Test. Speakers also completed self-reported measures of communication-the Voice Handicap Index-10 and the Communicative Participation Item Bank short form. Speech recordings were presented to 2 groups of inexperienced listeners who heard sentences in quiet or noise. Listeners transcribed the sentences to yield speech intelligibility scores. Very weak relationships were found between intelligibility in quiet and measures of voice handicap and communicative participation. Slightly stronger, but still weak and nonsignificant, relationships were observed between measures of intelligibility in noise and both self-reported measures. However, 12 speakers who were more than 65% intelligible in noise showed strong and statistically significant relationships with both self-reported measures (R2 = .76-.79). Speech intelligibility in quiet is a weak predictor of self-reported communication measures in tracheoesophageal speakers. Speech intelligibility in noise may be a better metric of self-reported communicative function for speakers who demonstrate higher speech intelligibility in noise.
Speech Intelligibility in Persian Hearing Impaired Children with Cochlear Implants and Hearing Aids.

PubMed

Rezaei, Mohammad; Emadi, Maryam; Zamani, Peyman; Farahani, Farhad; Lotfi, Gohar

2017-04-01

The aim of present study is to evaluate and compare speech intelligibility in hearing impaired children with cochlear implants (CI) and hearing aid (HA) users and children with normal hearing (NH). The sample consisted of 45 Persian-speaking children aged 3 to 5-years-old. They were divided into three groups, and each group had 15, children, children with CI and children using hearing aids in Hamadan. Participants was evaluated by the test of speech intelligibility level. Results of ANOVA on speech intelligibility test showed that NH children had significantly better reading performance than hearing impaired children with CI and HA. Post-hoc analysis, using Scheffe test, indicated that the mean score of speech intelligibility of normal children was higher than the HA and CI groups; but the difference was not significant between mean of speech intelligibility in children with hearing loss that use cochlear implant and those using HA. It is clear that even with remarkabkle advances in HA technology, many hearing impaired children continue to find speech production a challenging problem. Given that speech intelligibility is a key element in proper communication and social interaction, consequently, educational and rehabilitation programs are essential to improve speech intelligibility of children with hearing loss.
Intelligibility for Binaural Speech with Discarded Low-SNR Speech Components.

PubMed

Schoenmaker, Esther; van de Par, Steven

2016-01-01

Speech intelligibility in multitalker settings improves when the target speaker is spatially separated from the interfering speakers. A factor that may contribute to this improvement is the improved detectability of target-speech components due to binaural interaction in analogy to the Binaural Masking Level Difference (BMLD). This would allow listeners to hear target speech components within specific time-frequency intervals that have a negative SNR, similar to the improvement in the detectability of a tone in noise when these contain disparate interaural difference cues. To investigate whether these negative-SNR target-speech components indeed contribute to speech intelligibility, a stimulus manipulation was performed where all target components were removed when local SNRs were smaller than a certain criterion value. It can be expected that for sufficiently high criterion values target speech components will be removed that do contribute to speech intelligibility. For spatially separated speakers, assuming that a BMLD-like detection advantage contributes to intelligibility, degradation in intelligibility is expected already at criterion values below 0 dB SNR. However, for collocated speakers it is expected that higher criterion values can be applied without impairing speech intelligibility. Results show that degradation of intelligibility for separated speakers is only seen for criterion values of 0 dB and above, indicating a negligible contribution of a BMLD-like detection advantage in multitalker settings. These results show that the spatial benefit is related to a spatial separation of speech components at positive local SNRs rather than to a BMLD-like detection improvement for speech components at negative local SNRs.
Intelligibility of clear speech: effect of instruction.

PubMed

Lam, Jennifer; Tjaden, Kris

2013-10-01

The authors investigated how clear speech instructions influence sentence intelligibility. Twelve speakers produced sentences in habitual, clear, hearing impaired, and overenunciate conditions. Stimuli were amplitude normalized and mixed with multitalker babble for orthographic transcription by 40 listeners. The main analysis investigated percentage-correct intelligibility scores as a function of the 4 conditions and speaker sex. Additional analyses included listener response variability, individual speaker trends, and an alternate intelligibility measure: proportion of content words correct. Relative to the habitual condition, the overenunciate condition was associated with the greatest intelligibility benefit, followed by the hearing impaired and clear conditions. Ten speakers followed this trend. The results indicated different patterns of clear speech benefit for male and female speakers. Greater listener variability was observed for speakers with inherently low habitual intelligibility compared to speakers with inherently high habitual intelligibility. Stable proportions of content words were observed across conditions. Clear speech instructions affected the magnitude of the intelligibility benefit. The instruction to overenunciate may be most effective in clear speech training programs. The findings may help explain the range of clear speech intelligibility benefit previously reported. Listener variability analyses suggested the importance of obtaining multiple listener judgments of intelligibility, especially for speakers with inherently low habitual intelligibility.
Optimizing acoustical conditions for speech intelligibility in classrooms

NASA Astrophysics Data System (ADS)

Yang, Wonyoung

High speech intelligibility is imperative in classrooms where verbal communication is critical. However, the optimal acoustical conditions to achieve a high degree of speech intelligibility have previously been investigated with inconsistent results, and practical room-acoustical solutions to optimize the acoustical conditions for speech intelligibility have not been developed. This experimental study validated auralization for speech-intelligibility testing, investigated the optimal reverberation for speech intelligibility for both normal and hearing-impaired listeners using more realistic room-acoustical models, and proposed an optimal sound-control design for speech intelligibility based on the findings. The auralization technique was used to perform subjective speech-intelligibility tests. The validation study, comparing auralization results with those of real classroom speech-intelligibility tests, found that if the room to be auralized is not very absorptive or noisy, speech-intelligibility tests using auralization are valid. The speech-intelligibility tests were done in two different auralized sound fields---approximately diffuse and non-diffuse---using the Modified Rhyme Test and both normal and hearing-impaired listeners. A hybrid room-acoustical prediction program was used throughout the work, and it and a 1/8 scale-model classroom were used to evaluate the effects of ceiling barriers and reflectors. For both subject groups, in approximately diffuse sound fields, when the speech source was closer to the listener than the noise source, the optimal reverberation time was zero. When the noise source was closer to the listener than the speech source, the optimal reverberation time was 0.4 s (with another peak at 0.0 s) with relative output power levels of the speech and noise sources SNS = 5 dB, and 0.8 s with SNS = 0 dB. In non-diffuse sound fields, when the noise source was between the speaker and the listener, the optimal reverberation time was 0.6 s with SNS = 4 dB and increased to 0.8 and 1.2 s with decreased SNS = 0 dB, for both normal and hearing-impaired listeners. Hearing-impaired listeners required more early energy than normal-hearing listeners. Reflective ceiling barriers and ceiling reflectors---in particular, parallel front-back rows of semi-circular reflectors---achieved the goal of decreasing reverberation with the least speech-level reduction.
Speech communications in noise

NASA Technical Reports Server (NTRS)

1984-01-01

The physical characteristics of speech, the methods of speech masking measurement, and the effects of noise on speech communication are investigated. Topics include the speech signal and intelligibility, the effects of noise on intelligibility, the articulation index, and various devices for evaluating speech systems.
Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulation-frequency selective processing.

PubMed

Jørgensen, Søren; Dau, Torsten

2011-09-01

A model for predicting the intelligibility of processed noisy speech is proposed. The speech-based envelope power spectrum model has a similar structure as the model of Ewert and Dau [(2000). J. Acoust. Soc. Am. 108, 1181-1196], developed to account for modulation detection and masking data. The model estimates the speech-to-noise envelope power ratio, SNR(env), at the output of a modulation filterbank and relates this metric to speech intelligibility using the concept of an ideal observer. Predictions were compared to data on the intelligibility of speech presented in stationary speech-shaped noise. The model was further tested in conditions with noisy speech subjected to reverberation and spectral subtraction. Good agreement between predictions and data was found in all cases. For spectral subtraction, an analysis of the model's internal representation of the stimuli revealed that the predicted decrease of intelligibility was caused by the estimated noise envelope power exceeding that of the speech. The classical concept of the speech transmission index fails in this condition. The results strongly suggest that the signal-to-noise ratio at the output of a modulation frequency selective process provides a key measure of speech intelligibility. © 2011 Acoustical Society of America
Development of the Russian matrix sentence test.

PubMed

Warzybok, Anna; Zokoll, Melanie; Wardenga, Nina; Ozimek, Edward; Boboshko, Maria; Kollmeier, Birger

2015-01-01

To develop the Russian matrix sentence test for speech intelligibility measurements in noise. Test development included recordings, optimization of speech material, and evaluation to investigate the equivalency of the test lists and training. For each of the 500 test items, the speech intelligibility function, speech reception threshold (SRT: signal-to-noise ratio, SNR, that provides 50% speech intelligibility), and slope was obtained. The speech material was homogenized by applying level corrections. In evaluation measurements, speech intelligibility was measured at two fixed SNRs to compare list-specific intelligibility functions. To investigate the training effect and establish reference data, speech intelligibility was measured adaptively. Overall, 77 normal-hearing native Russian listeners. The optimization procedure decreased the spread in SRTs across words from 2.8 to 0.6 dB. Evaluation measurements confirmed that the 16 test lists were equivalent, with a mean SRT of -9.5 ± 0.2 dB and a slope of 13.8 ± 1.6%/dB. The reference SRT, -8.8 ± 0.8 dB for the open-set and -9.4 ± 0.8 dB for the closed-set format, increased slightly for noise levels above 75 dB SPL. The Russian matrix sentence test is suitable for accurate and reliable speech intelligibility measurements in noise.
Speech intelligibility enhancement after maxillary denture treatment and its impact on quality of life.

PubMed

Knipfer, Christian; Riemann, Max; Bocklet, Tobias; Noeth, Elmar; Schuster, Maria; Sokol, Biljana; Eitner, Stephan; Nkenke, Emeka; Stelzle, Florian

2014-01-01

Tooth loss and its prosthetic rehabilitation significantly affect speech intelligibility. However, little is known about the influence of speech deficiencies on oral health-related quality of life (OHRQoL). The aim of this study was to investigate whether speech intelligibility enhancement through prosthetic rehabilitation significantly influences OHRQoL in patients wearing complete maxillary dentures. Speech intelligibility by means of an automatic speech recognition system (ASR) was prospectively evaluated and compared with subjectively assessed Oral Health Impact Profile (OHIP) scores. Speech was recorded in 28 edentulous patients 1 week prior to the fabrication of new complete maxillary dentures and 6 months thereafter. Speech intelligibility was computed based on the word accuracy (WA) by means of an ASR and compared with a matched control group. One week before and 6 months after rehabilitation, patients assessed themselves for OHRQoL. Speech intelligibility improved significantly after 6 months. Subjects reported a significantly higher OHRQoL after maxillary rehabilitation with complete dentures. No significant correlation was found between the OHIP sum score or its subscales to the WA. Speech intelligibility enhancement achieved through the fabrication of new complete maxillary dentures might not be in the forefront of the patients' perception of their quality of life. For the improvement of OHRQoL in patients wearing complete maxillary dentures, food intake and mastication as well as freedom from pain play a more prominent role.
Formant trajectory characteristics in speakers with dysarthria and homogeneous speech intelligibility scores: Further data

NASA Astrophysics Data System (ADS)

Kim, Yunjung; Weismer, Gary; Kent, Ray D.

2005-09-01

In previous work [J. Acoust. Soc. Am. 117, 2605 (2005)], we reported on formant trajectory characteristics of a relatively large number of speakers with dysarthria and near-normal speech intelligibility. The purpose of that analysis was to begin a documentation of the variability, within relatively homogeneous speech-severity groups, of acoustic measures commonly used to predict across-speaker variation in speech intelligibility. In that study we found that even with near-normal speech intelligibility (90%-100%), many speakers had reduced formant slopes for some words and distributional characteristics of acoustic measures that were different than values obtained from normal speakers. In the current report we extend those findings to a group of speakers with dysarthria with somewhat poorer speech intelligibility than the original group. Results are discussed in terms of the utility of certain acoustic measures as indices of speech intelligibility, and as explanatory data for theories of dysarthria. [Work supported by NIH Award R01 DC00319.

Variability and Diagnostic Accuracy of Speech Intelligibility Scores in Children

ERIC Educational Resources Information Center

Hustad, Katherine C.; Oakes, Ashley; Allison, Kristen

2015-01-01

Purpose: We examined variability of speech intelligibility scores and how well intelligibility scores predicted group membership among 5-year-old children with speech motor impairment (SMI) secondary to cerebral palsy and an age-matched group of typically developing (TD) children. Method: Speech samples varying in length from 1-4 words were…
Speech Intelligibility and Personality Peer-Ratings of Young Adults with Cochlear Implants

ERIC Educational Resources Information Center

Freeman, Valerie

2018-01-01

Speech intelligibility, or how well a speaker's words are understood by others, affects listeners' judgments of the speaker's competence and personality. Deaf cochlear implant (CI) users vary widely in speech intelligibility, and their speech may have a noticeable "deaf" quality, both of which could evoke negative stereotypes or…
Speech Characteristics and Intelligibility in Adults with Mild and Moderate Intellectual Disabilities

PubMed Central

Coppens-Hofman, Marjolein C.; Terband, Hayo; Snik, Ad F.M.; Maassen, Ben A.M.

2017-01-01

Purpose Adults with intellectual disabilities (ID) often show reduced speech intelligibility, which affects their social interaction skills. This study aims to establish the main predictors of this reduced intelligibility in order to ultimately optimise management. Method Spontaneous speech and picture naming tasks were recorded in 36 adults with mild or moderate ID. Twenty-five naïve listeners rated the intelligibility of the spontaneous speech samples. Performance on the picture-naming task was analysed by means of a phonological error analysis based on expert transcriptions. Results The transcription analyses showed that the phonemic and syllabic inventories of the speakers were complete. However, multiple errors at the phonemic and syllabic level were found. The frequencies of specific types of errors were related to intelligibility and quality ratings. Conclusions The development of the phonemic and syllabic repertoire appears to be completed in adults with mild-to-moderate ID. The charted speech difficulties can be interpreted to indicate speech motor control and planning difficulties. These findings may aid the development of diagnostic tests and speech therapies aimed at improving speech intelligibility in this specific group. PMID:28118637
Speech Prosody Across Stimulus Types for Individuals with Parkinson's Disease.

PubMed

K-Y Ma, Joan; Schneider, Christine B; Hoffmann, Rüdiger; Storch, Alexander

2015-01-01

Up to 89% of the individuals with Parkinson's disease (PD) experience speech problem over the course of the disease. Speech prosody and intelligibility are two of the most affected areas in hypokinetic dysarthria. However, assessment of these areas could potentially be problematic as speech prosody and intelligibility could be affected by the type of speech materials employed. To comparatively explore the effects of different types of speech stimulus on speech prosody and intelligibility in PD speakers. Speech prosody and intelligibility of two groups of individuals with varying degree of dysarthria resulting from PD was compared to that of a group of control speakers using sentence reading, passage reading and monologue. Acoustic analysis including measures on fundamental frequency (F0), intensity and speech rate was used to form a prosodic profile for each individual. Speech intelligibility was measured for the speakers with dysarthria using direct magnitude estimation. Difference in F0 variability between the speakers with dysarthria and control speakers was only observed in sentence reading task. Difference in the average intensity level was observed for speakers with mild dysarthria to that of the control speakers. Additionally, there were stimulus effect on both intelligibility and prosodic profile. The prosodic profile of PD speakers was different from that of the control speakers in the more structured task, and lower intelligibility was found in less structured task. This highlighted the value of both structured and natural stimulus to evaluate speech production in PD speakers.
Effects of speaking task on intelligibility in Parkinson’s disease

PubMed Central

TJADEN, KRIS; WILDING, GREG

2017-01-01

Intelligibility tests for dysarthria typically provide an estimate of overall severity for speech materials elicited through imitation or read from a printed script. The extent to which these types of tasks and procedures reflect intelligibility for extemporaneous speech is not well understood. The purpose of this study was to compare intelligibility estimates obtained for a reading passage and an extemporaneous monologue produced by12 speakers with Parkinson’s disease (PD). The relationship between structural characteristics of utterances and scaled intelligibility was explored within speakers. Speakers were audio-recorded while reading a paragraph and producing a monologue. Speech samples were separated into individual utterances for presentation to 70 listeners who judged intelligibility using orthographic transcription and direct magnitude estimation (DME). Results suggest that scaled estimates of intelligibility for reading show potential for indexing intelligibility of an extemporaneous monologue. Within-speaker variation in scaled intelligibility also was related to the number of words per speech run for extemporaneous speech. PMID:20887216
Assessment of Intelligibility Using Children's Spontaneous Speech: Methodological Aspects

ERIC Educational Resources Information Center

Lagerberg, Tove B.; Åsberg, Jakob; Hartelius, Lena; Persson, Christina

2014-01-01

Background: Intelligibility is a speaker's ability to convey a message to a listener. Including an assessment of intelligibility is essential in both research and clinical work relating to individuals with communication disorders due to speech impairment. Assessment of the intelligibility of spontaneous speech can be used as an overall…
Impact of clear, loud, and slow speech on scaled intelligibility and speech severity in Parkinson's disease and multiple sclerosis.

PubMed

Tjaden, Kris; Sussman, Joan E; Wilding, Gregory E

2014-06-01

The perceptual consequences of rate reduction, increased vocal intensity, and clear speech were studied in speakers with multiple sclerosis (MS), Parkinson's disease (PD), and healthy controls. Seventy-eight speakers read sentences in habitual, clear, loud, and slow conditions. Sentences were equated for peak amplitude and mixed with multitalker babble for presentation to listeners. Using a computerized visual analog scale, listeners judged intelligibility or speech severity as operationally defined in Sussman and Tjaden (2012). Loud and clear but not slow conditions improved intelligibility relative to the habitual condition. With the exception of the loud condition for the PD group, speech severity did not improve above habitual and was reduced relative to habitual in some instances. Intelligibility and speech severity were strongly related, but relationships for disordered speakers were weaker in clear and slow conditions versus habitual. Both clear and loud speech show promise for improving intelligibility and maintaining or improving speech severity in multitalker babble for speakers with mild dysarthria secondary to MS or PD, at least as these perceptual constructs were defined and measured in this study. Although scaled intelligibility and speech severity overlap, the metrics further appear to have some separate value in documenting treatment-related speech changes.
Prior exposure to a reverberant listening environment improves speech intelligibility in adult cochlear implant listeners.

PubMed

Srinivasan, Nirmal Kumar; Tobey, Emily A; Loizou, Philipos C

2016-01-01

The goal of this study is to investigate whether prior exposure to reverberant listening environment improves speech intelligibility of adult cochlear implant (CI) users. Six adult CI users participated in this study. Speech intelligibility was measured in five different simulated reverberant listening environments with two different speech corpuses. Within each listening environment, prior exposure was varied by either having the same environment across all trials (blocked presentation) or having different environment from trial to trial (unblocked). Speech intelligibility decreased as reverberation time increased. Although substantial individual variability was observed, all CI listeners showed an increase in the blocked presentation condition as compared to the unblocked presentation condition for both speech corpuses. Prior listening exposure to a reverberant listening environment improves speech intelligibility in adult CI listeners. Further research is required to understand the underlying mechanism of adaptation to listening environment.
The Effect of Background Noise on Intelligibility of Dysphonic Speech

ERIC Educational Resources Information Center

Ishikawa, Keiko; Boyce, Suzanne; Kelchner, Lisa; Powell, Maria Golla; Schieve, Heidi; de Alarcon, Alessandro; Khosla, Sid

2017-01-01

Purpose: The aim of this study is to determine the effect of background noise on the intelligibility of dysphonic speech and to examine the relationship between intelligibility in noise and an acoustic measure of dysphonia--cepstral peak prominence (CPP). Method: A study of speech perception was conducted using speech samples from 6 adult speakers…
Effects of linear and nonlinear speech rate changes on speech intelligibility in stationary and fluctuating maskers

PubMed Central

Cooke, Martin; Aubanel, Vincent

2017-01-01

Algorithmic modifications to the durational structure of speech designed to avoid intervals of intense masking lead to increases in intelligibility, but the basis for such gains is not clear. The current study addressed the possibility that the reduced information load produced by speech rate slowing might explain some or all of the benefits of durational modifications. The study also investigated the influence of masker stationarity on the effectiveness of durational changes. Listeners identified keywords in sentences that had undergone linear and nonlinear speech rate changes resulting in overall temporal lengthening in the presence of stationary and fluctuating maskers. Relative to unmodified speech, a slower speech rate produced no intelligibility gains for the stationary masker, suggesting that a reduction in information rate does not underlie intelligibility benefits of durationally modified speech. However, both linear and nonlinear modifications led to substantial intelligibility increases in fluctuating noise. One possibility is that overall increases in speech duration provide no new phonetic information in stationary masking conditions, but that temporal fluctuations in the background increase the likelihood of glimpsing additional salient speech cues. Alternatively, listeners may have benefitted from an increase in the difference in speech rates between the target and background. PMID:28618803
Quantifying the intelligibility of speech in noise for non-native listeners.

PubMed

van Wijngaarden, Sander J; Steeneken, Herman J M; Houtgast, Tammo

2002-04-01

When listening to languages learned at a later age, speech intelligibility is generally lower than when listening to one's native language. The main purpose of this study is to quantify speech intelligibility in noise for specific populations of non-native listeners, only broadly addressing the underlying perceptual and linguistic processing. An easy method is sought to extend these quantitative findings to other listener populations. Dutch subjects listening to Germans and English speech, ranging from reasonable to excellent proficiency in these languages, were found to require a 1-7 dB better speech-to-noise ratio to obtain 50% sentence intelligibility than native listeners. Also, the psychometric function for sentence recognition in noise was found to be shallower for non-native than for native listeners (worst-case slope around the 50% point of 7.5%/dB, compared to 12.6%/dB for native listeners). Differences between native and non-native speech intelligibility are largely predicted by linguistic entropy estimates as derived from a letter guessing task. Less effective use of context effects (especially semantic redundancy) explains the reduced speech intelligibility for non-native listeners. While measuring speech intelligibility for many different populations of listeners (languages, linguistic experience) may be prohibitively time consuming, obtaining predictions of non-native intelligibility from linguistic entropy may help to extend the results of this study to other listener populations.
Quantifying the intelligibility of speech in noise for non-native listeners

NASA Astrophysics Data System (ADS)

van Wijngaarden, Sander J.; Steeneken, Herman J. M.; Houtgast, Tammo

2002-04-01

When listening to languages learned at a later age, speech intelligibility is generally lower than when listening to one's native language. The main purpose of this study is to quantify speech intelligibility in noise for specific populations of non-native listeners, only broadly addressing the underlying perceptual and linguistic processing. An easy method is sought to extend these quantitative findings to other listener populations. Dutch subjects listening to Germans and English speech, ranging from reasonable to excellent proficiency in these languages, were found to require a 1-7 dB better speech-to-noise ratio to obtain 50% sentence intelligibility than native listeners. Also, the psychometric function for sentence recognition in noise was found to be shallower for non-native than for native listeners (worst-case slope around the 50% point of 7.5%/dB, compared to 12.6%/dB for native listeners). Differences between native and non-native speech intelligibility are largely predicted by linguistic entropy estimates as derived from a letter guessing task. Less effective use of context effects (especially semantic redundancy) explains the reduced speech intelligibility for non-native listeners. While measuring speech intelligibility for many different populations of listeners (languages, linguistic experience) may be prohibitively time consuming, obtaining predictions of non-native intelligibility from linguistic entropy may help to extend the results of this study to other listener populations.
Speech intelligibility after glossectomy and speech rehabilitation.

PubMed

Furia, C L; Kowalski, L P; Latorre, M R; Angelis, E C; Martins, N M; Barros, A P; Ribeiro, K C

2001-07-01

Oral tumor resections cause articulation deficiencies, depending on the site, extent of resection, type of reconstruction, and tongue stump mobility. To evaluate the speech intelligibility of patients undergoing total, subtotal, or partial glossectomy, before and after speech therapy. Twenty-seven patients (24 men and 3 women), aged 34 to 77 years (mean age, 56.5 years), underwent glossectomy. Tumor stages were T1 in 3 patients, T2 in 4, T3 in 8, T4 in 11, and TX in 1; node stages, N0 in 15 patients, N1 in 5, N2a-c in 6, and N3 in 1. No patient had metastases (M0). Patients were divided into 3 groups by extent of tongue resection, ie, total (group 1; n = 6), subtotal (group 2; n = 9), and partial (group 3; n = 12). Different phonological tasks were recorded and analyzed by 3 experienced judges, including sustained 7 oral vowels, vowel in a syllable, and the sequence vowel-consonant-vowel (VCV). The intelligibility of spontaneous speech (sequence story) was scored from 1 to 4 in consensus. All patients underwent a therapeutic program to activate articulatory adaptations, compensations, and maximization of the remaining structures for 3 to 6 months. The tasks were recorded after speech therapy. To compare mean changes, analyses of variance and Wilcoxon tests were used. Patients of groups 1 and 2 significantly improved their speech intelligibility (P<.05). Group 1 improved vowels, VCV, and spontaneous speech; group 2, syllable, VCV, and spontaneous speech. Group 3 demonstrated better intelligibility in the pretherapy phase, but the improvement after therapy was not significant. Speech therapy was effective in improving speech intelligibility of patients undergoing glossectomy, even after major resection. Different pretherapy ability between groups was seen, with improvement of speech intelligibility in groups 1 and 2. The improvement of speech intelligibility in group 3 was not statistically significant, possibly because of the small and heterogeneous sample.
Sensitivity of the Speech Intelligibility Index to the Assumed Dynamic Range

ERIC Educational Resources Information Center

Jin, In-Ki; Kates, James M.; Arehart, Kathryn H.

2017-01-01

Purpose: This study aims to evaluate the sensitivity of the speech intelligibility index (SII) to the assumed speech dynamic range (DR) in different languages and with different types of stimuli. Method: Intelligibility prediction uses the absolute transfer function (ATF) to map the SII value to the predicted intelligibility for a given stimuli.…
Effects of intelligibility on working memory demand for speech perception.

PubMed

Francis, Alexander L; Nusbaum, Howard C

2009-08-01

Understanding low-intelligibility speech is effortful. In three experiments, we examined the effects of intelligibility on working memory (WM) demands imposed by perception of synthetic speech. In all three experiments, a primary speeded word recognition task was paired with a secondary WM-load task designed to vary the availability of WM capacity during speech perception. Speech intelligibility was varied either by training listeners to use available acoustic cues in a more diagnostic manner (as in Experiment 1) or by providing listeners with more informative acoustic cues (i.e., better speech quality, as in Experiments 2 and 3). In the first experiment, training significantly improved intelligibility and recognition speed; increasing WM load significantly slowed recognition. A significant interaction between training and load indicated that the benefit of training on recognition speed was observed only under low memory load. In subsequent experiments, listeners received no training; intelligibility was manipulated by changing synthesizers. Improving intelligibility without training improved recognition accuracy, and increasing memory load still decreased it, but more intelligible speech did not produce more efficient use of available WM capacity. This suggests that perceptual learning modifies the way available capacity is used, perhaps by increasing the use of more phonetically informative features and/or by decreasing use of less informative ones.
Predicting Speech Intelligibility with a Multiple Speech Subsystems Approach in Children with Cerebral Palsy

ERIC Educational Resources Information Center

Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

2014-01-01

Purpose: Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method: Nine acoustic variables reflecting different subsystems, and…
Aircraft noise and speech intelligibility in an outdoor living space.

PubMed

Alvarsson, Jesper J; Nordström, Henrik; Lundén, Peter; Nilsson, Mats E

2014-06-01

Studies of effects on speech intelligibility from aircraft noise in outdoor places are currently lacking. To explore these effects, first-order ambisonic recordings of aircraft noise were reproduced outdoors in a pergola. The average background level was 47 dB LA eq. Lists of phonetically balanced words (LAS max,word = 54 dB) were reproduced simultaneously with aircraft passage noise (LAS max,noise = 72-84 dB). Twenty individually tested listeners wrote down each presented word while seated in the pergola. The main results were (i) aircraft noise negatively affects speech intelligibility at sound pressure levels that exceed those of the speech sound (signal-to-noise ratio, S/N < 0), and (ii) the simple A-weighted S/N ratio was nearly as good an indicator of speech intelligibility as were two more advanced models, the Speech Intelligibility Index and Glasberg and Moore's [J. Audio Eng. Soc. 53, 906-918 (2005)] partial loudness model. This suggests that any of these indicators is applicable for predicting effects of aircraft noise on speech intelligibility outdoors.
Fundamental frequency discrimination and speech perception in noise in cochlear implant simulationsa)

PubMed Central

Carroll, Jeff; Zeng, Fan-Gang

2007-01-01

Increasing the number of channels at low frequencies improves discrimination of fundamental frequency (F0) in cochlear implants [Geurts and Wouters 2004]. We conducted three experiments to test whether improved F0 discrimination can be translated into increased speech intelligibility in noise in a cochlear implant simulation. The first experiment measured F0 discrimination and speech intelligibility in quiet as a function of channel density over different frequency regions. The results from this experiment showed a tradeoff in performance between F0 discrimination and speech intelligibility with a limited number of channels. The second experiment tested whether improved F0 discrimination and optimizing this tradeoff could improve speech performance with a competing talker. However, improved F0 discrimination did not improve speech intelligibility in noise. The third experiment identified the critical number of channels needed at low frequencies to improve speech intelligibility in noise. The result showed that, while 16 channels below 500 Hz were needed to observe any improvement in speech intelligibility in noise, even 32 channels did not achieve normal performance. Theoretically, these results suggest that without accurate spectral coding, F0 discrimination and speech perception in noise are two independent processes. Practically, the present results illustrate the need to increase the number of independent channels in cochlear implants. PMID:17604581
Segmental intelligibility of synthetic speech produced by rule.

PubMed

Logan, J S; Greene, B G; Pisoni, D B

1989-08-01

This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk--Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener's processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener.
Segmental intelligibility of synthetic speech produced by rule

PubMed Central

Logan, John S.; Greene, Beth G.; Pisoni, David B.

2012-01-01

This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk—Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener’s processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener. PMID:2527884

Characterizing Speech Intelligibility in Noise After Wide Dynamic Range Compression.

PubMed

Rhebergen, Koenraad S; Maalderink, Thijs H; Dreschler, Wouter A

The effects of nonlinear signal processing on speech intelligibility in noise are difficult to evaluate. Often, the effects are examined by comparing speech intelligibility scores with and without processing measured at fixed signal to noise ratios (SNRs) or by comparing the adaptive measured speech reception thresholds corresponding to 50% intelligibility (SRT50) with and without processing. These outcome measures might not be optimal. Measuring at fixed SNRs can be affected by ceiling or floor effects, because the range of relevant SNRs is not know in advance. The SRT50 is less time consuming, has a fixed performance level (i.e., 50% correct), but the SRT50 could give a limited view, because we hypothesize that the effect of most nonlinear signal processing algorithms at the SRT50 cannot be generalized to other points of the psychometric function. In this article, we tested the value of estimating the entire psychometric function. We studied the effect of wide dynamic range compression (WDRC) on speech intelligibility in stationary, and interrupted speech-shaped noise in normal-hearing subjects, using a fast method-based local linear fitting approach and by two adaptive procedures. The measured performance differences for conditions with and without WDRC for the psychometric functions in stationary noise and interrupted speech-shaped noise show that the effects of WDRC on speech intelligibility are SNR dependent. We conclude that favorable and unfavorable effects of WDRC on speech intelligibility can be missed if the results are presented in terms of SRT50 values only.
Motivation as a predictor of speech intelligibility after total laryngectomy.

PubMed

Singer, Susanne; Meyer, Alexandra; Fuchs, Michael; Schock, Juliane; Pabst, Friedemann; Vogel, Hans-Joachim; Oeken, Jens; Sandner, Annett; Koscielny, Sven; Hormes, Karl; Breitenstein, Kerstin; Dietz, Andreas

2013-06-01

It has often been argued that if patients' success with speech rehabilitation after laryngectomy is limited, it is the result of lacking motivation on their part. This project investigated the role of motivation in speech rehabilitation. In a multicenter prospective cohort study, 141 laryngectomees were interviewed at the beginning of rehabilitation and 1 year after laryngectomy. Speech intelligibility was measured with a standardized test, and patients self-assessed their own motivation shortly after the surgery. Logistic regression, adjusted for several theory-based confounding factors, was used to assess the impact of motivation on speech intelligibility. Speech intelligibility 1 year after laryngectomy was not significantly associated with the level of motivation at the beginning of rehabilitation (odds ratio [OR], 1.3; 95% confidence interval [CI], 0.7-2.3; p = .43) after adjusting for the effect of potential confounders (implantation of a voice prosthesis, patient's cognitive abilities, frustration tolerance, physical functioning, and type of rehabilitation). Motivation is not a strong predictor of speech intelligibility 1 year after laryngectomy. Copyright © 2012 Wiley Periodicals, Inc.
An algorithm that improves speech intelligibility in noise for normal-hearing listeners.

PubMed

Kim, Gibak; Lu, Yang; Hu, Yi; Loizou, Philipos C

2009-09-01

Traditional noise-suppression algorithms have been shown to improve speech quality, but not speech intelligibility. Motivated by prior intelligibility studies of speech synthesized using the ideal binary mask, an algorithm is proposed that decomposes the input signal into time-frequency (T-F) units and makes binary decisions, based on a Bayesian classifier, as to whether each T-F unit is dominated by the target or the masker. Speech corrupted at low signal-to-noise ratio (SNR) levels (-5 and 0 dB) using different types of maskers is synthesized by this algorithm and presented to normal-hearing listeners for identification. Results indicated substantial improvements in intelligibility (over 60% points in -5 dB babble) over that attained by human listeners with unprocessed stimuli. The findings from this study suggest that algorithms that can estimate reliably the SNR in each T-F unit can improve speech intelligibility.
Predicting Speech Intelligibility Decline in Amyotrophic Lateral Sclerosis Based on the Deterioration of Individual Speech Subsystems

PubMed Central

Yunusova, Yana; Wang, Jun; Zinman, Lorne; Pattee, Gary L.; Berry, James D.; Perry, Bridget; Green, Jordan R.

2016-01-01

Purpose To determine the mechanisms of speech intelligibility impairment due to neurologic impairments, intelligibility decline was modeled as a function of co-occurring changes in the articulatory, resonatory, phonatory, and respiratory subsystems. Method Sixty-six individuals diagnosed with amyotrophic lateral sclerosis (ALS) were studied longitudinally. The disease-related changes in articulatory, resonatory, phonatory, and respiratory subsystems were quantified using multiple instrumental measures, which were subjected to a principal component analysis and mixed effects models to derive a set of speech subsystem predictors. A stepwise approach was used to select the best set of subsystem predictors to model the overall decline in intelligibility. Results Intelligibility was modeled as a function of five predictors that corresponded to velocities of lip and jaw movements (articulatory), number of syllable repetitions in the alternating motion rate task (articulatory), nasal airflow (resonatory), maximum fundamental frequency (phonatory), and speech pauses (respiratory). The model accounted for 95.6% of the variance in intelligibility, among which the articulatory predictors showed the most substantial independent contribution (57.7%). Conclusion Articulatory impairments characterized by reduced velocities of lip and jaw movements and resonatory impairments characterized by increased nasal airflow served as the subsystem predictors of the longitudinal decline of speech intelligibility in ALS. Declines in maximum performance tasks such as the alternating motion rate preceded declines in intelligibility, thus serving as early predictors of bulbar dysfunction. Following the rapid decline in speech intelligibility, a precipitous decline in maximum performance tasks subsequently occurred. PMID:27148967
Predicting Speech Intelligibility Decline in Amyotrophic Lateral Sclerosis Based on the Deterioration of Individual Speech Subsystems.

PubMed

Rong, Panying; Yunusova, Yana; Wang, Jun; Zinman, Lorne; Pattee, Gary L; Berry, James D; Perry, Bridget; Green, Jordan R

2016-01-01

To determine the mechanisms of speech intelligibility impairment due to neurologic impairments, intelligibility decline was modeled as a function of co-occurring changes in the articulatory, resonatory, phonatory, and respiratory subsystems. Sixty-six individuals diagnosed with amyotrophic lateral sclerosis (ALS) were studied longitudinally. The disease-related changes in articulatory, resonatory, phonatory, and respiratory subsystems were quantified using multiple instrumental measures, which were subjected to a principal component analysis and mixed effects models to derive a set of speech subsystem predictors. A stepwise approach was used to select the best set of subsystem predictors to model the overall decline in intelligibility. Intelligibility was modeled as a function of five predictors that corresponded to velocities of lip and jaw movements (articulatory), number of syllable repetitions in the alternating motion rate task (articulatory), nasal airflow (resonatory), maximum fundamental frequency (phonatory), and speech pauses (respiratory). The model accounted for 95.6% of the variance in intelligibility, among which the articulatory predictors showed the most substantial independent contribution (57.7%). Articulatory impairments characterized by reduced velocities of lip and jaw movements and resonatory impairments characterized by increased nasal airflow served as the subsystem predictors of the longitudinal decline of speech intelligibility in ALS. Declines in maximum performance tasks such as the alternating motion rate preceded declines in intelligibility, thus serving as early predictors of bulbar dysfunction. Following the rapid decline in speech intelligibility, a precipitous decline in maximum performance tasks subsequently occurred.
Intelligibility as a clinical outcome measure following intervention with children with phonologically based speech-sound disorders.

PubMed

Lousada, M; Jesus, Luis M T; Hall, A; Joffe, V

2014-01-01

The effectiveness of two treatment approaches (phonological therapy and articulation therapy) for treatment of 14 children, aged 4;0-6;7 years, with phonologically based speech-sound disorder (SSD) has been previously analysed with severity outcome measures (percentage of consonants correct score, percentage occurrence of phonological processes and phonetic inventory). Considering that the ultimate goal of intervention for children with phonologically based SSD is to improve intelligibility, it is curious that intervention studies focusing on children's phonology do not routinely use intelligibility as an outcome measure. It is therefore important that the impact of interventions on speech intelligibility is explored. This paper investigates the effectiveness of the two treatment approaches (phonological therapy and articulation therapy) using intelligibility measures, both in single words and in continuous speech, as the primary outcome. Fourteen children with phonologically based SSD participated in the intervention. The children were randomly assigned to phonological therapy or articulation therapy (seven children in each group). Two assessment methods were used for measuring intelligibility: a word identification task (for single words) and a rating scale (for continuous speech). Twenty-one unfamiliar adults listened and judged the children's intelligibility. Reliability analyses showed overall high agreement between listeners across both methods. Significant improvements were noted in intelligibility in both single words (paired t(6)=4.409, p=0.005) and continuous speech (asymptotic Z=2.371, p=0.018) for the group receiving phonology therapy pre- to post-treatment, but no differences in intelligibility were found for those receiving the articulation therapy pre- to post-treatment, either for single words (paired t(6)=1.763, p=0.128) or continuous speech (asymptotic Z=1.442, p=0.149). Intelligibility measures were sensitive enough to show changes in the phonological therapy group but not in the articulation therapy group. These findings emphasize the importance of using intelligibility as an outcome measure to complement the results obtained with other severity measures when exploring the effectiveness of speech interventions. This study presents new evidence for the effectiveness of phonological therapy in improving intelligibility with children with SSD. © 2014 Royal College of Speech and Language Therapists.
Predicting Intelligibility Gains in Dysarthria through Automated Speech Feature Analysis

ERIC Educational Resources Information Center

Fletcher, Annalise R.; Wisler, Alan A.; McAuliffe, Megan J.; Lansford, Kaitlin L.; Liss, Julie M.

2017-01-01

Purpose: Behavioral speech modifications have variable effects on the intelligibility of speakers with dysarthria. In the companion article, a significant relationship was found between measures of speakers' baseline speech and their intelligibility gains following cues to speak louder and reduce rate (Fletcher, McAuliffe, Lansford, Sinex, &…
Teachers' perceptions of students with speech sound disorders: a quantitative and qualitative analysis.

PubMed

Overby, Megan; Carrell, Thomas; Bernthal, John

2007-10-01

This study examined 2nd-grade teachers' perceptions of the academic, social, and behavioral competence of students with speech sound disorders (SSDs). Forty-eight 2nd-grade teachers listened to 2 groups of sentences differing by intelligibility and pitch but spoken by a single 2nd grader. For each sentence group, teachers rated the speaker's academic, social, and behavioral competence using an adapted version of the Teacher Rating Scale of the Self-Perception Profile for Children (S. Harter, 1985) and completed 3 open-ended questions. The matched-guise design controlled for confounding speaker and stimuli variables that were inherent in prior studies. Statistically significant differences in teachers' expectations of children's academic, social, and behavioral performances were found between moderately intelligible and normal intelligibility speech. Teachers associated moderately intelligible low-pitched speech with more behavior problems than moderately intelligible high-pitched speech or either pitch with normal intelligibility. One third of the teachers reported that they could not accurately predict a child's school performance based on the child's speech skills, one third of the teachers causally related school difficulty to SSD, and one third of the teachers made no comment. Intelligibility and speaker pitch appear to be speech variables that influence teachers' perceptions of children's school performance.
Relationship Between Speech Intelligibility and Speech Comprehension in Babble Noise.

PubMed

Fontan, Lionel; Tardieu, Julien; Gaillard, Pascal; Woisard, Virginie; Ruiz, Robert

2015-06-01

The authors investigated the relationship between the intelligibility and comprehension of speech presented in babble noise. Forty participants listened to French imperative sentences (commands for moving objects) in a multitalker babble background for which intensity was experimentally controlled. Participants were instructed to transcribe what they heard and obey the commands in an interactive environment set up for this purpose. The former test provided intelligibility scores and the latter provided comprehension scores. Collected data revealed a globally weak correlation between intelligibility and comprehension scores (r = .35, p < .001). The discrepancy tended to grow as noise level increased. An analysis of standard deviations showed that variability in comprehension scores increased linearly with noise level, whereas higher variability in intelligibility scores was found for moderate noise level conditions. These results support the hypothesis that intelligibility scores are poor predictors of listeners' comprehension in real communication situations. Intelligibility and comprehension scores appear to provide different insights, the first measure being centered on speech signal transfer and the second on communicative performance. Both theoretical and practical implications for the use of speech intelligibility tests as indicators of speakers' performances are discussed.
Bidirectional clear speech perception benefit for native and high-proficiency non-native talkers and listeners: Intelligibility and accentednessa

PubMed Central

Smiljanić, Rajka; Bradlow, Ann R.

2011-01-01

This study investigated how native language background interacts with speaking style adaptations in determining levels of speech intelligibility. The aim was to explore whether native and high proficiency non-native listeners benefit similarly from native and non-native clear speech adjustments. The sentence-in-noise perception results revealed that fluent non-native listeners gained a large clear speech benefit from native clear speech modifications. Furthermore, proficient non-native talkers in this study implemented conversational-to-clear speaking style modifications in their second language (L2) that resulted in significant intelligibility gain for both native and non-native listeners. The results of the accentedness ratings obtained for native and non-native conversational and clear speech sentences showed that while intelligibility was improved, the presence of foreign accent remained constant in both speaking styles. This suggests that objective intelligibility and subjective accentedness are two independent dimensions of non-native speech. Overall, these results provide strong evidence that greater experience in L2 processing leads to improved intelligibility in both production and perception domains. These results also demonstrated that speaking style adaptations along with less signal distortion can contribute significantly towards successful native and non-native interactions. PMID:22225056
The influence of visual speech information on the intelligibility of English consonants produced by non-native speakers.

PubMed

Kawase, Saya; Hannah, Beverly; Wang, Yue

2014-09-01

This study examines how visual speech information affects native judgments of the intelligibility of speech sounds produced by non-native (L2) speakers. Native Canadian English perceivers as judges perceived three English phonemic contrasts (/b-v, θ-s, l-ɹ/) produced by native Japanese speakers as well as native Canadian English speakers as controls. These stimuli were presented under audio-visual (AV, with speaker voice and face), audio-only (AO), and visual-only (VO) conditions. The results showed that, across conditions, the overall intelligibility of Japanese productions of the native (Japanese)-like phonemes (/b, s, l/) was significantly higher than the non-Japanese phonemes (/v, θ, ɹ/). In terms of visual effects, the more visually salient non-Japanese phonemes /v, θ/ were perceived as significantly more intelligible when presented in the AV compared to the AO condition, indicating enhanced intelligibility when visual speech information is available. However, the non-Japanese phoneme /ɹ/ was perceived as less intelligible in the AV compared to the AO condition. Further analysis revealed that, unlike the native English productions, the Japanese speakers produced /ɹ/ without visible lip-rounding, indicating that non-native speakers' incorrect articulatory configurations may decrease the degree of intelligibility. These results suggest that visual speech information may either positively or negatively affect L2 speech intelligibility.
Construct-related validity of the TOCS measures: comparison of intelligibility and speaking rate scores in children with and without speech disorders.

PubMed

Hodge, Megan M; Gotzke, Carrie L

2014-01-01

This study evaluated construct-related validity of the Test of Children's Speech (TOCS). Intelligibility scores obtained using open-set word identification tasks (orthographic transcription) for the TOCS word and sentence tests and rate scores for the TOCS sentence test (words per minute or WPM and intelligible words per minute or IWPM) were compared for a group of 15 adults (18-30 years of age) with normal speech production and three groups of children: 48 3-6 year-olds with typical speech development and neurological histories (TDS), 48 3-6 year-olds with a speech sound disorder of unknown origin and no identified neurological impairment (SSD-UNK), and 22 3-10 year-olds with dysarthria and cerebral palsy (DYS). As expected, mean intelligibility scores and rates increased with age in the TDS group. However, word test intelligibility, WPM and IWPM scores for the 6 year-olds in the TDS group were significantly lower than those for the adults. The DYS group had significantly lower word and sentence test intelligibility and WPM and IWPM scores than the TDS and SSD-UNK groups. Compared to the TDS group, the SSD-UNK group also had significantly lower intelligibility scores for the word and sentence tests, and significantly lower IWPM, but not WPM scores on the sentence test. The results support the construct-related validity of TOCS as a tool for obtaining intelligibility and rate scores that are sensitive to group differences in 3-6 year-old children, with and without speech sound disorders, and to 3+ year-old children with speech disorders, with and without dysarthria. Readers will describe the word and sentence intelligibility and speaking rate performance of children with typically developing speech at age levels of 3, 4, 5 and 6 years, as measured by the Test of Children's Speech, and how these compare with adult speakers and two groups of children with speech disorders. They will also recognize what measures on this test differentiate children with speech sound disorders of unknown origin from children with cerebral palsy and dysarthria. Copyright © 2014 Elsevier Inc. All rights reserved.
Speech outcomes in Cantonese patients after glossectomy.

PubMed

Wong, Ripley Kit; Poon, Esther Sok-Man; Woo, Cynthia Yuen-Man; Chan, Sabina Ching-Shun; Wong, Elsa Siu-Ping; Chu, Ada Wai-Sze

2007-08-01

We sought to determine the major factors affecting speech production of Cantonese-speaking glossectomized patients. Error pattern was analyzed. Forty-one Cantonese-speaking subjects who had undergone glossectomy > or = 6 months previously were recruited. Speech production evaluation included (1) phonetic error analysis in nonsense syllable; (2) speech intelligibility in sentences evaluated by naive listeners; (3) overall speech intelligibility in conversation evaluated by experienced speech therapists. Patients receiving adjuvant radiotherapy had significantly poorer segmental and connected speech production. Total or subtotal glossectomy also resulted in poor speech outcomes. Patients having free flap reconstruction showed the best speech outcomes. Patients without lymph node metastasis had significantly better speech scores when compared with patients with lymph node metastasis. Initial consonant production had the worst scores, while vowel production was the least affected. Speech outcomes of Cantonese-speaking glossectomized patients depended on the severity of the disease. Initial consonants had the greatest effect on speech intelligibility.
Speech Intelligibility Predicted from Neural Entrainment of the Speech Envelope.

PubMed

Vanthornhout, Jonas; Decruy, Lien; Wouters, Jan; Simon, Jonathan Z; Francart, Tom

2018-04-01

Speech intelligibility is currently measured by scoring how well a person can identify a speech signal. The results of such behavioral measures reflect neural processing of the speech signal, but are also influenced by language processing, motivation, and memory. Very often, electrophysiological measures of hearing give insight in the neural processing of sound. However, in most methods, non-speech stimuli are used, making it hard to relate the results to behavioral measures of speech intelligibility. The use of natural running speech as a stimulus in electrophysiological measures of hearing is a paradigm shift which allows to bridge the gap between behavioral and electrophysiological measures. Here, by decoding the speech envelope from the electroencephalogram, and correlating it with the stimulus envelope, we demonstrate an electrophysiological measure of neural processing of running speech. We show that behaviorally measured speech intelligibility is strongly correlated with our electrophysiological measure. Our results pave the way towards an objective and automatic way of assessing neural processing of speech presented through auditory prostheses, reducing confounds such as attention and cognitive capabilities. We anticipate that our electrophysiological measure will allow better differential diagnosis of the auditory system, and will allow the development of closed-loop auditory prostheses that automatically adapt to individual users.
Female voice communications in high levels of aircraft cockpit noises--Part I: spectra, levels, and microphones.

PubMed

Nixon, C W; Morris, L J; McCavitt, A R; McKinley, R L; Anderson, T R; McDaniel, M P; Yeager, D G

1998-07-01

Female produced speech, although more intelligible than male speech in some noise spectra, may be more vulnerable to degradation by high levels of some military aircraft cockpit noises. The acoustic features of female speech are higher in frequency, lower in power, and appear more susceptible than male speech to masking by some of these military noises. Current military aircraft voice communication systems were optimized for the male voice and may not adequately accommodate the female voice in these high level noises. This applied study investigated the intelligibility of female and male speech produced in the noise spectra of four military aircraft cockpits at levels ranging from 95 dB to 115 dB. The experimental subjects used standard flight helmets and headsets, noise-canceling microphones, and military aircraft voice communications systems during the measurements. The intelligibility of female speech was lower than that of male speech for all experimental conditions; however, differences were small and insignificant except at the highest levels of the cockpit noises. Intelligibility for both genders varied with aircraft noise spectrum and level. Speech intelligibility of both genders was acceptable during normal cruise noises of all four aircraft, but improvements are required in the higher levels of noise created during aircraft maximum operating conditions. The intelligibility of female speech was unacceptable at the highest measured noise level of 115 dB and may constitute a problem for other military aviators. The intelligibility degradation due to the noise can be neutralized by use of an available, improved noise-canceling microphone, by the application of current active noise reduction technology to the personal communication equipment, and by the development of a voice communications system to accommodate the speech produced by both female and male aviators.
Effects of Audio-Visual Information on the Intelligibility of Alaryngeal Speech

ERIC Educational Resources Information Center

Evitts, Paul M.; Portugal, Lindsay; Van Dine, Ami; Holler, Aline

2010-01-01

Background: There is minimal research on the contribution of visual information on speech intelligibility for individuals with a laryngectomy (IWL). Aims: The purpose of this project was to determine the effects of mode of presentation (audio-only, audio-visual) on alaryngeal speech intelligibility. Method: Twenty-three naive listeners were…
Speech Intelligibility Advantages using an Acoustic Beamformer Display

NASA Technical Reports Server (NTRS)

Begault, Durand R.; Sunder, Kaushik; Godfroy, Martine; Otto, Peter

2015-01-01

A speech intelligibility test conforming to the Modified Rhyme Test of ANSI S3.2 "Method for Measuring the Intelligibility of Speech Over Communication Systems" was conducted using a prototype 12-channel acoustic beamformer system. The target speech material (signal) was identified against speech babble (noise), with calculated signal-noise ratios of 0, 5 and 10 dB. The signal was delivered at a fixed beam orientation of 135 deg (re 90 deg as the frontal direction of the array) and the noise at 135 deg (co-located) and 0 deg (separated). A significant improvement in intelligibility from 57% to 73% was found for spatial separation for the same signal-noise ratio (0 dB). Significant effects for improved intelligibility due to spatial separation were also found for higher signal-noise ratios (5 and 10 dB).
The Pathways for Intelligible Speech: Multivariate and Univariate Perspectives

PubMed Central

Evans, S.; Kyong, J.S.; Rosen, S.; Golestani, N.; Warren, J.E.; McGettigan, C.; Mourão-Miranda, J.; Wise, R.J.S.; Scott, S.K.

2014-01-01

An anterior pathway, concerned with extracting meaning from sound, has been identified in nonhuman primates. An analogous pathway has been suggested in humans, but controversy exists concerning the degree of lateralization and the precise location where responses to intelligible speech emerge. We have demonstrated that the left anterior superior temporal sulcus (STS) responds preferentially to intelligible speech (Scott SK, Blank CC, Rosen S, Wise RJS. 2000. Identification of a pathway for intelligible speech in the left temporal lobe. Brain. 123:2400–2406.). A functional magnetic resonance imaging study in Cerebral Cortex used equivalent stimuli and univariate and multivariate analyses to argue for the greater importance of bilateral posterior when compared with the left anterior STS in responding to intelligible speech (Okada K, Rong F, Venezia J, Matchin W, Hsieh IH, Saberi K, Serences JT,Hickok G. 2010. Hierarchical organization of human auditory cortex: evidence from acoustic invariance in the response to intelligible speech. 20: 2486–2495.). Here, we also replicate our original study, demonstrating that the left anterior STS exhibits the strongest univariate response and, in decoding using the bilateral temporal cortex, contains the most informative voxels showing an increased response to intelligible speech. In contrast, in classifications using local “searchlights” and a whole brain analysis, we find greater classification accuracy in posterior rather than anterior temporal regions. Thus, we show that the precise nature of the multivariate analysis used will emphasize different response profiles associated with complex sound to speech processing. PMID:23585519
Exploring the roles of spectral detail and intonation contour in speech intelligibility: an FMRI study.

PubMed

Kyong, Jeong S; Scott, Sophie K; Rosen, Stuart; Howe, Timothy B; Agnew, Zarinah K; McGettigan, Carolyn

2014-08-01

The melodic contour of speech forms an important perceptual aspect of tonal and nontonal languages and an important limiting factor on the intelligibility of speech heard through a cochlear implant. Previous work exploring the neural correlates of speech comprehension identified a left-dominant pathway in the temporal lobes supporting the extraction of an intelligible linguistic message, whereas the right anterior temporal lobe showed an overall preference for signals clearly conveying dynamic pitch information [Johnsrude, I. S., Penhune, V. B., & Zatorre, R. J. Functional specificity in the right human auditory cortex for perceiving pitch direction. Brain, 123, 155-163, 2000; Scott, S. K., Blank, C. C., Rosen, S., & Wise, R. J. Identification of a pathway for intelligible speech in the left temporal lobe. Brain, 123, 2400-2406, 2000]. The current study combined modulations of overall intelligibility (through vocoding and spectral inversion) with a manipulation of pitch contour (normal vs. falling) to investigate the processing of spoken sentences in functional MRI. Our overall findings replicate and extend those of Scott et al. [Scott, S. K., Blank, C. C., Rosen, S., & Wise, R. J. Identification of a pathway for intelligible speech in the left temporal lobe. Brain, 123, 2400-2406, 2000], where greater sentence intelligibility was predominately associated with increased activity in the left STS, and the greatest response to normal sentence melody was found in right superior temporal gyrus. These data suggest a spatial distinction between brain areas associated with intelligibility and those involved in the processing of dynamic pitch information in speech. By including a set of complexity-matched unintelligible conditions created by spectral inversion, this is additionally the first study reporting a fully factorial exploration of spectrotemporal complexity and spectral inversion as they relate to the neural processing of speech intelligibility. Perhaps surprisingly, there was little evidence for an interaction between the two factors-we discuss the implications for the processing of sound and speech in the dorsolateral temporal lobes.
Vowels in clear and conversational speech: Talker differences in acoustic characteristics and intelligibility for normal-hearing listeners

NASA Astrophysics Data System (ADS)

Hargus Ferguson, Sarah; Kewley-Port, Diane

2002-05-01

Several studies have shown that when a talker is instructed to speak as though talking to a hearing-impaired person, the resulting ``clear'' speech is significantly more intelligible than typical conversational speech. Recent work in this lab suggests that talkers vary in how much their intelligibility improves when they are instructed to speak clearly. The few studies examining acoustic characteristics of clear and conversational speech suggest that these differing clear speech effects result from different acoustic strategies on the part of individual talkers. However, only two studies to date have directly examined differences among talkers producing clear versus conversational speech, and neither included acoustic analysis. In this project, clear and conversational speech was recorded from 41 male and female talkers aged 18-45 years. A listening experiment demonstrated that for normal-hearing listeners in noise, vowel intelligibility varied widely among the 41 talkers for both speaking styles, as did the magnitude of the speaking style effect. Acoustic analyses using stimuli from a subgroup of talkers shown to have a range of speaking style effects will be used to assess specific acoustic correlates of vowel intelligibility in clear and conversational speech. [Work supported by NIHDCD-02229.

Comparison of speech intelligibility in cockpit noise using SPH-4 flight helmet with and without active noise reduction

NASA Technical Reports Server (NTRS)

Chan, Jeffrey W.; Simpson, Carol A.

1990-01-01

Active Noise Reduction (ANR) is a new technology which can reduce the level of aircraft cockpit noise that reaches the pilot's ear while simultaneously improving the signal to noise ratio for voice communications and other information bearing sound signals in the cockpit. A miniature, ear-cup mounted ANR system was tested to determine whether speech intelligibility is better for helicopter pilots using ANR compared to a control condition of ANR turned off. Two signal to noise ratios (S/N), representative of actual cockpit conditions, were used for the ratio of the speech to cockpit noise sound pressure levels. Speech intelligibility was significantly better with ANR compared to no ANR for both S/N conditions. Variability of speech intelligibility among pilots was also significantly less with ANR. When the stock helmet was used with ANR turned off, the average PB Word speech intelligibility score was below the Normally Acceptable level. In comparison, it was above that level with ANR on in both S/N levels.
Measures for assessing architectural speech security (privacy) of closed offices and meeting rooms.

PubMed

Gover, Bradford N; Bradley, John S

2004-12-01

Objective measures were investigated as predictors of the speech security of closed offices and rooms. A new signal-to-noise type measure is shown to be a superior indicator for security than existing measures such as the Articulation Index, the Speech Intelligibility Index, the ratio of the loudness of speech to that of noise, and the A-weighted level difference of speech and noise. This new measure is a weighted sum of clipped one-third-octave-band signal-to-noise ratios; various weightings and clipping levels are explored. Listening tests had 19 subjects rate the audibility and intelligibility of 500 English sentences, filtered to simulate transmission through various wall constructions, and presented along with background noise. The results of the tests indicate that the new measure is highly correlated with sentence intelligibility scores and also with three security thresholds: the threshold of intelligibility (below which speech is unintelligible), the threshold of cadence (below which the cadence of speech is inaudible), and the threshold of audibility (below which speech is inaudible). The ratio of the loudness of speech to that of noise, and simple A-weighted level differences are both shown to be well correlated with these latter two thresholds (cadence and audibility), but not well correlated with intelligibility.
Of Mouths and Men: Non-Native Listeners' Identification and Evaluation of Varieties of English.

ERIC Educational Resources Information Center

Jarvella, Robert J.; Bang, Eva; Jakobsen, Arnt Lykke; Mees, Inger M.

2001-01-01

Advanced Danish students of English tried to identify the national origin of young men from Ireland, Scotland, England, and the United States from their speech and then rated the speech for attractiveness. Listeners rated speech produced by Englishmen as most attractive, and speech by Americans as least attractive. (Author/VWL)
Intelligibility of Noise-Adapted and Clear Speech in Child, Young Adult, and Older Adult Talkers

ERIC Educational Resources Information Center

Smiljanic, Rajka; Gilbert, Rachael C.

2017-01-01

Purpose: This study examined intelligibility of conversational and clear speech sentences produced in quiet and in noise by children, young adults, and older adults. Relative talker intelligibility was assessed across speaking styles. Method: Sixty-one young adult participants listened to sentences mixed with speech-shaped noise at -5 dB…
Rhythm Perception and Its Role in Perception and Learning of Dysrhythmic Speech.

PubMed

Borrie, Stephanie A; Lansford, Kaitlin L; Barrett, Tyson S

2017-03-01

The perception of rhythm cues plays an important role in recognizing spoken language, especially in adverse listening conditions. Indeed, this has been shown to hold true even when the rhythm cues themselves are dysrhythmic. This study investigates whether expertise in rhythm perception provides a processing advantage for perception (initial intelligibility) and learning (intelligibility improvement) of naturally dysrhythmic speech, dysarthria. Fifty young adults with typical hearing participated in 3 key tests, including a rhythm perception test, a receptive vocabulary test, and a speech perception and learning test, with standard pretest, familiarization, and posttest phases. Initial intelligibility scores were calculated as the proportion of correct pretest words, while intelligibility improvement scores were calculated by subtracting this proportion from the proportion of correct posttest words. Rhythm perception scores predicted intelligibility improvement scores but not initial intelligibility. On the other hand, receptive vocabulary scores predicted initial intelligibility scores but not intelligibility improvement. Expertise in rhythm perception appears to provide an advantage for processing dysrhythmic speech, but a familiarization experience is required for the advantage to be realized. Findings are discussed in relation to the role of rhythm in speech processing and shed light on processing models that consider the consequence of rhythm abnormalities in dysarthria.
Speech intelligibility in hospitals.

PubMed

Ryherd, Erica E; Moeller, Michael; Hsu, Timothy

2013-07-01

Effective communication between staff members is key to patient safety in hospitals. A variety of patient care activities including admittance, evaluation, and treatment rely on oral communication. Surprisingly, published information on speech intelligibility in hospitals is extremely limited. In this study, speech intelligibility measurements and occupant evaluations were conducted in 20 units of five different U.S. hospitals. A variety of unit types and locations were studied. Results show that overall, no unit had "good" intelligibility based on the speech intelligibility index (SII > 0.75) and several locations found to have "poor" intelligibility (SII < 0.45). Further, occupied spaces were found to have 10%-15% lower SII than unoccupied spaces on average. Additionally, staff perception of communication problems at nurse stations was significantly correlated with SII ratings. In a targeted second phase, a unit treated with sound absorption had higher SII ratings for a larger percentage of time as compared to an identical untreated unit. Taken as a whole, the study provides an extensive baseline evaluation of speech intelligibility across a variety of hospitals and unit types, offers some evidence of the positive impact of absorption on intelligibility, and identifies areas for future research.
The benefit of combining a deep neural network architecture with ideal ratio mask estimation in computational speech segregation to improve speech intelligibility.

PubMed

Bentsen, Thomas; May, Tobias; Kressner, Abigail A; Dau, Torsten

2018-01-01

Computational speech segregation attempts to automatically separate speech from noise. This is challenging in conditions with interfering talkers and low signal-to-noise ratios. Recent approaches have adopted deep neural networks and successfully demonstrated speech intelligibility improvements. A selection of components may be responsible for the success with these state-of-the-art approaches: the system architecture, a time frame concatenation technique and the learning objective. The aim of this study was to explore the roles and the relative contributions of these components by measuring speech intelligibility in normal-hearing listeners. A substantial improvement of 25.4 percentage points in speech intelligibility scores was found going from a subband-based architecture, in which a Gaussian Mixture Model-based classifier predicts the distributions of speech and noise for each frequency channel, to a state-of-the-art deep neural network-based architecture. Another improvement of 13.9 percentage points was obtained by changing the learning objective from the ideal binary mask, in which individual time-frequency units are labeled as either speech- or noise-dominated, to the ideal ratio mask, where the units are assigned a continuous value between zero and one. Therefore, both components play significant roles and by combining them, speech intelligibility improvements were obtained in a six-talker condition at a low signal-to-noise ratio.
Perception of intelligibility and qualities of non-native accented speakers.

PubMed

Fuse, Akiko; Navichkova, Yuliya; Alloggio, Krysteena

To provide effective treatment to clients, speech-language pathologists must be understood, and be perceived to demonstrate the personal qualities necessary for therapeutic practice (e.g., resourcefulness and empathy). One factor that could interfere with the listener's perception of non-native speech is the speaker's accent. The current study explored the relationship between how accurately listeners could understand non-native speech and their perceptions of personal attributes of the speaker. Additionally, this study investigated how listeners' familiarity and experience with other languages may influence their perceptions of non-native accented speech. Through an online survey, native monolingual and bilingual English listeners rated four non-native accents (i.e., Spanish, Chinese, Russian, and Indian) on perceived intelligibility and perceived personal qualities (i.e., professionalism, intelligence, resourcefulness, empathy, and patience) necessary for speech-language pathologists. The results indicated significant relationships between the perception of intelligibility and the perception of personal qualities (i.e., professionalism, intelligence, and resourcefulness) attributed to non-native speakers. However, these findings were not supported for the Chinese accent. Bilingual listeners judged the non-native speech as more intelligible in comparison to monolingual listeners. No significant differences were found in the ratings between bilingual listeners who share the same language background as the speaker and other bilingual listeners. Based on the current findings, greater perception of intelligibility was the key to promoting a positive perception of personal qualities such as professionalism, intelligence, and resourcefulness, important for speech-language pathologists. The current study found evidence to support the claim that bilinguals have a greater ability in understanding non-native accented speech compared to monolingual listeners. The results, however, did not confirm an advantage for bilingual listeners sharing the same language backgrounds with the non-native speaker over other bilingual listeners. Copyright © 2017 Elsevier Inc. All rights reserved.
The relationship of speech intelligibility with hearing sensitivity, cognition, and perceived hearing difficulties varies for different speech perception tests

PubMed Central

Heinrich, Antje; Henshaw, Helen; Ferguson, Melanie A.

2015-01-01

Listeners vary in their ability to understand speech in noisy environments. Hearing sensitivity, as measured by pure-tone audiometry, can only partly explain these results, and cognition has emerged as another key concept. Although cognition relates to speech perception, the exact nature of the relationship remains to be fully understood. This study investigates how different aspects of cognition, particularly working memory and attention, relate to speech intelligibility for various tests. Perceptual accuracy of speech perception represents just one aspect of functioning in a listening environment. Activity and participation limits imposed by hearing loss, in addition to the demands of a listening environment, are also important and may be better captured by self-report questionnaires. Understanding how speech perception relates to self-reported aspects of listening forms the second focus of the study. Forty-four listeners aged between 50 and 74 years with mild sensorineural hearing loss were tested on speech perception tests differing in complexity from low (phoneme discrimination in quiet), to medium (digit triplet perception in speech-shaped noise) to high (sentence perception in modulated noise); cognitive tests of attention, memory, and non-verbal intelligence quotient; and self-report questionnaires of general health-related and hearing-specific quality of life. Hearing sensitivity and cognition related to intelligibility differently depending on the speech test: neither was important for phoneme discrimination, hearing sensitivity alone was important for digit triplet perception, and hearing and cognition together played a role in sentence perception. Self-reported aspects of auditory functioning were correlated with speech intelligibility to different degrees, with digit triplets in noise showing the richest pattern. The results suggest that intelligibility tests can vary in their auditory and cognitive demands and their sensitivity to the challenges that auditory environments pose on functioning. PMID:26136699
A comparative intelligibility study of single-microphone noise reduction algorithms.

PubMed

Hu, Yi; Loizou, Philipos C

2007-09-01

The evaluation of intelligibility of noise reduction algorithms is reported. IEEE sentences and consonants were corrupted by four types of noise including babble, car, street and train at two signal-to-noise ratio levels (0 and 5 dB), and then processed by eight speech enhancement methods encompassing four classes of algorithms: spectral subtractive, sub-space, statistical model based and Wiener-type algorithms. The enhanced speech was presented to normal-hearing listeners for identification. With the exception of a single noise condition, no algorithm produced significant improvements in speech intelligibility. Information transmission analysis of the consonant confusion matrices indicated that no algorithm improved significantly the place feature score, significantly, which is critically important for speech recognition. The algorithms which were found in previous studies to perform the best in terms of overall quality, were not the same algorithms that performed the best in terms of speech intelligibility. The subspace algorithm, for instance, was previously found to perform the worst in terms of overall quality, but performed well in the present study in terms of preserving speech intelligibility. Overall, the analysis of consonant confusion matrices suggests that in order for noise reduction algorithms to improve speech intelligibility, they need to improve the place and manner feature scores.
Effect of classroom acoustics on the speech intelligibility of students.

PubMed

Rabelo, Alessandra Terra Vasconcelos; Santos, Juliana Nunes; Oliveira, Rafaella Cristina; Magalhães, Max de Castro

2014-01-01

To analyze the acoustic parameters of classrooms and the relationship among equivalent sound pressure level (Leq), reverberation time (T₃₀), the Speech Transmission Index (STI), and the performance of students in speech intelligibility testing. A cross-sectional descriptive study, which analyzed the acoustic performance of 18 classrooms in 9 public schools in Belo Horizonte, Minas Gerais, Brazil, was conducted. The following acoustic parameters were measured: Leq, T₃₀, and the STI. In the schools evaluated, a speech intelligibility test was performed on 273 students, 45.4% of whom were boys, with an average age of 9.4 years. The results of the speech intelligibility test were compared to the values of the acoustic parameters with the help of Student's t-test. The Leq, T₃₀, and STI tests were conducted in empty and furnished classrooms. Children showed better results in speech intelligibility tests conducted in classrooms with less noise, a lower T₃₀, and greater STI values. The majority of classrooms did not meet the recommended regulatory standards for good acoustic performance. Acoustic parameters have a direct effect on the speech intelligibility of students. Noise contributes to a decrease in their understanding of information presented orally, which can lead to negative consequences in their education and their social integration as future professionals.
Proximate factors associated with speech intelligibility in children with cochlear implants: A preliminary study.

PubMed

Chin, Steven B; Kuhns, Matthew J

2014-01-01

The purpose of this descriptive pilot study was to examine possible relationships among speech intelligibility and structural characteristics of speech in children who use cochlear implants. The Beginners Intelligibility Test (BIT) was administered to 10 children with cochlear implants, and the intelligibility of the words in the sentences was judged by panels of naïve adult listeners. Additionally, several qualitative and quantitative measures of word omission, segment correctness, duration, and intonation variability were applied to the sentences used to assess intelligibility. Correlational analyses were conducted to determine if BIT scores and the other speech parameters were related. There was a significant correlation between BIT score and percent words omitted, but no other variables correlated significantly with BIT score. The correlation between intelligibility and word omission may be task-specific as well as reflective of memory limitations.
Minimal Pair Distinctions and Intelligibility in Preschool Children with and without Speech Sound Disorders

ERIC Educational Resources Information Center

Hodge, Megan M.; Gotzke, Carrie L.

2011-01-01

Listeners' identification of young children's productions of minimally contrastive words and predictive relationships between accurately identified words and intelligibility scores obtained from a 100-word spontaneous speech sample were determined for 36 children with typically developing speech (TDS) and 36 children with speech sound disorders…
Enhancing Speech Intelligibility: Interactions among Context, Modality, Speech Style, and Masker

ERIC Educational Resources Information Center

Van Engen, Kristin J.; Phelps, Jasmine E. B.; Smiljanic, Rajka; Chandrasekaran, Bharath

2014-01-01

Purpose: The authors sought to investigate interactions among intelligibility-enhancing speech cues (i.e., semantic context, clearly produced speech, and visual information) across a range of masking conditions. Method: Sentence recognition in noise was assessed for 29 normal-hearing listeners. Testing included semantically normal and anomalous…
Factors influencing relative speech intelligibility in patients with oral squamous cell carcinoma: a prospective study using automatic, computer-based speech analysis.

PubMed

Stelzle, F; Knipfer, C; Schuster, M; Bocklet, T; Nöth, E; Adler, W; Schempf, L; Vieler, P; Riemann, M; Neukam, F W; Nkenke, E

2013-11-01

Oral squamous cell carcinoma (OSCC) and its treatment impair speech intelligibility by alteration of the vocal tract. The aim of this study was to identify the factors of oral cancer treatment that influence speech intelligibility by means of an automatic, standardized speech-recognition system. The study group comprised 71 patients (mean age 59.89, range 35-82 years) with OSCC ranging from stage T1 to T4 (TNM staging). Tumours were located on the tongue (n=23), lower alveolar crest (n=27), and floor of the mouth (n=21). Reconstruction was conducted through local tissue plasty or microvascular transplants. Adjuvant radiotherapy was performed in 49 patients. Speech intelligibility was evaluated before, and at 3, 6, and 12 months after tumour resection, and compared to that of a healthy control group (n=40). Postoperatively, significant influences on speech intelligibility were tumour localization (P=0.010) and resection volume (P=0.019). Additionally, adjuvant radiotherapy (P=0.049) influenced intelligibility at 3 months after surgery. At 6 months after surgery, influences were resection volume (P=0.028) and adjuvant radiotherapy (P=0.034). The influence of tumour localization (P=0.001) and adjuvant radiotherapy (P=0.022) persisted after 12 months. Tumour localization, resection volume, and radiotherapy are crucial factors for speech intelligibility. Radiotherapy significantly impaired word recognition rate (WR) values with a progression of the impairment for up to 12 months after surgery. Copyright © 2013 International Association of Oral and Maxillofacial Surgeons. Published by Elsevier Ltd. All rights reserved.
Effect of the Number of Presentations on Listener Transcriptions and Reliability in the Assessment of Speech Intelligibility in Children

ERIC Educational Resources Information Center

Lagerberg, Tove B.; Johnels, Jakob Åsberg; Hartelius, Lena; Persson, Christina

2015-01-01

Background: The assessment of intelligibility is an essential part of establishing the severity of a speech disorder. The intelligibility of a speaker is affected by a number of different variables relating, "inter alia," to the speech material, the listener and the listener task. Aims: To explore the impact of the number of…
Effect of Whole-Body Vibration on Speech. Part 2; Effect on Intelligibility

NASA Technical Reports Server (NTRS)

Begault, Durand R.

2011-01-01

The effect on speech intelligibility was measured for speech where talkers reading Diagnostic Rhyme Test material were exposed to 0.7 g whole body vibration to simulate space vehicle launch. Across all talkers, the effect of vibration was to degrade the percentage of correctly transcribed words from 83% to 74%. The magnitude of the effect of vibration on speech communication varies between individuals, for both talkers and listeners. A worst case scenario for intelligibility would be the most sensitive listener hearing the most sensitive talker; one participant s intelligibility was reduced by 26% (97% to 71%) for one of the talkers.
Exploring the Roles of Spectral Detail and Intonation Contour in Speech Intelligibility: An fMRI Study

PubMed Central

Kyong, Jeong S.; Scott, Sophie K.; Rosen, Stuart; Howe, Timothy B.; Agnew, Zarinah K.; McGettigan, Carolyn

2014-01-01

The melodic contour of speech forms an important perceptual aspect of tonal and nontonal languages and an important limiting factor on the intelligibility of speech heard through a cochlear implant. Previous work exploring the neural correlates of speech comprehension identified a left-dominant pathway in the temporal lobes supporting the extraction of an intelligible linguistic message, whereas the right anterior temporal lobe showed an overall preference for signals clearly conveying dynamic pitch information. The current study combined modulations of overall intelligibility (through vocoding and spectral inversion) with a manipulation of pitch contour (normal vs. falling) to investigate the processing of spoken sentences in functional MRI. Our overall findings replicate and extend those of Scott et al., whereas greater sentence intelligibility was predominately associated with increased activity in the left STS, the greatest response to normal sentence melody was found right superior temporal gyrus. These data suggest a spatial distinction between brain areas associated with intelligibility and those involved in the processing of dynamic pitch information in speech. By including a set of complexity-matched unintelligible conditions created by spectral inversion, this is additionally the first study reporting a fully factorial exploration of spectrotemporal complexity and spectral inversion as they relate to the neural processing of speech intelligibility. Perhaps surprisingly, there was no evidence for an interaction between the two factors—we discuss the implications for the processing of sound and speech in the dorsolateral temporal lobes. PMID:24568205
Talker differences in clear and conversational speech: Vowel intelligibility for normal-hearing listeners

NASA Astrophysics Data System (ADS)

Hargus Ferguson, Sarah

2004-10-01

Several studies have shown that when a talker is instructed to speak as though talking to a hearing-impaired person, the resulting ``clear'' speech is significantly more intelligible than typical conversational speech. While variability among talkers during speech production is well known, only one study to date [Gagné et al., J. Acad. Rehab. Audiol. 27, 135-158 (1994)] has directly examined differences among talkers producing clear and conversational speech. Data from that study, which utilized ten talkers, suggested that talkers vary in the extent to which they improve their intelligibility by speaking clearly. Similar variability can be also seen in studies using smaller groups of talkers [e.g., Picheny, Durlach, and Braida, J. Speech Hear. Res. 28, 96-103 (1985)]. In the current paper, clear and conversational speech materials were recorded from 41 male and female talkers aged 18 to 45 years. A listening experiment demonstrated that for normal-hearing listeners in noise, vowel intelligibility varied widely among the 41 talkers for both speaking styles, as did the magnitude of the speaking style effect. While female talkers showed a larger clear speech vowel intelligibility benefit than male talkers, neither talker age nor prior experience communicating with hearing-impaired listeners significantly affected the speaking style effect. .
Identification of a pathway for intelligible speech in the left temporal lobe

PubMed Central

Scott, Sophie K.; Blank, C. Catrin; Rosen, Stuart; Wise, Richard J. S.

2017-01-01

Summary It has been proposed that the identification of sounds, including species-specific vocalizations, by primates depends on anterior projections from the primary auditory cortex, an auditory pathway analogous to the ventral route proposed for the visual identification of objects. We have identified a similar route in the human for understanding intelligible speech. Using PET imaging to identify separable neural subsystems within the human auditory cortex, we used a variety of speech and speech-like stimuli with equivalent acoustic complexity but varying intelligibility. We have demonstrated that the left superior temporal sulcus responds to the presence of phonetic information, but its anterior part only responds if the stimulus is also intelligible. This novel observation demonstrates a left anterior temporal pathway for speech comprehension. PMID:11099443

Speech Intelligibility of Profoundly Deaf Pediatric Hearing Aid Users.

ERIC Educational Resources Information Center

Svirsky, Mario A.; Chin, Steven B.; Miyamoto, Richard T.; Sloan, Robert B.; Caldwell, Matthew D.

2000-01-01

A study examined the speech intelligibility of children (ages 1-15) with deafness who use hearing aids. Data revealed a strong significant trend toward higher intelligibility for children with more residual hearing, and a significant trend toward higher intelligibility for users of oral communication than those using total communication. (Contains…
[Effects of acaoustic adaptation of classrooms on the quality of verbal communication].

PubMed

Mikulski, Witold

2013-01-01

Voice organ disorders among teachers are caused by excessive voice strain. One of the measures to reduce this strain is to decrease background noise when teaching. Increasing the acoustic absorption of the room is a technical measure for achieving this aim. The absorption level also improves speech intelligibility rated by the following parameters: room reverberation time and speech transmission index (STI). This article presents the effects of acoustic adaptation of classrooms on the quality of verbal communication, aimed at getting the speech intelligibility at the good or excellent level. The article lists the criteria for evaluating classrooms in terms of the quality of verbal communication. The parameters were defined, using the measurement methods according to PN-EN ISO 3382-2:2010 and PN-EN 60268-16:2011. Acoustic adaptations were completed in two classrooms. After completing acoustic adaptations the reverberation time for the frequency of 1 kHz was reduced: in room no. 1 from 1.45 s to 0.44 s and in room no. 2 from 1.03 s to 0.37 s (maximum 0.65 s). At the same time, the speech transmission index increased: in room no. 1 from 0.55 (satisfactory speech intelligibility) to 0.75 (speech intelligibility close to excellent); in room no. 2 from 0.63 (good speech intelligibility) to 0.80 (excellent speech intelligibility). Therefore, it can be stated that prior to completing acoustic adaptations room no. 1 did not comply and room no. 2 barely complied with the criterion (speech transmission index of 0.62). After completing acoustic adaptations both rooms meet the requirements.
Three Factors Are Critical in Order to Synthesize Intelligible Noise-Vocoded Japanese Speech

PubMed Central

Kishida, Takuya; Nakajima, Yoshitaka; Ueda, Kazuo; Remijn, Gerard B.

2016-01-01

Factor analysis (principal component analysis followed by varimax rotation) had shown that 3 common factors appear across 20 critical-band power fluctuations derived from spoken sentences of eight different languages [Ueda et al. (2010). Fechner Day 2010, Padua]. The present study investigated the contributions of such power-fluctuation factors to speech intelligibility. The method of factor analysis was modified to obtain factors suitable for resynthesizing speech sounds as 20-critical-band noise-vocoded speech. The resynthesized speech sounds were used for an intelligibility test. The modification of factor analysis ensured that the resynthesized speech sounds were not accompanied by a steady background noise caused by the data reduction procedure. Spoken sentences of British English, Japanese, and Mandarin Chinese were subjected to this modified analysis. Confirming the earlier analysis, indeed 3–4 factors were common to these languages. The number of power-fluctuation factors needed to make noise-vocoded speech intelligible was then examined. Critical-band power fluctuations of the Japanese spoken sentences were resynthesized from the obtained factors, resulting in noise-vocoded-speech stimuli, and the intelligibility of these speech stimuli was tested by 12 native Japanese speakers. Japanese mora (syllable-like phonological unit) identification performances were measured when the number of factors was 1–9. Statistically significant improvement in intelligibility was observed when the number of factors was increased stepwise up to 6. The 12 listeners identified 92.1% of the morae correctly on average in the 6-factor condition. The intelligibility improved sharply when the number of factors changed from 2 to 3. In this step, the cumulative contribution ratio of factors improved only by 10.6%, from 37.3 to 47.9%, but the average mora identification leaped from 6.9 to 69.2%. The results indicated that, if the number of factors is 3 or more, elementary linguistic information is preserved in such noise-vocoded speech. PMID:27199790
Speech Intelligibility and Hearing Protector Selection

DTIC Science & Technology

2016-08-29

for use. 8 Another nonstandardized speech intelligibility test relevant to military environments is the Coordinate Response Measure ( CRM ...developed by the U.S. Air Force Research Laboratory (Bolia, Nelson, Ericson, and Simpson, 2000). The phrases in the CRM are comprised of a call...detections and the percentage of correctly identified color-number combinations. The CRM is particularly useful in evaluating speech intelligibility over
The Role of Music in Speech Intelligibility of Learners with Post Lingual Hearing Impairment in Selected Units in Lusaka District

ERIC Educational Resources Information Center

Katongo, Emily Mwamba; Ndhlovu, Daniel

2015-01-01

This study sought to establish the role of music in speech intelligibility of learners with Post Lingual Hearing Impairment (PLHI) and strategies teachers used to enhance speech intelligibility in learners with PLHI in selected special units for the deaf in Lusaka district. The study used a descriptive research design. Qualitative and quantitative…
Speech Intelligibility of Aircrew Mask Communication Configurations in High-Noise Environments

DTIC Science & Technology

2017-09-28

ARL-TR-8168 ● Sep 2017 US Army Research Laboratory Speech Intelligibility of Aircrew Mask Communication Configurations in High ...Laboratory Speech Intelligibility of Aircrew Mask Communication Configurations in High -Noise Environments by Kimberly A Pollard and Lamar Garrett...in High - Noise Environments 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) Kimberly A Pollard and Lamar
Methods of Improving Speech Intelligibility for Listeners with Hearing Resolution Deficit

PubMed Central

2012-01-01

Abstract Methods developed for real-time time scale modification (TSM) of speech signal are presented. They are based on the non-uniform, speech rate depended SOLA algorithm (Synchronous Overlap and Add). Influence of the proposed method on the intelligibility of speech was investigated for two separate groups of listeners, i.e. hearing impaired children and elderly listeners. It was shown that for the speech with average rate equal to or higher than 6.48 vowels/s, all of the proposed methods have statistically significant impact on the improvement of speech intelligibility for hearing impaired children with reduced hearing resolution and one of the proposed methods significantly improves comprehension of speech in the group of elderly listeners with reduced hearing resolution. Virtual slides http://www.diagnosticpathology.diagnomx.eu/vs/2065486371761991 PMID:23009662
Comparing Binaural Pre-processing Strategies I: Instrumental Evaluation.

PubMed

Baumgärtel, Regina M; Krawczyk-Becker, Martin; Marquardt, Daniel; Völker, Christoph; Hu, Hongmei; Herzke, Tobias; Coleman, Graham; Adiloğlu, Kamil; Ernst, Stephan M A; Gerkmann, Timo; Doclo, Simon; Kollmeier, Birger; Hohmann, Volker; Dietz, Mathias

2015-12-30

In a collaborative research project, several monaural and binaural noise reduction algorithms have been comprehensively evaluated. In this article, eight selected noise reduction algorithms were assessed using instrumental measures, with a focus on the instrumental evaluation of speech intelligibility. Four distinct, reverberant scenarios were created to reflect everyday listening situations: a stationary speech-shaped noise, a multitalker babble noise, a single interfering talker, and a realistic cafeteria noise. Three instrumental measures were employed to assess predicted speech intelligibility and predicted sound quality: the intelligibility-weighted signal-to-noise ratio, the short-time objective intelligibility measure, and the perceptual evaluation of speech quality. The results show substantial improvements in predicted speech intelligibility as well as sound quality for the proposed algorithms. The evaluated coherence-based noise reduction algorithm was able to provide improvements in predicted audio signal quality. For the tested single-channel noise reduction algorithm, improvements in intelligibility-weighted signal-to-noise ratio were observed in all but the nonstationary cafeteria ambient noise scenario. Binaural minimum variance distortionless response beamforming algorithms performed particularly well in all noise scenarios. © The Author(s) 2015.
Comparing Binaural Pre-processing Strategies I

PubMed Central

Krawczyk-Becker, Martin; Marquardt, Daniel; Völker, Christoph; Hu, Hongmei; Herzke, Tobias; Coleman, Graham; Adiloğlu, Kamil; Ernst, Stephan M. A.; Gerkmann, Timo; Doclo, Simon; Kollmeier, Birger; Hohmann, Volker; Dietz, Mathias

2015-01-01

In a collaborative research project, several monaural and binaural noise reduction algorithms have been comprehensively evaluated. In this article, eight selected noise reduction algorithms were assessed using instrumental measures, with a focus on the instrumental evaluation of speech intelligibility. Four distinct, reverberant scenarios were created to reflect everyday listening situations: a stationary speech-shaped noise, a multitalker babble noise, a single interfering talker, and a realistic cafeteria noise. Three instrumental measures were employed to assess predicted speech intelligibility and predicted sound quality: the intelligibility-weighted signal-to-noise ratio, the short-time objective intelligibility measure, and the perceptual evaluation of speech quality. The results show substantial improvements in predicted speech intelligibility as well as sound quality for the proposed algorithms. The evaluated coherence-based noise reduction algorithm was able to provide improvements in predicted audio signal quality. For the tested single-channel noise reduction algorithm, improvements in intelligibility-weighted signal-to-noise ratio were observed in all but the nonstationary cafeteria ambient noise scenario. Binaural minimum variance distortionless response beamforming algorithms performed particularly well in all noise scenarios. PMID:26721920
Sound System Engineering & Optimization: The effects of multiple arrivals on the intelligibility of reinforced speech

NASA Astrophysics Data System (ADS)

Ryan, Timothy James

The effects of multiple arrivals on the intelligibility of speech produced by live-sound reinforcement systems are examined. The intent is to determine if correlations exist between the manipulation of sound system optimization parameters and the subjective attribute speech intelligibility. Given the number, and wide range, of variables involved, this exploratory research project attempts to narrow the focus of further studies. Investigated variables are delay time between signals arriving from multiple elements of a loudspeaker array, array type and geometry and the two-way interactions of speech-to-noise ratio and array geometry with delay time. Intelligibility scores were obtained through subjective evaluation of binaural recordings, reproduced via headphone, using the Modified Rhyme Test. These word-score results are compared with objective measurements of Speech Transmission Index (STI). Results indicate that both variables, delay time and array geometry, have significant effects on intelligibility. Additionally, it is seen that all three of the possible two-way interactions have significant effects. Results further reveal that the STI measurement method overestimates the decrease in intelligibility due to short delay times between multiple arrivals.
The effect of intensive speech rate and intonation therapy on intelligibility in Parkinson's disease.

PubMed

Martens, Heidi; Van Nuffelen, Gwen; Dekens, Tomas; Hernández-Díaz Huici, Maria; Kairuz Hernández-Díaz, Hector Arturo; De Letter, Miet; De Bodt, Marc

2015-01-01

Most studies on treatment of prosody in individuals with dysarthria due to Parkinson's disease are based on intensive treatment of loudness. The present study investigates the effect of intensive treatment of speech rate and intonation on the intelligibility of individuals with dysarthria due to Parkinson's disease. A one group pretest-posttest design was used to compare intelligibility, speech rate, and intonation before and after treatment. Participants included eleven Dutch-speaking individuals with predominantly moderate dysarthria due to Parkinson's disease, who received five one-hour treatment sessions per week during three weeks. Treatment focused on lowering speech rate and magnifying the phrase final intonation contrast between statements and questions. Intelligibility was perceptually assessed using a standardized sentence intelligibility test. Speech rate was automatically assessed during the sentence intelligibility test as well as during a passage reading task and a storytelling task. Intonation was perceptually assessed using a sentence reading task and a sentence repetition task, and also acoustically analyzed in terms of maximum fundamental frequency. After treatment, there was a significant improvement of sentence intelligibility (effect size .83), a significant increase of pause frequency during the passage reading task, a significant improvement of correct listener identification of statements and questions, and a significant increase of the maximum fundamental frequency in the final syllable of questions during both intonation tasks. The findings suggest that participants were more intelligible and more able to manipulate pause frequency and statement-question intonation after treatment. However, the relationship between the change in intelligibility on the one hand and the changes in speech rate and intonation on the other hand is not yet fully understood. Results are nuanced in the light of the operated research design. The reader will be able to: (1) describe the effect of intensive speech rate and intonation treatment on intelligibility of speakers with dysarthria due to PD, (2) describe the effect of intensive speech rate treatment on rate manipulation by speakers with dysarthria due to PD, and (3) describe the effect of intensive intonation treatment on manipulation of the phrase final intonation contrast between statements and questions by speakers with dysarthria due to PD. Copyright © 2015 Elsevier Inc. All rights reserved.
In-flight speech intelligibility evaluation of a service member with sensorineural hearing loss: case report.

PubMed

Casto, Kristen L; Cho, Timothy H

2012-09-01

This case report describes the in-flight speech intelligibility evaluation of an aircraft crewmember with pure tone audiometric thresholds that exceed the U.S. Army's flight standards. Results of in-flight speech intelligibility testing highlight the inability to predict functional auditory abilities from pure tone audiometry and underscore the importance of conducting validated functional hearing evaluations to determine aviation fitness-for-duty.
On the importance of early reflections for speech in rooms.

PubMed

Bradley, J S; Sato, H; Picard, M

2003-06-01

This paper presents the results of new studies based on speech intelligibility tests in simulated sound fields and analyses of impulse response measurements in rooms used for speech communication. The speech intelligibility test results confirm the importance of early reflections for achieving good conditions for speech in rooms. The addition of early reflections increased the effective signal-to-noise ratio and related speech intelligibility scores for both impaired and nonimpaired listeners. The new results also show that for common conditions where the direct sound is reduced, it is only possible to understand speech because of the presence of early reflections. Analyses of measured impulse responses in rooms intended for speech show that early reflections can increase the effective signal-to-noise ratio by up to 9 dB. A room acoustics computer model is used to demonstrate that the relative importance of early reflections can be influenced by the room acoustics design.
Automated Intelligibility Assessment of Pathological Speech Using Phonological Features

NASA Astrophysics Data System (ADS)

Middag, Catherine; Martens, Jean-Pierre; Van Nuffelen, Gwen; De Bodt, Marc

2009-12-01

It is commonly acknowledged that word or phoneme intelligibility is an important criterion in the assessment of the communication efficiency of a pathological speaker. People have therefore put a lot of effort in the design of perceptual intelligibility rating tests. These tests usually have the drawback that they employ unnatural speech material (e.g., nonsense words) and that they cannot fully exclude errors due to listener bias. Therefore, there is a growing interest in the application of objective automatic speech recognition technology to automate the intelligibility assessment. Current research is headed towards the design of automated methods which can be shown to produce ratings that correspond well with those emerging from a well-designed and well-performed perceptual test. In this paper, a novel methodology that is built on previous work (Middag et al., 2008) is presented. It utilizes phonological features, automatic speech alignment based on acoustic models that were trained on normal speech, context-dependent speaker feature extraction, and intelligibility prediction based on a small model that can be trained on pathological speech samples. The experimental evaluation of the new system reveals that the root mean squared error of the discrepancies between perceived and computed intelligibilities can be as low as 8 on a scale of 0 to 100.
Evidence-based occupational hearing screening II: validation of a screening methodology using measures of functional hearing ability.

PubMed

Soli, Sigfrid D; Amano-Kusumoto, Akiko; Clavier, Odile; Wilbur, Jed; Casto, Kristen; Freed, Daniel; Laroche, Chantal; Vaillancourt, Véronique; Giguère, Christian; Dreschler, Wouter A; Rhebergen, Koenraad S

2018-05-01

Validate use of the Extended Speech Intelligibility Index (ESII) for prediction of speech intelligibility in non-stationary real-world noise environments. Define a means of using these predictions for objective occupational hearing screening for hearing-critical public safety and law enforcement jobs. Analyses of predicted and measured speech intelligibility in recordings of real-world noise environments were performed in two studies using speech recognition thresholds (SRTs) and intelligibility measures. ESII analyses of the recordings were used to predict intelligibility. Noise recordings were made in prison environments and at US Army facilities for training ground and airborne forces. Speech materials included full bandwidth sentences and bandpass filtered sentences that simulated radio transmissions. A total of 22 adults with normal hearing (NH) and 15 with mild-moderate hearing impairment (HI) participated in the two studies. Average intelligibility predictions for individual NH and HI subjects were accurate in both studies (r 2 ≥ 0.94). Pooled predictions were slightly less accurate (0.78 ≤ r 2 ≤ 0.92). An individual's SRT and audiogram can accurately predict the likelihood of effective speech communication in noise environments with known ESII characteristics, where essential hearing-critical tasks are performed. These predictions provide an objective means of occupational hearing screening.
Variability and Intelligibility of Clarified Speech to Different Listener Groups

NASA Astrophysics Data System (ADS)

Silber, Ronnie F.

Two studies examined the modifications that adult speakers make in speech to disadvantaged listeners. Previous research that has focused on speech to the deaf individuals and to young children has shown that adults clarify speech when addressing these two populations. Acoustic measurements suggest that the signal undergoes similar changes for both populations. Perceptual tests corroborate these results for the deaf population, but are nonsystematic in developmental studies. The differences in the findings for these populations and the nonsystematic results in the developmental literature may be due to methodological factors. The present experiments addressed these methodological questions. Studies of speech to hearing impaired listeners have used read, nonsense, sentences, for which speakers received explicit clarification instructions and feedback, while in the child literature, excerpts of real-time conversations were used. Therefore, linguistic samples were not precisely matched. In this study, experiments used various linguistic materials. Experiment 1 used a children's story; experiment 2, nonsense sentences. Four mothers read both types of material in four ways: (1) in "normal" adult speech, (2) in "babytalk," (3) under the clarification instructions used in the "hearing impaired studies" (instructed clear speech) and (4) in (spontaneous) clear speech without instruction. No extra practice or feedback was given. Sentences were presented to 40 normal hearing college students with and without simultaneous masking noise. Results were separately tabulated for content and function words, and analyzed using standard statistical tests. The major finding in the study was individual variation in speaker intelligibility. "Real world" speakers vary in their baseline intelligibility. The four speakers also showed unique patterns of intelligibility as a function of each independent variable. Results were as follows. Nonsense sentences were less intelligible than story sentences. Function words were equal to, or more intelligible than, content words. Babytalk functioned as a clear speech style in story sentences but not nonsense sentences. One of the two clear speech styles was clearer than normal speech in adult-directed clarification. However, which style was clearer depended on interactions among the variables. The individual patterns seemed to result from interactions among demand characteristics, baseline intelligibility, materials, and differences in articulatory flexibility.
The impact of phonetic dissimilarity on the perception of foreign accented speech

NASA Astrophysics Data System (ADS)

Weil, Shawn A.

2003-10-01

Non-normative speech (i.e., synthetic speech, pathological speech, foreign accented speech) is more difficult to process for native listeners than is normative speech. Does perceptual dissimilarity affect only intelligibility, or are there other costs to processing? The current series of experiments investigates both the intelligibility and time course of foreign accented speech (FAS) perception. Native English listeners heard single English words spoken by both native English speakers and non-native speakers (Mandarin or Russian). Words were chosen based on the similarity between the phonetic inventories of the respective languages. Three experimental designs were used: a cross-modal matching task, a word repetition (shadowing) task, and two subjective ratings tasks which measured impressions of accentedness and effortfulness. The results replicate previous investigations that have found that FAS significantly lowers word intelligibility. Furthermore, in FAS as well as perceptual effort, in the word repetition task, correct responses are slower to accented words than to nonaccented words. An analysis indicates that both intelligibility and reaction time are, in part, functions of the similarity between the talker's utterance and the listener's representation of the word.
Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems.

PubMed

Greene, Beth G; Logan, John S; Pisoni, David B

1986-03-01

We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered.
Improving Speech Intelligibility in Children with Childhood Apraxia of Speech: Employing Evidence-Based Practice. EBP Briefs. Volume 9, Issue 5

ERIC Educational Resources Information Center

Koehlinger, Keegan M.

2015-01-01

Clinical Question: Would a preschool-aged child with childhood apraxia of speech (CAS) benefit from a singular approach--such as motor planning, sensory cueing, linguistic and rhythmic--or a combined approach in order to increase intelligibility of spoken language? Method: Systematic Review. Study Sources: ASHA Wire, Google Scholar, Speech Bite.…
A Deep Denoising Autoencoder Approach to Improving the Intelligibility of Vocoded Speech in Cochlear Implant Simulation.

PubMed

Lai, Ying-Hui; Chen, Fei; Wang, Syu-Siang; Lu, Xugang; Tsao, Yu; Lee, Chin-Hui

2017-07-01

In a cochlear implant (CI) speech processor, noise reduction (NR) is a critical component for enabling CI users to attain improved speech perception under noisy conditions. Identifying an effective NR approach has long been a key topic in CI research. Recently, a deep denoising autoencoder (DDAE) based NR approach was proposed and shown to be effective in restoring clean speech from noisy observations. It was also shown that DDAE could provide better performance than several existing NR methods in standardized objective evaluations. Following this success with normal speech, this paper further investigated the performance of DDAE-based NR to improve the intelligibility of envelope-based vocoded speech, which simulates speech signal processing in existing CI devices. We compared the performance of speech intelligibility between DDAE-based NR and conventional single-microphone NR approaches using the noise vocoder simulation. The results of both objective evaluations and listening test showed that, under the conditions of nonstationary noise distortion, DDAE-based NR yielded higher intelligibility scores than conventional NR approaches. This study confirmed that DDAE-based NR could potentially be integrated into a CI processor to provide more benefits to CI users under noisy conditions.

Listening Effort During Sentence Processing Is Increased for Non-native Listeners: A Pupillometry Study

PubMed Central

Borghini, Giulia; Hazan, Valerie

2018-01-01

Current evidence demonstrates that even though some non-native listeners can achieve native-like performance for speech perception tasks in quiet, the presence of a background noise is much more detrimental to speech intelligibility for non-native compared to native listeners. Even when performance is equated across groups, it is likely that greater listening effort is required for non-native listeners. Importantly, the added listening effort might result in increased fatigue and a reduced ability to successfully perform multiple tasks simultaneously. Task-evoked pupil responses have been demonstrated to be a reliable measure of cognitive effort and can be useful in clarifying those aspects. In this study we compared the pupil response for 23 native English speakers and 27 Italian speakers of English as a second language. Speech intelligibility was tested for sentences presented in quiet and in background noise at two performance levels that were matched across groups. Signal-to-noise levels corresponding to these sentence intelligibility levels were pre-determined using an adaptive intelligibility task. Pupil response was significantly greater in non-native compared to native participants across both intelligibility levels. Therefore, for a given intelligibility level, a greater listening effort is required when listening in a second language in order to understand speech in noise. Results also confirmed that pupil response is sensitive to speech intelligibility during language comprehension, in line with previous research. However, contrary to our predictions, pupil response was not differentially modulated by intelligibility levels for native and non-native listeners. The present study corroborates that pupillometry can be deemed as a valid measure to be used in speech perception investigation, because it is sensitive to differences both across participants, such as listener type, and across conditions, such as variations in the level of speech intelligibility. Importantly, pupillometry offers us the possibility to uncover differences in listening effort even when those do not emerge in the performance level of individuals. PMID:29593489
Application of artifical intelligence principles to the analysis of "crazy" speech.

PubMed

Garfield, D A; Rapp, C

1994-04-01

Artificial intelligence computer simulation methods can be used to investigate psychotic or "crazy" speech. Here, symbolic reasoning algorithms establish semantic networks that schematize speech. These semantic networks consist of two main structures: case frames and object taxonomies. Node-based reasoning rules apply to object taxonomies and pathway-based reasoning rules apply to case frames. Normal listeners may recognize speech as "crazy talk" based on violations of node- and pathway-based reasoning rules. In this article, three separate segments of schizophrenic speech illustrate violations of these rules. This artificial intelligence approach is compared and contrasted with other neurolinguistic approaches and is discussed as a conceptual link between neurobiological and psychodynamic understandings of psychopathology.
Neuronal populations in the occipital cortex of the blind synchronize to the temporal dynamics of speech

PubMed Central

Van Ackeren, Markus Johannes; Barbero, Francesca M; Mattioni, Stefania; Bottini, Roberto

2018-01-01

The occipital cortex of early blind individuals (EB) activates during speech processing, challenging the notion of a hard-wired neurobiology of language. But, at what stage of speech processing do occipital regions participate in EB? Here we demonstrate that parieto-occipital regions in EB enhance their synchronization to acoustic fluctuations in human speech in the theta-range (corresponding to syllabic rate), irrespective of speech intelligibility. Crucially, enhanced synchronization to the intelligibility of speech was selectively observed in primary visual cortex in EB, suggesting that this region is at the interface between speech perception and comprehension. Moreover, EB showed overall enhanced functional connectivity between temporal and occipital cortices that are sensitive to speech intelligibility and altered directionality when compared to the sighted group. These findings suggest that the occipital cortex of the blind adopts an architecture that allows the tracking of speech material, and therefore does not fully abstract from the reorganized sensory inputs it receives. PMID:29338838
SPEECH EVALUATION WITH AND WITHOUT PALATAL OBTURATOR IN PATIENTS SUBMITTED TO MAXILLECTOMY

PubMed Central

de Carvalho-Teles, Viviane; Pegoraro-Krook, Maria Inês; Lauris, José Roberto Pereira

2006-01-01

Most patients who have undergone resection of the maxillae due to benign or malignant tumors in the palatomaxillary region present with speech and swallowing disorders. Coupling of the oral and nasal cavities increases nasal resonance, resulting in hypernasality and unintelligible speech. Prosthodontic rehabilitation of maxillary resections with effective separation of the oral and nasal cavities can improve speech and esthetics, and assist the psychosocial adjustment of the patient as well. The objective of this study was to evaluate the efficacy of the palatal obturator prosthesis on speech intelligibility and resonance of 23 patients with age ranging from 18 to 83 years (Mean = 49.5 years), who had undergone inframedial-structural maxillectomy. The patients were requested to count from 1 to 20, to repeat 21 words and to spontaneously speak for 15 seconds, once with and again without the prosthesis, for tape recording purposes. The resonance and speech intelligibility were judged by 5 speech language pathologists from the tape recordings samples. The results have shown that the majority of patients (82.6%) significantly improved their speech intelligibility, and 16 patients (69.9%) exhibited a significant hypernasality reduction with the obturator in place. The results of this study indicated that maxillary obturator prosthesis was efficient to improve the speech intelligibility and resonance in patients who had undergone maxillectomy. PMID:19089242
Neurogenic Orofacial Weakness and Speech in Adults With Dysarthria

PubMed Central

Makashay, Matthew J.; Helou, Leah B.; Clark, Heather M.

2017-01-01

Purpose This study compared orofacial strength between adults with dysarthria and neurologically normal (NN) matched controls. In addition, orofacial muscle weakness was examined for potential relationships to speech impairments in adults with dysarthria. Method Matched groups of 55 adults with dysarthria and 55 NN adults generated maximum pressure (Pmax) against an air-filled bulb during lingual elevation, protrusion and lateralization, and buccodental and labial compressions. These orofacial strength measures were compared with speech intelligibility, perceptual ratings of speech, articulation rate, and fast syllable-repetition rate. Results The dysarthria group demonstrated significantly lower orofacial strength than the NN group on all tasks. Lingual strength correlated moderately and buccal strength correlated weakly with most ratings of speech deficits. Speech intelligibility was not sensitive to dysarthria severity. Individuals with severely reduced anterior lingual elevation Pmax (< 18 kPa) had normal to profoundly impaired sentence intelligibility (99%–6%) and moderately to severely impaired speech (26%–94% articulatory imprecision; 33%–94% overall severity). Conclusions Results support the presence of orofacial muscle weakness in adults with dysarthrias of varying etiologies but reinforce tenuous links between orofacial strength and speech production disorders. By examining individual data, preliminary evidence emerges to suggest that speech, but not necessarily intelligibility, is likely to be impaired when lingual weakness is severe. PMID:28763804
Relationship between Speech Intelligibility and Speech Comprehension in Babble Noise

ERIC Educational Resources Information Center

Fontan, Lionel; Tardieu, Julien; Gaillard, Pascal; Woisard, Virginie; Ruiz, Robert

2015-01-01

Purpose: The authors investigated the relationship between the intelligibility and comprehension of speech presented in babble noise. Method: Forty participants listened to French imperative sentences (commands for moving objects) in a multitalker babble background for which intensity was experimentally controlled. Participants were instructed to…
Relationships between Speech Intelligibility and Word Articulation Scores in Children with Hearing Loss

PubMed Central

Ertmer, David J.

2012-01-01

Purpose This investigation sought to determine whether scores from a commonly used word-based articulation test are closely associated with speech intelligibility in children with hearing loss. If the scores are closely related, articulation testing results might be used to estimate intelligibility. If not, the importance of direct assessment of intelligibility would be reinforced. Methods Forty-four children with hearing losses produced words from the Goldman-Fristoe Test of Articulation-2 and sets of 10 short sentences. Correlation analyses were conducted between scores for seven word-based predictor variables and percent-intelligible scores derived from listener judgments of stimulus sentences. Results Six of seven predictor variables were significantly correlated with percent-intelligible scores. However, regression analysis revealed that no single predictor variable or multi- variable model accounted for more than 25% of the variability in intelligibility scores. Implications The findings confirm the importance of assessing connected speech intelligibility directly. PMID:20220022
Performance in noise: Impact of reduced speech intelligibility on Sailor performance in a Navy command and control environment.

PubMed

Keller, M David; Ziriax, John M; Barns, William; Sheffield, Benjamin; Brungart, Douglas; Thomas, Tony; Jaeger, Bobby; Yankaskas, Kurt

2017-06-01

Noise, hearing loss, and electronic signal distortion, which are common problems in military environments, can impair speech intelligibility and thereby jeopardize mission success. The current study investigated the impact that impaired communication has on operational performance in a command and control environment by parametrically degrading speech intelligibility in a simulated shipborne Combat Information Center. Experienced U.S. Navy personnel served as the study participants and were required to monitor information from multiple sources and respond appropriately to communications initiated by investigators playing the roles of other personnel involved in a realistic Naval scenario. In each block of the scenario, an adaptive intelligibility modification system employing automatic gain control was used to adjust the signal-to-noise ratio to achieve one of four speech intelligibility levels on a Modified Rhyme Test: No Loss, 80%, 60%, or 40%. Objective and subjective measures of operational performance suggested that performance systematically degraded with decreasing speech intelligibility, with the largest drop occurring between 80% and 60%. These results confirm the importance of noise reduction, good communication design, and effective hearing conservation programs to maximize the operational effectiveness of military personnel. Published by Elsevier B.V.
Measuring up to speech intelligibility.

PubMed

Miller, Nick

2013-01-01

Improvement or maintenance of speech intelligibility is a central aim in a whole range of conditions in speech-language therapy, both developmental and acquired. Best clinical practice and pursuance of the evidence base for interventions would suggest measurement of intelligibility forms a vital role in clinical decision-making and monitoring. However, what should be measured to gauge intelligibility and how this is achieved and relates to clinical planning continues to be a topic of debate. This review considers the strengths and weaknesses of selected clinical approaches to intelligibility assessment, stressing the importance of explanatory, diagnostic testing as both a more sensitive and a clinically informative method. The worth of this, and any approach, is predicated, though, on awareness and control of key design, elicitation, transcription and listening/listener variables to maximize validity and reliability of assessments. These are discussed. A distinction is drawn between signal-dependent and -independent factors in intelligibility evaluation. Discussion broaches how these different perspectives might be reconciled to deliver comprehensive insights into intelligibility levels and their clinical/educational significance. The paper ends with a call for wider implementation of best practice around intelligibility assessment. © 2013 Royal College of Speech and Language Therapists.
Breath-Group Intelligibility in Dysarthria: Characteristics and Underlying Correlates

ERIC Educational Resources Information Center

Yunusova, Yana; Weismer, Gary; Kent, Ray D.; Rusche, Nicole M.

2005-01-01

Purpose: This study was designed to determine whether within-speaker fluctuations in speech intelligibility occurred among speakers with dysarthria who produced a reading passage, and, if they did, whether selected linguistic and acoustic variables predicted the variations in speech intelligibility. Method: Participants with dysarthria included a…
Predicting Intelligibility Gains in Individuals with Dysarthria from Baseline Speech Features

ERIC Educational Resources Information Center

Fletcher, Annalise R.; McAuliffe, Megan J.; Lansford, Kaitlin L.; Sinex, Donal G.; Liss, Julie M.

2017-01-01

Purpose: Across the treatment literature, behavioral speech modifications have produced variable intelligibility changes in speakers with dysarthria. This study is the first of two articles exploring whether measurements of baseline speech features can predict speakers' responses to these modifications. Method: Fifty speakers (7 older individuals…
Influence of Visual Information on the Intelligibility of Dysarthric Speech

ERIC Educational Resources Information Center

Keintz, Connie K.; Bunton, Kate; Hoit, Jeannette D.

2007-01-01

Purpose: To examine the influence of visual information on speech intelligibility for a group of speakers with dysarthria associated with Parkinson's disease. Method: Eight speakers with Parkinson's disease and dysarthria were recorded while they read sentences. Speakers performed a concurrent manual task to facilitate typical speech production.…
Speech Intelligibility in Severe Adductor Spasmodic Dysphonia

ERIC Educational Resources Information Center

Bender, Brenda K.; Cannito, Michael P.; Murry, Thomas; Woodson, Gayle E.

2004-01-01

This study compared speech intelligibility in nondisabled speakers and speakers with adductor spasmodic dysphonia (ADSD) before and after botulinum toxin (Botox) injection. Standard speech samples were obtained from 10 speakers diagnosed with severe ADSD prior to and 1 month following Botox injection, as well as from 10 age- and gender-matched…
Speech Intelligibility and Psychosocial Functioning in Deaf Children and Teens with Cochlear Implants

ERIC Educational Resources Information Center

Freeman, Valerie; Pisoni, David B.; Kronenberger, William G.; Castellanos, Irina

2017-01-01

Deaf children with cochlear implants (CIs) are at risk for psychosocial adjustment problems, possibly due to delayed speech-language skills. This study investigated associations between a core component of spoken-language ability--speech intelligibility--and the psychosocial development of prelingually deaf CI users. Audio-transcription measures…
Spectrotemporal Modulation Sensitivity as a Predictor of Speech Intelligibility for Hearing-Impaired Listeners

PubMed Central

Bernstein, Joshua G.W.; Mehraei, Golbarg; Shamma, Shihab; Gallun, Frederick J.; Theodoroff, Sarah M.; Leek, Marjorie R.

2014-01-01

Background A model that can accurately predict speech intelligibility for a given hearing-impaired (HI) listener would be an important tool for hearing-aid fitting or hearing-aid algorithm development. Existing speech-intelligibility models do not incorporate variability in suprathreshold deficits that are not well predicted by classical audiometric measures. One possible approach to the incorporation of such deficits is to base intelligibility predictions on sensitivity to simultaneously spectrally and temporally modulated signals. Purpose The likelihood of success of this approach was evaluated by comparing estimates of spectrotemporal modulation (STM) sensitivity to speech intelligibility and to psychoacoustic estimates of frequency selectivity and temporal fine-structure (TFS) sensitivity across a group of HI listeners. Research Design The minimum modulation depth required to detect STM applied to an 86 dB SPL four-octave noise carrier was measured for combinations of temporal modulation rate (4, 12, or 32 Hz) and spectral modulation density (0.5, 1, 2, or 4 cycles/octave). STM sensitivity estimates for individual HI listeners were compared to estimates of frequency selectivity (measured using the notched-noise method at 500, 1000measured using the notched-noise method at 500, 2000, and 4000 Hz), TFS processing ability (2 Hz frequency-modulation detection thresholds for 500, 10002 Hz frequency-modulation detection thresholds for 500, 2000, and 4000 Hz carriers) and sentence intelligibility in noise (at a 0 dB signal-to-noise ratio) that were measured for the same listeners in a separate study. Study Sample Eight normal-hearing (NH) listeners and 12 listeners with a diagnosis of bilateral sensorineural hearing loss participated. Data Collection and Analysis STM sensitivity was compared between NH and HI listener groups using a repeated-measures analysis of variance. A stepwise regression analysis compared STM sensitivity for individual HI listeners to audiometric thresholds, age, and measures of frequency selectivity and TFS processing ability. A second stepwise regression analysis compared speech intelligibility to STM sensitivity and the audiogram-based Speech Intelligibility Index. Results STM detection thresholds were elevated for the HI listeners, but only for low rates and high densities. STM sensitivity for individual HI listeners was well predicted by a combination of estimates of frequency selectivity at 4000 Hz and TFS sensitivity at 500 Hz but was unrelated to audiometric thresholds. STM sensitivity accounted for an additional 40% of the variance in speech intelligibility beyond the 40% accounted for by the audibility-based Speech Intelligibility Index. Conclusions Impaired STM sensitivity likely results from a combination of a reduced ability to resolve spectral peaks and a reduced ability to use TFS information to follow spectral-peak movements. Combining STM sensitivity estimates with audiometric threshold measures for individual HI listeners provided a more accurate prediction of speech intelligibility than audiometric measures alone. These results suggest a significant likelihood of success for an STM-based model of speech intelligibility for HI listeners. PMID:23636210
Speech-on-speech masking with variable access to the linguistic content of the masker speech for native and nonnative english speakers.

PubMed

Calandruccio, Lauren; Bradlow, Ann R; Dhar, Sumitrajit

2014-04-01

Masking release for an English sentence-recognition task in the presence of foreign-accented English speech compared with native-accented English speech was reported in Calandruccio et al (2010a). The masking release appeared to increase as the masker intelligibility decreased. However, it could not be ruled out that spectral differences between the speech maskers were influencing the significant differences observed. The purpose of the current experiment was to minimize spectral differences between speech maskers to determine how various amounts of linguistic information within competing speech Affiliationect masking release. A mixed-model design with within-subject (four two-talker speech maskers) and between-subject (listener group) factors was conducted. Speech maskers included native-accented English speech and high-intelligibility, moderate-intelligibility, and low-intelligibility Mandarin-accented English. Normalizing the long-term average speech spectra of the maskers to each other minimized spectral differences between the masker conditions. Three listener groups were tested, including monolingual English speakers with normal hearing, nonnative English speakers with normal hearing, and monolingual English speakers with hearing loss. The nonnative English speakers were from various native language backgrounds, not including Mandarin (or any other Chinese dialect). Listeners with hearing loss had symmetric mild sloping to moderate sensorineural hearing loss. Listeners were asked to repeat back sentences that were presented in the presence of four different two-talker speech maskers. Responses were scored based on the key words within the sentences (100 key words per masker condition). A mixed-model regression analysis was used to analyze the difference in performance scores between the masker conditions and listener groups. Monolingual English speakers with normal hearing benefited when the competing speech signal was foreign accented compared with native accented, allowing for improved speech recognition. Various levels of intelligibility across the foreign-accented speech maskers did not influence results. Neither the nonnative English-speaking listeners with normal hearing nor the monolingual English speakers with hearing loss benefited from masking release when the masker was changed from native-accented to foreign-accented English. Slight modifications between the target and the masker speech allowed monolingual English speakers with normal hearing to improve their recognition of native-accented English, even when the competing speech was highly intelligible. Further research is needed to determine which modifications within the competing speech signal caused the Mandarin-accented English to be less effective with respect to masking. Determining the influences within the competing speech that make it less effective as a masker or determining why monolingual normal-hearing listeners can take advantage of these differences could help improve speech recognition for those with hearing loss in the future. American Academy of Audiology.
Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems

PubMed Central

GREENE, BETH G.; LOGAN, JOHN S.; PISONI, DAVID B.

2012-01-01

We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered. PMID:23225916
Real-Time Control of an Articulatory-Based Speech Synthesizer for Brain Computer Interfaces

PubMed Central

Bocquelet, Florent; Hueber, Thomas; Girin, Laurent; Savariaux, Christophe; Yvert, Blaise

2016-01-01

Restoring natural speech in paralyzed and aphasic people could be achieved using a Brain-Computer Interface (BCI) controlling a speech synthesizer in real-time. To reach this goal, a prerequisite is to develop a speech synthesizer producing intelligible speech in real-time with a reasonable number of control parameters. We present here an articulatory-based speech synthesizer that can be controlled in real-time for future BCI applications. This synthesizer converts movements of the main speech articulators (tongue, jaw, velum, and lips) into intelligible speech. The articulatory-to-acoustic mapping is performed using a deep neural network (DNN) trained on electromagnetic articulography (EMA) data recorded on a reference speaker synchronously with the produced speech signal. This DNN is then used in both offline and online modes to map the position of sensors glued on different speech articulators into acoustic parameters that are further converted into an audio signal using a vocoder. In offline mode, highly intelligible speech could be obtained as assessed by perceptual evaluation performed by 12 listeners. Then, to anticipate future BCI applications, we further assessed the real-time control of the synthesizer by both the reference speaker and new speakers, in a closed-loop paradigm using EMA data recorded in real time. A short calibration period was used to compensate for differences in sensor positions and articulatory differences between new speakers and the reference speaker. We found that real-time synthesis of vowels and consonants was possible with good intelligibility. In conclusion, these results open to future speech BCI applications using such articulatory-based speech synthesizer. PMID:27880768
Speech outcome in unilateral complete cleft lip and palate patients: a descriptive study.

PubMed

Rullo, R; Di Maggio, D; Addabbo, F; Rullo, F; Festa, V M; Perillo, L

2014-09-01

In this study, resonance and articulation disorders were examined in a group of patients surgically treated for cleft lip and palate, considering family social background, and children's ability of self monitoring their speech output while speaking. Fifty children (32 males and 18 females) mean age 6.5 ± 1.6 years, affected by non-syndromic complete unilateral cleft of the lip and palate underwent the same surgical protocol. The speech level was evaluated using the Accordi's speech assessment protocol that focuses on intelligibility, nasality, nasal air escape, pharyngeal friction, and glottal stop. Pearson product-moment correlation analysis was used to detect significant associations between analysed parameters. A total of 16% (8 children) of the sample had severe to moderate degree of nasality and nasal air escape, presence of pharyngeal friction and glottal stop, which obviously compromise speech intelligibility. Ten children (10%) showed a barely acceptable phonological outcome: nasality and nasal air escape were mild to moderate, but the intelligibility remained poor. Thirty-two children (64%) had normal speech. Statistical analysis revealed a significant correlation between the severity of nasal resonance and nasal air escape (p ≤ 0.05). No statistical significant correlation was found between the final intelligibility and the patient social background, neither between the final intelligibility nor the age of the patients. The differences in speech outcome could be explained with a specific, subjective, and inborn ability, different for each child, in self-monitoring their speech output.
Effects of Lexical and Somatosensory Feedback on Long-Term Improvements in Intelligibility of Dysarthric Speech

ERIC Educational Resources Information Center

Borrie, Stephanie A.; Schäfer, Martina C. M.

2017-01-01

Purpose: Intelligibility improvements immediately following perceptual training with dysarthric speech using lexical feedback are comparable to those observed when training uses somatosensory feedback (Borrie & Schäfer, 2015). In this study, we investigated if these lexical and somatosensory guided improvements in listener intelligibility of…

Effects of Speaking Task on Intelligibility in Parkinson's Disease

ERIC Educational Resources Information Center

Tjaden, Kris; Wilding, Greg

2011-01-01

Intelligibility tests for dysarthria typically provide an estimate of overall severity for speech materials elicited through imitation or read from a printed script. The extent to which these types of tasks and procedures reflect intelligibility for extemporaneous speech is not well understood. The purpose of this study was to compare…
Intelligibility of 4-Year-Old Children with and without Cerebral Palsy

ERIC Educational Resources Information Center

Hustad, Katherine C.; Schueler, Brynn; Schultz, Laurel; DuHadway, Caitlin

2012-01-01

Purpose: The authors examined speech intelligibility in typically developing (TD) children and 3 groups of children with cerebral palsy (CP) who were classified into speech/language profile groups following Hustad, Gorton, and Lee (2010). Questions addressed differences in transcription intelligibility scores among groups, the effects of utterance…
Development of Bone-Conducted Ultrasonic Hearing Aid for the Profoundly Deaf: Assessments of the Modulation Type with Regard to Intelligibility and Sound Quality

NASA Astrophysics Data System (ADS)

Nakagawa, Seiji; Fujiyuki, Chika; Kagomiya, Takayuki

2012-07-01

Bone-conducted ultrasound (BCU) is perceived even by the profoundly sensorineural deaf. A novel hearing aid using the perception of amplitude-modulated BCU (BCU hearing aid: BCUHA) has been developed; however, further improvements are needed, especially in terms of articulation and sound quality. In this study, the intelligibility and sound quality of BCU speech with several types of amplitude modulation [double-sideband with transmitted carrier (DSB-TC), double-sideband with suppressed carrier (DSB-SC), and transposed modulation] were evaluated. The results showed that DSB-TC and transposed speech were more intelligible than DSB-SC speech, and transposed speech was closer than the other types of BCU speech to air-conducted speech in terms of sound quality. These results provide useful information for further development of the BCUHA.
Relationship Among Signal Fidelity, Hearing Loss, and Working Memory for Digital Noise Suppression.

PubMed

Arehart, Kathryn; Souza, Pamela; Kates, James; Lunner, Thomas; Pedersen, Michael Syskind

2015-01-01

This study considered speech modified by additive babble combined with noise-suppression processing. The purpose was to determine the relative importance of the signal modifications, individual peripheral hearing loss, and individual cognitive capacity on speech intelligibility and speech quality. The participant group consisted of 31 individuals with moderate high-frequency hearing loss ranging in age from 51 to 89 years (mean = 69.6 years). Speech intelligibility and speech quality were measured using low-context sentences presented in babble at several signal-to-noise ratios. Speech stimuli were processed with a binary mask noise-suppression strategy with systematic manipulations of two parameters (error rate and attenuation values). The cumulative effects of signal modification produced by babble and signal processing were quantified using an envelope-distortion metric. Working memory capacity was assessed with a reading span test. Analysis of variance was used to determine the effects of signal processing parameters on perceptual scores. Hierarchical linear modeling was used to determine the role of degree of hearing loss and working memory capacity in individual listener response to the processed noisy speech. The model also considered improvements in envelope fidelity caused by the binary mask and the degradations to envelope caused by error and noise. The participants showed significant benefits in terms of intelligibility scores and quality ratings for noisy speech processed by the ideal binary mask noise-suppression strategy. This benefit was observed across a range of signal-to-noise ratios and persisted when up to a 30% error rate was introduced into the processing. Average intelligibility scores and average quality ratings were well predicted by an objective metric of envelope fidelity. Degree of hearing loss and working memory capacity were significant factors in explaining individual listener's intelligibility scores for binary mask processing applied to speech in babble. Degree of hearing loss and working memory capacity did not predict listeners' quality ratings. The results indicate that envelope fidelity is a primary factor in determining the combined effects of noise and binary mask processing for intelligibility and quality of speech presented in babble noise. Degree of hearing loss and working memory capacity are significant factors in explaining variability in listeners' speech intelligibility scores but not in quality ratings.
Accent, intelligibility, and comprehensibility in the perception of foreign-accented Lombard speech

NASA Astrophysics Data System (ADS)

Li, Chi-Nin

2003-10-01

Speech produced in noise (Lombard speech) has been reported to be more intelligible than speech produced in quiet (normal speech). This study examined the perception of non-native Lombard speech in terms of intelligibility, comprehensibility, and degree of foreign accent. Twelve Cantonese speakers and a comparison group of English speakers read simple true and false English statements in quiet and in 70 dB of masking noise. Lombard and normal utterances were mixed with noise at a constant signal-to-noise ratio, and presented along with noise-free stimuli to eight new English listeners who provided transcription scores, comprehensibility ratings, and accent ratings. Analyses showed that, as expected, utterances presented in noise were less well perceived than were noise-free sentences, and that the Cantonese speakers' productions were more accented, but less intelligible and less comprehensible than those of the English speakers. For both groups of speakers, the Lombard sentences were correctly transcribed more often than their normal utterances in noisy conditions. However, the Cantonese-accented Lombard sentences were not rated as easier to understand than was the normal speech in all conditions. The assigned accent ratings were similar throughout all listening conditions. Implications of these findings will be discussed.
Microscopic prediction of speech intelligibility in spatially distributed speech-shaped noise for normal-hearing listeners.

PubMed

Geravanchizadeh, Masoud; Fallah, Ali

2015-12-01

A binaural and psychoacoustically motivated intelligibility model, based on a well-known monaural microscopic model is proposed. This model simulates a phoneme recognition task in the presence of spatially distributed speech-shaped noise in anechoic scenarios. In the proposed model, binaural advantage effects are considered by generating a feature vector for a dynamic-time-warping speech recognizer. This vector consists of three subvectors incorporating two monaural subvectors to model the better-ear hearing, and a binaural subvector to simulate the binaural unmasking effect. The binaural unit of the model is based on equalization-cancellation theory. This model operates blindly, which means separate recordings of speech and noise are not required for the predictions. Speech intelligibility tests were conducted with 12 normal hearing listeners by collecting speech reception thresholds (SRTs) in the presence of single and multiple sources of speech-shaped noise. The comparison of the model predictions with the measured binaural SRTs, and with the predictions of a macroscopic binaural model called extended equalization-cancellation, shows that this approach predicts the intelligibility in anechoic scenarios with good precision. The square of the correlation coefficient (r(2)) and the mean-absolute error between the model predictions and the measurements are 0.98 and 0.62 dB, respectively.
Effects of Additional Low-Pass-Filtered Speech on Listening Effort for Noise-Band-Vocoded Speech in Quiet and in Noise.

PubMed

Pals, Carina; Sarampalis, Anastasios; van Dijk, Mart; Başkent, Deniz

2018-05-11

Residual acoustic hearing in electric-acoustic stimulation (EAS) can benefit cochlear implant (CI) users in increased sound quality, speech intelligibility, and improved tolerance to noise. The goal of this study was to investigate whether the low-pass-filtered acoustic speech in simulated EAS can provide the additional benefit of reducing listening effort for the spectrotemporally degraded signal of noise-band-vocoded speech. Listening effort was investigated using a dual-task paradigm as a behavioral measure, and the NASA Task Load indeX as a subjective self-report measure. The primary task of the dual-task paradigm was identification of sentences presented in three experiments at three fixed intelligibility levels: at near-ceiling, 50%, and 79% intelligibility, achieved by manipulating the presence and level of speech-shaped noise in the background. Listening effort for the primary intelligibility task was reflected in the performance on the secondary, visual response time task. Experimental speech processing conditions included monaural or binaural vocoder, with added low-pass-filtered speech (to simulate EAS) or without (to simulate CI). In Experiment 1, in quiet with intelligibility near-ceiling, additional low-pass-filtered speech reduced listening effort compared with binaural vocoder, in line with our expectations, although not compared with monaural vocoder. In Experiments 2 and 3, for speech in noise, added low-pass-filtered speech allowed the desired intelligibility levels to be reached at less favorable speech-to-noise ratios, as expected. It is interesting that this came without the cost of increased listening effort usually associated with poor speech-to-noise ratios; at 50% intelligibility, even a reduction in listening effort on top of the increased tolerance to noise was observed. The NASA Task Load indeX did not capture these differences. The dual-task results provide partial evidence for a potential decrease in listening effort as a result of adding low-frequency acoustic speech to noise-band-vocoded speech. Whether these findings translate to CI users with residual acoustic hearing will need to be addressed in future research because the quality and frequency range of low-frequency acoustic sound available to listeners with hearing loss may differ from our idealized simulations, and additional factors, such as advanced age and varying etiology, may also play a role.This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CCBY-NC-ND), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal.
Spatial Release From Masking in Simulated Cochlear Implant Users With and Without Access to Low-Frequency Acoustic Hearing

PubMed Central

Dietz, Mathias; Hohmann, Volker; Jürgens, Tim

2015-01-01

For normal-hearing listeners, speech intelligibility improves if speech and noise are spatially separated. While this spatial release from masking has already been quantified in normal-hearing listeners in many studies, it is less clear how spatial release from masking changes in cochlear implant listeners with and without access to low-frequency acoustic hearing. Spatial release from masking depends on differences in access to speech cues due to hearing status and hearing device. To investigate the influence of these factors on speech intelligibility, the present study measured speech reception thresholds in spatially separated speech and noise for 10 different listener types. A vocoder was used to simulate cochlear implant processing and low-frequency filtering was used to simulate residual low-frequency hearing. These forms of processing were combined to simulate cochlear implant listening, listening based on low-frequency residual hearing, and combinations thereof. Simulated cochlear implant users with additional low-frequency acoustic hearing showed better speech intelligibility in noise than simulated cochlear implant users without acoustic hearing and had access to more spatial speech cues (e.g., higher binaural squelch). Cochlear implant listener types showed higher spatial release from masking with bilateral access to low-frequency acoustic hearing than without. A binaural speech intelligibility model with normal binaural processing showed overall good agreement with measured speech reception thresholds, spatial release from masking, and spatial speech cues. This indicates that differences in speech cues available to listener types are sufficient to explain the changes of spatial release from masking across these simulated listener types. PMID:26721918
Talker Differences in Clear and Conversational Speech: Acoustic Characteristics of Vowels

ERIC Educational Resources Information Center

Ferguson, Sarah Hargus; Kewley-Port, Diane

2007-01-01

Purpose: To determine the specific acoustic changes that underlie improved vowel intelligibility in clear speech. Method: Seven acoustic metrics were measured for conversational and clear vowels produced by 12 talkers--6 who previously were found (S. H. Ferguson, 2004) to produce a large clear speech vowel intelligibility effect for listeners with…
Production Variability and Single Word Intelligibility in Aphasia and Apraxia of Speech

ERIC Educational Resources Information Center

Haley, Katarina L.; Martin, Gwenyth

2011-01-01

This study was designed to estimate test-retest reliability of orthographic speech intelligibility testing in speakers with aphasia and AOS and to examine its relationship to the consistency of speaker and listener responses. Monosyllabic single word speech samples were recorded from 13 speakers with coexisting aphasia and AOS. These words were…
Methods and Applications of the Audibility Index in Hearing Aid Selection and Fitting

PubMed Central

Amlani, Amyn M.; Punch, Jerry L.; Ching, Teresa Y. C.

2002-01-01

During the first half of the 20th century, communications engineers at Bell Telephone Laboratories developed the articulation model for predicting speech intelligibility transmitted through different telecommunication devices under varying electroacoustic conditions. The profession of audiology adopted this model and its quantitative aspects, known as the Articulation Index and Speech Intelligibility Index, and applied these indices to the prediction of unaided and aided speech intelligibility in hearing-impaired listeners. Over time, the calculation methods of these indices—referred to collectively in this paper as the Audibility Index—have been continually refined and simplified for clinical use. This article provides (1) an overview of the basic principles and the calculation methods of the Audibility Index, the Speech Transmission Index and related indices, as well as the Speech Recognition Sensitivity Model, (2) a review of the literature on using the Audibility Index to predict speech intelligibility of hearing-impaired listeners, (3) a review of the literature on the applicability of the Audibility Index to the selection and fitting of hearing aids, and (4) a discussion of future scientific needs and clinical applications of the Audibility Index. PMID:25425917
The Impact of Dysphonic Voices on Healthy Listeners: Listener Reaction Times, Speech Intelligibility, and Listener Comprehension.

PubMed

Evitts, Paul M; Starmer, Heather; Teets, Kristine; Montgomery, Christen; Calhoun, Lauren; Schulze, Allison; MacKenzie, Jenna; Adams, Lauren

2016-11-01

There is currently minimal information on the impact of dysphonia secondary to phonotrauma on listeners. Considering the high incidence of voice disorders with professional voice users, it is important to understand the impact of a dysphonic voice on their audiences. Ninety-one healthy listeners (39 men, 52 women; mean age = 23.62 years) were presented with speech stimuli from 5 healthy speakers and 5 speakers diagnosed with dysphonia secondary to phonotrauma. Dependent variables included processing speed (reaction time [RT] ratio), speech intelligibility, and listener comprehension. Voice quality ratings were also obtained for all speakers by 3 expert listeners. Statistical results showed significant differences between RT ratio and number of speech intelligibility errors between healthy and dysphonic voices. There was not a significant difference in listener comprehension errors. Multiple regression analyses showed that voice quality ratings from the Consensus Assessment Perceptual Evaluation of Voice (Kempster, Gerratt, Verdolini Abbott, Barkmeier-Kraemer, & Hillman, 2009) were able to predict RT ratio and speech intelligibility but not listener comprehension. Results of the study suggest that although listeners require more time to process and have more intelligibility errors when presented with speech stimuli from speakers with dysphonia secondary to phonotrauma, listener comprehension may not be affected.
Speech intelligibility at high helium-oxygen pressures.

PubMed

Rothman, H B; Gelfand, R; Hollien, H; Lambertsen, C J

1980-12-01

Word-list intelligibility scores of unprocessed speech (mean of 4 subjects) were recorded in helium-oxygen atmospheres at stable pressures equivalent to 1600, 1400, 1200, 1000, 860, 690, 560, 392, and 200 fsw daring Predictive Studies IV-1975 by wide-bandwidth condenser microphones (frequency responses not degraded by increased gas density). Intelligibility scores were substantially lower in helium-oxygen a 200 fsw than in air at l ATA, but there was little difference between 200 fsw and 1600 fsw. A previously documented prominent decrease in intelligibility of speech between 200 or 600 fsw because of helium and pressure was probably due to degradation of microphone frequency response by high gas density.
A Comparison of the Single-sided (Gen II) and Double-sided (Gen I) Combat Arms Earplugs (CAE): Acoustic Properties, Human Performance, and User Acceptance

DTIC Science & Technology

2012-12-01

results of steady-state and impulse noise attenuation objective and real-ear measurements; localization and speech intelligibility human performance... Intelligibility 12 6.1 Method...12 Figure 8. Speech intelligibility testing of Gen I CAE and Gen II CAE
The Intelligibility of Indian English. Monograph No. 4.

ERIC Educational Resources Information Center

Bansal, R. K.

Twenty-four English speakers from various regions of India were tested for the intelligibility of their speech. Recordings of speech in a variety of contexts were evaluated by listeners from the United Kingdom, the United States, Nigeria, and Germany. On the basis of the resulting intelligibility scores, factors which tend to hinder…
Disentangling syntax and intelligibility in auditory language comprehension.

PubMed

Friederici, Angela D; Kotz, Sonja A; Scott, Sophie K; Obleser, Jonas

2010-03-01

Studies of the neural basis of spoken language comprehension typically focus on aspects of auditory processing by varying signal intelligibility, or on higher-level aspects of language processing such as syntax. Most studies in either of these threads of language research report brain activation including peaks in the superior temporal gyrus (STG) and/or the superior temporal sulcus (STS), but it is not clear why these areas are recruited in functionally different studies. The current fMRI study aims to disentangle the functional neuroanatomy of intelligibility and syntax in an orthogonal design. The data substantiate functional dissociations between STS and STG in the left and right hemispheres: first, manipulations of speech intelligibility yield bilateral mid-anterior STS peak activation, whereas syntactic phrase structure violations elicit strongly left-lateralized mid STG and posterior STS activation. Second, ROI analyses indicate all interactions of speech intelligibility and syntactic correctness to be located in the left frontal and temporal cortex, while the observed right-hemispheric activations reflect less specific responses to intelligibility and syntax. Our data demonstrate that the mid-to-anterior STS activation is associated with increasing speech intelligibility, while the mid-to-posterior STG/STS is more sensitive to syntactic information within the speech. 2009 Wiley-Liss, Inc.
Cortical characterization of the perception of intelligible and unintelligible speech measured via high-density electroencephalography.

PubMed

Utianski, Rene L; Caviness, John N; Liss, Julie M

2015-01-01

High-density electroencephalography was used to evaluate cortical activity during speech comprehension via a sentence verification task. Twenty-four participants assigned true or false to sentences produced with 3 noise-vocoded channel levels (1--unintelligible, 6--decipherable, 16--intelligible), during simultaneous EEG recording. Participant data were sorted into higher- (HP) and lower-performing (LP) groups. The identification of a late-event related potential for LP listeners in the intelligible condition and in all listeners when challenged with a 6-Ch signal supports the notion that this induced potential may be related to either processing degraded speech, or degraded processing of intelligible speech. Different cortical locations are identified as neural generators responsible for this activity; HP listeners are engaging motor aspects of their language system, utilizing an acoustic-phonetic based strategy to help resolve the sentence, while LP listeners do not. This study presents evidence for neurophysiological indices associated with more or less successful speech comprehension performance across listening conditions. Copyright © 2014 Elsevier Inc. All rights reserved.
Reduced efficiency of audiovisual integration for nonnative speech.

PubMed

Yi, Han-Gyol; Phelps, Jasmine E B; Smiljanic, Rajka; Chandrasekaran, Bharath

2013-11-01

The role of visual cues in native listeners' perception of speech produced by nonnative speakers has not been extensively studied. Native perception of English sentences produced by native English and Korean speakers in audio-only and audiovisual conditions was examined. Korean speakers were rated as more accented in audiovisual than in the audio-only condition. Visual cues enhanced word intelligibility for native English speech but less so for Korean-accented speech. Reduced intelligibility of Korean-accented audiovisual speech was associated with implicit visual biases, suggesting that listener-related factors partially influence the efficiency of audiovisual integration for nonnative speech perception.
Effects of speech intelligibility level on concurrent visual task performance.

PubMed

Payne, D G; Peters, L J; Birkmire, D P; Bonto, M A; Anastasi, J S; Wenger, M J

1994-09-01

Four experiments were performed to determine if changes in the level of speech intelligibility in an auditory task have an impact on performance in concurrent visual tasks. The auditory task used in each experiment was a memory search task in which subjects memorized a set of words and then decided whether auditorily presented probe items were members of the memorized set. The visual tasks used were an unstable tracking task, a spatial decision-making task, a mathematical reasoning task, and a probability monitoring task. Results showed that performance on the unstable tracking and probability monitoring tasks was unaffected by the level of speech intelligibility on the auditory task, whereas accuracy in the spatial decision-making and mathematical processing tasks was significantly worse at low speech intelligibility levels. The findings are interpreted within the framework of multiple resource theory.
Speech intelligibility in noise using throat and acoustic microphones.

PubMed

Acker-Mills, Barbara E; Houtsma, Adrianus J M; Ahroon, William A

2006-01-01

Helicopter cockpits are very noisy and this noise must be reduced for effective communication. The standard U.S. Army aviation helmet is equipped with a noise-canceling acoustic microphone, but some ambient noise still is transmitted. Throat microphones are not sensitive to air molecule vibrations and thus, transmittal of ambient noise is reduced. It is possible that throat microphones could enhance speech communication in helicopters, but speech intelligibility with the devices must first be assessed. In the current study, speech intelligibility of signals generated by an acoustic microphone, a throat microphone, and by the combined output of the two microphones was assessed using the Modified Rhyme Test (MRT). Stimulus words were recorded in a reverberant chamber with ambient broadband noise intensity at 90 and 106 dBA. Listeners completed the MRT task in the same settings, thus simulating the typical environment of a rotary-wing aircraft. Results show that speech intelligibility is significantly worse for the throat microphone (average percent correct = 55.97) than for the acoustic microphone (average percent correct = 69.70), particularly for the higher noise level. In addition, no benefit is gained by simultaneously using both microphones. A follow-up experiment evaluated different consonants using the Diagnostic Rhyme Test and replicated the MRT results. The current results show that intelligibility using throat microphones is poorer than with the use of boom microphones in noisy and in quiet environments. Therefore, throat microphones are not recommended for use in any situation where fast and accurate speech intelligibility is essential.

Contributions of cochlea-scaled entropy and consonant-vowel boundaries to prediction of speech intelligibility in noise

PubMed Central

Chen, Fei; Loizou, Philipos C.

2012-01-01

Recent evidence suggests that spectral change, as measured by cochlea-scaled entropy (CSE), predicts speech intelligibility better than the information carried by vowels or consonants in sentences. Motivated by this finding, the present study investigates whether intelligibility indices implemented to include segments marked with significant spectral change better predict speech intelligibility in noise than measures that include all phonetic segments paying no attention to vowels/consonants or spectral change. The prediction of two intelligibility measures [normalized covariance measure (NCM), coherence-based speech intelligibility index (CSII)] is investigated using three sentence-segmentation methods: relative root-mean-square (RMS) levels, CSE, and traditional phonetic segmentation of obstruents and sonorants. While the CSE method makes no distinction between spectral changes occurring within vowels/consonants, the RMS-level segmentation method places more emphasis on the vowel-consonant boundaries wherein the spectral change is often most prominent, and perhaps most robust, in the presence of noise. Higher correlation with intelligibility scores was obtained when including sentence segments containing a large number of consonant-vowel boundaries than when including segments with highest entropy or segments based on obstruent/sonorant classification. These data suggest that in the context of intelligibility measures the type of spectral change captured by the measure is important. PMID:22559382
A laboratory study for assessing speech privacy in a simulated open-plan office.

PubMed

Lee, P J; Jeon, J Y

2014-06-01

The aim of this study is to assess speech privacy in open-plan office using two recently introduced single-number quantities: the spatial decay rate of speech, DL(2,S) [dB], and the A-weighted sound pressure level of speech at a distance of 4 m, L(p,A,S,4) m [dB]. Open-plan offices were modeled using a DL(2,S) of 4, 8, and 12 dB, and L(p,A,S,4) m was changed in three steps, from 43 to 57 dB.Auditory experiments were conducted at three locations with source–receiver distances of 8, 16, and 24 m, while background noise level was fixed at 30 dBA.A total of 20 subjects were asked to rate the speech intelligibility and listening difficulty of 240 Korean sentences in such surroundings. The speech intelligibility scores were not affected by DL(2,S) or L(p,A,S,4) m at a source–receiver distance of 8 m; however, listening difficulty ratings were significantly changed with increasing DL(2,S) and L(p,A,S,4) m values. At other locations, the influences of DL(2,S) and L(p,A,S,4) m on speech intelligibility and listening difficulty ratings were significant. It was also found that the speech intelligibility scores and listening difficulty ratings were considerably changed with increasing the distraction distance (r(D)). Furthermore, listening difficulty is more sensitive to variations in DL(2,S) and L(p,A,S,4) m than intelligibility scores for sound fields with high speech transmission performances. The recently introduced single-number quantities in the ISO standard, based on the spatial distribution of sound pressure level, were associated with speech privacy in an open-plan office. The results support single-number quantities being suitable to assess speech privacy, mainly at large distances. This new information can be considered when designing open-plan offices and making acoustic guidelines of open-plan offices.
Speech-on-speech masking with variable access to the linguistic content of the masker speech for native and non-native speakers of English

PubMed Central

Calandruccio, Lauren; Bradlow, Ann R.; Dhar, Sumitrajit

2013-01-01

Background Masking release for an English sentence-recognition task in the presence of foreign-accented English speech compared to native-accented English speech was reported in Calandruccio, Dhar and Bradlow (2010). The masking release appeared to increase as the masker intelligibility decreased. However, it could not be ruled out that spectral differences between the speech maskers were influencing the significant differences observed. Purpose The purpose of the current experiment was to minimize spectral differences between speech maskers to determine how various amounts of linguistic information within competing speech affect masking release. Research Design A mixed model design with within- (four two-talker speech maskers) and between-subject (listener group) factors was conducted. Speech maskers included native-accented English speech, and high-intelligibility, moderate-intelligibility and low-intelligibility Mandarin-accented English. Normalizing the long-term average speech spectra of the maskers to each other minimized spectral differences between the masker conditions. Study Sample Three listener groups were tested including monolingual English speakers with normal hearing, non-native speakers of English with normal hearing, and monolingual speakers of English with hearing loss. The non-native speakers of English were from various native-language backgrounds, not including Mandarin (or any other Chinese dialect). Listeners with hearing loss had symmetrical, mild sloping to moderate sensorineural hearing loss. Data Collection and Analysis Listeners were asked to repeat back sentences that were presented in the presence of four different two-talker speech maskers. Responses were scored based on the keywords within the sentences (100 keywords/masker condition). A mixed-model regression analysis was used to analyze the difference in performance scores between the masker conditions and the listener groups. Results Monolingual speakers of English with normal hearing benefited when the competing speech signal was foreign-accented compared to native-accented allowing for improved speech recognition. Various levels of intelligibility across the foreign-accented speech maskers did not influence results. Neither the non-native English listeners with normal hearing, nor the monolingual English speakers with hearing loss benefited from masking release when the masker was changed from native-accented to foreign-accented English. Conclusions Slight modifications between the target and the masker speech allowed monolingual speakers of English with normal hearing to improve their recognition of native-accented English even when the competing speech was highly intelligible. Further research is needed to determine which modifications within the competing speech signal caused the Mandarin-accented English to be less effective with respect to masking. Determining the influences within the competing speech that make it less effective as a masker, or determining why monolingual normal-hearing listeners can take advantage of these differences could help improve speech recognition for those with hearing loss in the future. PMID:25126683
Effects of Loud and Amplified Speech on Sentence and Word Intelligibility in Parkinson Disease

ERIC Educational Resources Information Center

Neel, Amy T.

2009-01-01

Purpose: In the two experiments in this study, the author examined the effects of increased vocal effort (loud speech) and amplification on sentence and word intelligibility in speakers with Parkinson disease (PD). Methods: Five talkers with PD produced sentences and words at habitual levels of effort and using loud speech techniques. Amplified…
Acoustic Changes in the Speech of Children with Cerebral Palsy Following an Intensive Program of Dysarthria Therapy

ERIC Educational Resources Information Center

Pennington, Lindsay; Lombardo, Eftychia; Steen, Nick; Miller, Nick

2018-01-01

Background: The speech intelligibility of children with dysarthria and cerebral palsy has been observed to increase following therapy focusing on respiration and phonation. Aims: To determine if speech intelligibility change following intervention is associated with change in acoustic measures of voice. Methods & Procedures: We recorded 16…
The Benefits of Bimodal Aiding on Extended Dimensions of Speech Perception: Intelligibility, Listening Effort, and Sound Quality

PubMed Central

Chalupper, Josef

2017-01-01

The benefits of combining a cochlear implant (CI) and a hearing aid (HA) in opposite ears on speech perception were examined in 15 adult unilateral CI recipients who regularly use a contralateral HA. A within-subjects design was carried out to assess speech intelligibility testing, listening effort ratings, and a sound quality questionnaire for the conditions CI alone, CIHA together, and HA alone when applicable. The primary outcome of bimodal benefit, defined as the difference between CIHA and CI, was statistically significant for speech intelligibility in quiet as well as for intelligibility in noise across tested spatial conditions. A reduction in effort on top of intelligibility at the highest tested signal-to-noise ratio was found. Moreover, the bimodal listening situation was rated to sound more voluminous, less tinny, and less unpleasant than CI alone. Listening effort and sound quality emerged as feasible and relevant measures to demonstrate bimodal benefit across a clinically representative range of bimodal users. These extended dimensions of speech perception can shed more light on the array of benefits provided by complementing a CI with a contralateral HA. PMID:28874096
The Influence of Noise Reduction on Speech Intelligibility, Response Times to Speech, and Perceived Listening Effort in Normal-Hearing Listeners.

PubMed

van den Tillaart-Haverkate, Maj; de Ronde-Brons, Inge; Dreschler, Wouter A; Houben, Rolph

2017-01-01

Single-microphone noise reduction leads to subjective benefit, but not to objective improvements in speech intelligibility. We investigated whether response times (RTs) provide an objective measure of the benefit of noise reduction and whether the effect of noise reduction is reflected in rated listening effort. Twelve normal-hearing participants listened to digit triplets that were either unprocessed or processed with one of two noise-reduction algorithms: an ideal binary mask (IBM) and a more realistic minimum mean square error estimator (MMSE). For each of these three processing conditions, we measured (a) speech intelligibility, (b) RTs on two different tasks (identification of the last digit and arithmetic summation of the first and last digit), and (c) subjective listening effort ratings. All measurements were performed at four signal-to-noise ratios (SNRs): -5, 0, +5, and +∞ dB. Speech intelligibility was high (>97% correct) for all conditions. A significant decrease in response time, relative to the unprocessed condition, was found for both IBM and MMSE for the arithmetic but not the identification task. Listening effort ratings were significantly lower for IBM than for MMSE and unprocessed speech in noise. We conclude that RT for an arithmetic task can provide an objective measure of the benefit of noise reduction. For young normal-hearing listeners, both ideal and realistic noise reduction can reduce RTs at SNRs where speech intelligibility is close to 100%. Ideal noise reduction can also reduce perceived listening effort.
Associations between speech features and phenotypic severity in Treacher Collins syndrome

PubMed Central

2014-01-01

Background Treacher Collins syndrome (TCS, OMIM 154500) is a rare congenital disorder of craniofacial development. Characteristic hypoplastic malformations of the ears, zygomatic arch, mandible and pharynx have been described in detail. However, reports on the impact of these malformations on speech are few. Exploring speech features and investigating if speech function is related to phenotypic severity are essential for optimizing follow-up and treatment. Methods Articulation, nasal resonance, voice and intelligibility were examined in 19 individuals (5–74 years, median 34 years) divided into three groups comprising children 5–10 years (n = 4), adolescents 11–18 years (n = 4) and adults 29 years and older (n = 11). A speech composite score (0–6) was calculated to reflect the variability of speech deviations. TCS severity scores of phenotypic expression and total scores of Nordic Orofacial Test-Screening (NOT-S) measuring orofacial dysfunction were used in analyses of correlation with speech characteristics (speech composite scores). Results Children and adolescents presented with significantly higher speech composite scores (median 4, range 1–6) than adults (median 1, range 0–5). Nearly all children and adolescents (6/8) displayed speech deviations of articulation, nasal resonance and voice, while only three adults were identified with multiple speech aberrations. The variability of speech dysfunction in TCS was exhibited by individual combinations of speech deviations in 13/19 participants. The speech composite scores correlated with TCS severity scores and NOT-S total scores. Speech composite scores higher than 4 were associated with cleft palate. The percent of intelligible words in connected speech was significantly lower in children and adolescents (median 77%, range 31–99) than in adults (98%, range 93–100). Intelligibility of speech among the children was markedly inconsistent and clearly affecting the understandability. Conclusions Multiple speech deviations were identified in children, adolescents and a subgroup of adults with TCS. Only children displayed markedly reduced intelligibility. Speech was significantly correlated with phenotypic severity of TCS and orofacial dysfunction. Follow-up and treatment of speech should still be focused on young patients, but some adults with TCS seem to require continuing speech and language pathology services. PMID:24775909
Associations between speech features and phenotypic severity in Treacher Collins syndrome.

PubMed

Asten, Pamela; Akre, Harriet; Persson, Christina

2014-04-28

Treacher Collins syndrome (TCS, OMIM 154500) is a rare congenital disorder of craniofacial development. Characteristic hypoplastic malformations of the ears, zygomatic arch, mandible and pharynx have been described in detail. However, reports on the impact of these malformations on speech are few. Exploring speech features and investigating if speech function is related to phenotypic severity are essential for optimizing follow-up and treatment. Articulation, nasal resonance, voice and intelligibility were examined in 19 individuals (5-74 years, median 34 years) divided into three groups comprising children 5-10 years (n = 4), adolescents 11-18 years (n = 4) and adults 29 years and older (n = 11). A speech composite score (0-6) was calculated to reflect the variability of speech deviations. TCS severity scores of phenotypic expression and total scores of Nordic Orofacial Test-Screening (NOT-S) measuring orofacial dysfunction were used in analyses of correlation with speech characteristics (speech composite scores). Children and adolescents presented with significantly higher speech composite scores (median 4, range 1-6) than adults (median 1, range 0-5). Nearly all children and adolescents (6/8) displayed speech deviations of articulation, nasal resonance and voice, while only three adults were identified with multiple speech aberrations. The variability of speech dysfunction in TCS was exhibited by individual combinations of speech deviations in 13/19 participants. The speech composite scores correlated with TCS severity scores and NOT-S total scores. Speech composite scores higher than 4 were associated with cleft palate. The percent of intelligible words in connected speech was significantly lower in children and adolescents (median 77%, range 31-99) than in adults (98%, range 93-100). Intelligibility of speech among the children was markedly inconsistent and clearly affecting the understandability. Multiple speech deviations were identified in children, adolescents and a subgroup of adults with TCS. Only children displayed markedly reduced intelligibility. Speech was significantly correlated with phenotypic severity of TCS and orofacial dysfunction. Follow-up and treatment of speech should still be focused on young patients, but some adults with TCS seem to require continuing speech and language pathology services.
Association of Velopharyngeal Insufficiency With Quality of Life and Patient-Reported Outcomes After Speech Surgery.

PubMed

Bhuskute, Aditi; Skirko, Jonathan R; Roth, Christina; Bayoumi, Ahmed; Durbin-Johnson, Blythe; Tollefson, Travis T

2017-09-01

Patients with cleft palate and other causes of velopharyngeal insufficiency (VPI) suffer adverse effects on social interactions and communication. Measurement of these patient-reported outcomes is needed to help guide surgical and nonsurgical care. To further validate the VPI Effects on Life Outcomes (VELO) instrument, measure the change in quality of life (QOL) after speech surgery, and test the association of change in speech with change in QOL. Prospective descriptive cohort including children and young adults undergoing speech surgery for VPI in a tertiary academic center. Participants completed the validated VELO instrument before and after surgical treatment. The main outcome measures were preoperative and postoperative VELO scores and the perceptual speech assessment of speech intelligibility. The VELO scores are divided into subscale domains. Changes in VELO after surgery were analyzed using linear regression models. VELO scores were analyzed as a function of speech intelligibility adjusting for age and cleft type. The correlation between speech intelligibility rating and VELO scores was estimated using the polyserial correlation. Twenty-nine patients (13 males and 16 females) were included. Mean (SD) age was 7.9 (4.1) years (range, 4-20 years). Pharyngeal flap was used in 14 (48%) cases, Furlow palatoplasty in 12 (41%), and sphincter pharyngoplasty in 1 (3%). The mean (SD) preoperative speech intelligibility rating was 1.71 (1.08), which decreased postoperatively to 0.79 (0.93) in 24 patients who completed protocol (P < .01). The VELO scores improved after surgery (P<.001) as did most subscale scores. Caregiver impact did not change after surgery (P = .36). Speech Intelligibility was correlated with preoperative and postoperative total VELO score (P < .01) and to preoperative subscale domains (situational difficulty [VELO-SiD, P = .005] and perception by others [VELO-PO, P = .05]) and postoperative subscale domains (VELO-SiD [P = .03], VELO-PO [P = .003]). Neither the VELO total nor subscale score change after surgery was correlated with change in speech intelligibility. Speech surgery improves VPI-specific quality of life. We confirmed validation in a population of untreated patients with VPI and included pharyngeal flap surgery, which had not previously been included in validation studies. The VELO instrument provides patient-specific outcomes, which allows a broader understanding of the social, emotional, and physical effects of VPI. 2.
Acoustic properties of naturally produced clear speech at normal speaking rates

NASA Astrophysics Data System (ADS)

Krause, Jean C.; Braida, Louis D.

2004-01-01

Sentences spoken ``clearly'' are significantly more intelligible than those spoken ``conversationally'' for hearing-impaired listeners in a variety of backgrounds [Picheny et al., J. Speech Hear. Res. 28, 96-103 (1985); Uchanski et al., ibid. 39, 494-509 (1996); Payton et al., J. Acoust. Soc. Am. 95, 1581-1592 (1994)]. While producing clear speech, however, talkers often reduce their speaking rate significantly [Picheny et al., J. Speech Hear. Res. 29, 434-446 (1986); Uchanski et al., ibid. 39, 494-509 (1996)]. Yet speaking slowly is not solely responsible for the intelligibility benefit of clear speech (over conversational speech), since a recent study [Krause and Braida, J. Acoust. Soc. Am. 112, 2165-2172 (2002)] showed that talkers can produce clear speech at normal rates with training. This finding suggests that clear speech has inherent acoustic properties, independent of rate, that contribute to improved intelligibility. Identifying these acoustic properties could lead to improved signal processing schemes for hearing aids. To gain insight into these acoustical properties, conversational and clear speech produced at normal speaking rates were analyzed at three levels of detail (global, phonological, and phonetic). Although results suggest that talkers may have employed different strategies to achieve clear speech at normal rates, two global-level properties were identified that appear likely to be linked to the improvements in intelligibility provided by clear/normal speech: increased energy in the 1000-3000-Hz range of long-term spectra and increased modulation depth of low frequency modulations of the intensity envelope. Other phonological and phonetic differences associated with clear/normal speech include changes in (1) frequency of stop burst releases, (2) VOT of word-initial voiceless stop consonants, and (3) short-term vowel spectra.
An Alternative to the Computational Speech Intelligibility Index Estimates: Direct Measurement of Rectangular Passband Intelligibilities

ERIC Educational Resources Information Center

Warren, Richard M.; Bashford, James A., Jr.; Lenz, Peter W.

2011-01-01

The need for determining the relative intelligibility of passbands spanning the speech spectrum has been addressed by publications of the American National Standards Institute (ANSI). When the Articulation Index (AI) standard (ANSI, S3.5, 1969, R1986) was developed, available filters confounded passband and slope contributions. The AI procedure…
Intelligibility of Digital Speech Masked by Noise: Normal Hearing and Hearing Impaired Listeners

DTIC Science & Technology

1990-06-01

spectrograms of these phrases were generated by a List 13 Processing Language (LISP) on a Symbolics 3670 artificial intelligence computer (see Figure 10). The...speech and the amount of difference varies with the type of vocoder. 26 ADPC INTELIGIBILITY AND TOE OF MAING 908 78- INTELLIGIBILITY 48 LI OS NORMA 30
Joint Service Aircrew Mask (JSAM) - Tactical Aircraft (TA) A/P22P-14A Respirator Assembly (V)5: Speech Intelligibility Performance with Double Hearing Protection, HGU-84/P Flight Helmet

DTIC Science & Technology

2017-04-06

Pressure Level (SPL) background pink noise. The speech intelligibility tests shall result in a Modified Rhyme Test (MRT) score as listed below...Speech intelligibility testing shall be measured per ANSI S3.2 for each background pink noise level using a minimum of ten talkers and of ten...listeners. The test shall be conducted wearing the JSAM-TA using appropriate communication 6 DISTRIBUTION STATEMENT A: Approved for public release
Joint Service Aircrew Mask (JSAM) - Tactical Aircraft (TA) A/P22P-14A Respirator Assembly (V)3: Noise Attenuation and Speech Intelligibility Performance with Double Hearing Protection, HGU-68/P Flight Helmet

DTIC Science & Technology

2017-03-31

dB Sound Pressure Level (SPL) background pink noise. The speech intelligibility tests shall result in a Modified Rhyme Test (MRT) score as listed...below. Speech intelligibility testing shall be measured per ANSI S3.2 for each background pink noise level using a minimum of ten talkers and of ten...listeners. The test shall be conducted wearing the JSAM-TA using appropriate communication amplification. Test must include the configurations
Temporal Resolution Needed for Auditory Communication: Measurement With Mosaic Speech

PubMed Central

Nakajima, Yoshitaka; Matsuda, Mizuki; Ueda, Kazuo; Remijn, Gerard B.

2018-01-01

Temporal resolution needed for Japanese speech communication was measured. A new experimental paradigm that can reflect the spectro-temporal resolution necessary for healthy listeners to perceive speech is introduced. As a first step, we report listeners' intelligibility scores of Japanese speech with a systematically degraded temporal resolution, so-called “mosaic speech”: speech mosaicized in the coordinates of time and frequency. The results of two experiments show that mosaic speech cut into short static segments was almost perfectly intelligible with a temporal resolution of 40 ms or finer. Intelligibility dropped for a temporal resolution of 80 ms, but was still around 50%-correct level. The data are in line with previous results showing that speech signals separated into short temporal segments of <100 ms can be remarkably robust in terms of linguistic-content perception against drastic manipulations in each segment, such as partial signal omission or temporal reversal. The human perceptual system thus can extract meaning from unexpectedly rough temporal information in speech. The process resembles that of the visual system stringing together static movie frames of ~40 ms into vivid motion. PMID:29740295
A music perception disorder (congenital amusia) influences speech comprehension.

PubMed

Liu, Fang; Jiang, Cunmei; Wang, Bei; Xu, Yi; Patel, Aniruddh D

2015-01-01

This study investigated the underlying link between speech and music by examining whether and to what extent congenital amusia, a musical disorder characterized by degraded pitch processing, would impact spoken sentence comprehension for speakers of Mandarin, a tone language. Sixteen Mandarin-speaking amusics and 16 matched controls were tested on the intelligibility of news-like Mandarin sentences with natural and flat fundamental frequency (F0) contours (created via speech resynthesis) under four signal-to-noise (SNR) conditions (no noise, +5, 0, and -5dB SNR). While speech intelligibility in quiet and extremely noisy conditions (SNR=-5dB) was not significantly compromised by flattened F0, both amusic and control groups achieved better performance with natural-F0 sentences than flat-F0 sentences under moderately noisy conditions (SNR=+5 and 0dB). Relative to normal listeners, amusics demonstrated reduced speech intelligibility in both quiet and noise, regardless of whether the F0 contours of the sentences were natural or flattened. This deficit in speech intelligibility was not associated with impaired pitch perception in amusia. These findings provide evidence for impaired speech comprehension in congenital amusia, suggesting that the deficit of amusics extends beyond pitch processing and includes segmental processing. Copyright © 2014 Elsevier Ltd. All rights reserved.
Long term rehabilitation of a total glossectomy patient.

PubMed

Bachher, Gurmit Kaur; Dholam, Kanchan P

2010-09-01

Malignant tumours of the oral cavity that require resection of the tongue result in severe deficiencies in speech and deglutition. Speech misarticulation leads to loss of speech intelligibility, which can prevent or limit communication. Prosthodontic rehabilitation involves fabrication of a Palatal Augmentation Prosthesis (PAP) following partial glossectomy and a mandibular tongue prosthesis after total glossectomy [1]. Speech analysis of a total glossectmy patient rehabilitated with a tongue prosthesis was done with the help of Dr. Speech Software Version 4 (Tiger DRS, Inc., Seattle) twelve years after treatment. Speech therapy sessions along with a prosthesis helped him to correct the dental sounds by using the lower lip and upper dentures (labio-dentals). It was noticed that speech intelligibility, intonation pattern, speech articulation and overall loudness was noticeably improved.
Envelope and intensity based prediction of psychoacoustic masking and speech intelligibility.

PubMed

Biberger, Thomas; Ewert, Stephan D

2016-08-01

Human auditory perception and speech intelligibility have been successfully described based on the two concepts of spectral masking and amplitude modulation (AM) masking. The power-spectrum model (PSM) [Patterson and Moore (1986). Frequency Selectivity in Hearing, pp. 123-177] accounts for effects of spectral masking and critical bandwidth, while the envelope power-spectrum model (EPSM) [Ewert and Dau (2000). J. Acoust. Soc. Am. 108, 1181-1196] has been successfully applied to AM masking and discrimination. Both models extract the long-term (envelope) power to calculate signal-to-noise ratios (SNR). Recently, the EPSM has been applied to speech intelligibility (SI) considering the short-term envelope SNR on various time scales (multi-resolution speech-based envelope power-spectrum model; mr-sEPSM) to account for SI in fluctuating noise [Jørgensen, Ewert, and Dau (2013). J. Acoust. Soc. Am. 134, 436-446]. Here, a generalized auditory model is suggested combining the classical PSM and the mr-sEPSM to jointly account for psychoacoustics and speech intelligibility. The model was extended to consider the local AM depth in conditions with slowly varying signal levels, and the relative role of long-term and short-term SNR was assessed. The suggested generalized power-spectrum model is shown to account for a large variety of psychoacoustic data and to predict speech intelligibility in various types of background noise.
[Communication and noise. Speech intelligibility of airplane pilots with and without active noise compensation].

PubMed

Matschke, R G

1994-08-01

Noise exposure measurements were performed with pilots of the German Federal Navy during flight situations. The ambient noise levels during regular flight were maintained at levels above a 90 dB A-weighted level. This noise intensity requires wearing ear protection to avoid sound-induced hearing loss. To be able to understand radio communication (ATC) in spite of a noisy environment, headphone volume must be raised above the noise of the engines. The use of ear plugs in addition to the headsets and flight helmets is only of limited value because personal ear protection affects the intelligibility of ATC. Whereas speech intelligibility of pilots with normal hearing is affected to only a smaller degree, pilots with pre-existing high-frequency hearing losses show substantial impairments of speech intelligibility that vary in proportion to the hearing deficit present. Communication abilities can be reduced drastically, which in turn can affect air traffic security. The development of active noise compensation devices (ANC) that make use of the "anti-noise" principle may be a solution to this dilemma. To evaluate the effectiveness of an ANC-system and its influence on speech intelligibility, speech audiometry was performed with a German standardized test during simulated flight conditions with helicopter pilots. Results demonstrate the helpful effect on speech understanding especially for pilots with noise-induced hearing losses. This may help to avoid pre-retirement professional disability.

Binaural enhancement for bilateral cochlear implant users.

PubMed

Brown, Christopher A

2014-01-01

Bilateral cochlear implant (BCI) users receive limited binaural cues and, thus, show little improvement to speech intelligibility from spatial cues. The feasibility of a method for enhancing the binaural cues available to BCI users is investigated. This involved extending interaural differences of levels, which typically are restricted to high frequencies, into the low-frequency region. Speech intelligibility was measured in BCI users listening over headphones and with direct stimulation, with a target talker presented to one side of the head in the presence of a masker talker on the other side. Spatial separation was achieved by applying either naturally occurring binaural cues or enhanced cues. In this listening configuration, BCI patients showed greater speech intelligibility with the enhanced binaural cues than with naturally occurring binaural cues. In some situations, it is possible for BCI users to achieve greater speech intelligibility when binaural cues are enhanced by applying interaural differences of levels in the low-frequency region.
Influence of auditory fatigue on masked speech intelligibility

NASA Technical Reports Server (NTRS)

Parker, D. E.; Martens, W. L.; Johnston, P. A.

1980-01-01

Intelligibility of PB word lists embedded in simultaneous masking noise was evaluated before and after fatiguing-noise exposure, which was determined by observing the number of words correctly repeated during a shadowing task. Both the speech signal and the masking noise were filtered to a 2825-3185-Hz band. Masking-noise leves were varied from 0- to 90-dB SL. Fatigue was produced by a 1500-3000-Hz octave band of noise at 115 dB (re 20 micron-Pa) presented continuously for 5 min. The results of three experiments indicated that speed intelligibility was reduced when the speech was presented against a background of silence but that the fatiguing-noise exposure had no effect on intelligibility when the speech was made more intense and embedded in masking noise of 40-90-dB SL. These observations are interpreted by considering the recruitment produced by fatigue and masking noise.
Talker Differences in Clear and Conversational Speech: Vowel Intelligibility for Older Adults with Hearing Loss

ERIC Educational Resources Information Center

Ferguson, Sarah Hargus

2012-01-01

Purpose: To establish the range of talker variability for vowel intelligibility in clear versus conversational speech for older adults with hearing loss and to determine whether talkers who produced a clear speech benefit for young listeners with normal hearing also did so for older adults with hearing loss. Method: Clear and conversational vowels…
Perceptual Measures of Speech from Individuals with Parkinson's Disease and Multiple Sclerosis: Intelligibility and beyond

ERIC Educational Resources Information Center

Sussman, Joan E.; Tjaden, Kris

2012-01-01

Purpose: The primary purpose of this study was to compare percent correct word and sentence intelligibility scores for individuals with multiple sclerosis (MS) and Parkinson's disease (PD) with scaled estimates of speech severity obtained for a reading passage. Method: Speech samples for 78 talkers were judged, including 30 speakers with MS, 16…
Internationally comparable screening tests for listening in noise in several European languages: the German digit triplet test as an optimization prototype.

PubMed

Zokoll, Melanie A; Wagener, Kirsten C; Brand, Thomas; Buschermöhle, Michael; Kollmeier, Birger

2012-09-01

A review is given of internationally comparable speech-in-noise tests for hearing screening purposes that were part of the European HearCom project. This report describes the development, optimization, and evaluation of such tests for headphone and telephone presentation, using the example of the German digit triplet test. In order to achieve the highest possible comparability, language- and speaker-dependent factors in speech intelligibility should be compensated for. The tests comprise spoken numbers in background noise and estimate the speech reception threshold (SRT), i.e. the signal-to-noise ratio (SNR) yielding 50% speech intelligibility. The respective reference speech intelligibility functions for headphone and telephone presentation of the German version for 15 and 10 normal-hearing listeners are described by a SRT of -9.3 ± 0.2 and -6.5 ± 0.4 dB SNR, and slopes of 19.6 and 17.9%/dB, respectively. Reference speech intelligibility functions of all digit triplet tests optimized within the HearCom project allow for investigation of the comparability due to language specificities. The optimization criteria established here should be used for similar screening tests in other languages.
Effects of reverberation and noise on speech intelligibility in normal-hearing and aided hearing-impaired listeners.

PubMed

Xia, Jing; Xu, Buye; Pentony, Shareka; Xu, Jingjing; Swaminathan, Jayaganesh

2018-03-01

Many hearing-aid wearers have difficulties understanding speech in reverberant noisy environments. This study evaluated the effects of reverberation and noise on speech recognition in normal-hearing listeners and hearing-impaired listeners wearing hearing aids. Sixteen typical acoustic scenes with different amounts of reverberation and various types of noise maskers were simulated using a loudspeaker array in an anechoic chamber. Results showed that, across all listening conditions, speech intelligibility of aided hearing-impaired listeners was poorer than normal-hearing counterparts. Once corrected for ceiling effects, the differences in the effects of reverberation on speech intelligibility between the two groups were much smaller. This suggests that, at least, part of the difference in susceptibility to reverberation between normal-hearing and hearing-impaired listeners was due to ceiling effects. Across both groups, a complex interaction between the noise characteristics and reverberation was observed on the speech intelligibility scores. Further fine-grained analyses of the perception of consonants showed that, for both listener groups, final consonants were more susceptible to reverberation than initial consonants. However, differences in the perception of specific consonant features were observed between the groups.
Speech and communication in Parkinson’s disease: a cross-sectional exploratory study in the UK

PubMed Central

Barnish, Maxwell S; Horton, Simon M C; Butterfint, Zoe R; Clark, Allan B; Atkinson, Rachel A; Deane, Katherine H O

2017-01-01

Objective To assess associations between cognitive status, intelligibility, acoustics and functional communication in PD. Design Cross-sectional exploratory study of functional communication, including a within-participants experimental design for listener assessment. Setting A major academic medical centre in the East of England, UK. Participants Questionnaire data were assessed for 45 people with Parkinson’s disease (PD), who had self-reported speech or communication difficulties and did not have clinical dementia. Acoustic and listener analyses were conducted on read and conversational speech for 20 people with PD and 20 familiar conversation partner controls without speech, language or cognitive difficulties. Main outcome measures Functional communication assessed by the Communicative Participation Item Bank (CPIB) and Communicative Effectiveness Survey (CES). Results People with PD had lower intelligibility than controls for both the read (mean difference 13.7%, p=0.009) and conversational (mean difference 16.2%, p=0.04) sentences. Intensity and pause were statistically significant predictors of intelligibility in read sentences. Listeners were less accurate identifying the intended emotion in the speech of people with PD (14.8% point difference across conditions, p=0.02) and this was associated with worse speaker cognitive status (16.7% point difference, p=0.04). Cognitive status was a significant predictor of functional communication using CPIB (F=8.99, p=0.005, η2 = 0.15) but not CES. Intelligibility in conversation sentences was a statistically significant predictor of CPIB (F=4.96, p=0.04, η2 = 0.19) and CES (F=13.65, p=0.002, η2 = 0.43). Read sentence intelligibility was not a significant predictor of either outcome. Conclusions Cognitive status was an important predictor of functional communication—the role of intelligibility was modest and limited to conversational and not read speech. Our results highlight the importance of focusing on functional communication as well as physical speech impairment in speech and language therapy (SLT) for PD. Our results could inform future trials of SLT techniques for PD. PMID:28554918
The Effect of Uni- and Bilateral Thalamic Deep Brain Stimulation on Speech in Patients With Essential Tremor: Acoustics and Intelligibility.

PubMed

Becker, Johannes; Barbe, Michael T; Hartinger, Mariam; Dembek, Till A; Pochmann, Jil; Wirths, Jochen; Allert, Niels; Mücke, Doris; Hermes, Anne; Meister, Ingo G; Visser-Vandewalle, Veerle; Grice, Martine; Timmermann, Lars

2017-04-01

Deep brain stimulation (DBS) of the ventral intermediate nucleus (VIM) is performed to suppress medically-resistant essential tremor (ET). However, stimulation induced dysarthria (SID) is a common side effect, limiting the extent to which tremor can be suppressed. To date, the exact pathogenesis of SID in VIM-DBS treated ET patients is unknown. We investigate the effect of inactivated, uni- and bilateral VIM-DBS on speech production in patients with ET. We employ acoustic measures, tempo, and intelligibility ratings and patient's self-estimated speech to quantify SID, with a focus on comparing bilateral to unilateral stimulation effects and the effect of electrode position on speech. Sixteen German ET patients participated in this study. Each patient was acoustically recorded with DBS-off, unilateral-right-hemispheric-DBS-on, unilateral-left-hemispheric-DBS-on, and bilateral-DBS-on during an oral diadochokinesis task and a read German standard text. To capture the extent of speech impairment, we measured syllable duration and intensity ratio during the DDK task. Naïve listeners rated speech tempo and speech intelligibility of the read text on a 5-point-scale. Patients had to rate their "ability to speak". We found an effect of bilateral compared to unilateral and inactivated stimulation on syllable durations and intensity ratio, as well as on external intelligibility ratings and patients' VAS scores. Additionally, VAS scores are associated with more laterally located active contacts. For speech ratings, we found an effect of syllable duration such that tempo and intelligibility was rated worse for speakers exhibiting greater syllable durations. Our data confirms that SID is more pronounced under bilateral compared to unilateral stimulation. Laterally located electrodes are associated with more severe SID according to patient's self-ratings. We can confirm the relation between diadochokinetic rate and SID in that listener's tempo and intelligibility ratings can be predicted by measured syllable durations from DDK tasks. © 2017 International Neuromodulation Society.
Joint Dictionary Learning-Based Non-Negative Matrix Factorization for Voice Conversion to Improve Speech Intelligibility After Oral Surgery.

PubMed

Fu, Szu-Wei; Li, Pei-Chun; Lai, Ying-Hui; Yang, Cheng-Chien; Hsieh, Li-Chun; Tsao, Yu

2017-11-01

Objective: This paper focuses on machine learning based voice conversion (VC) techniques for improving the speech intelligibility of surgical patients who have had parts of their articulators removed. Because of the removal of parts of the articulator, a patient's speech may be distorted and difficult to understand. To overcome this problem, VC methods can be applied to convert the distorted speech such that it is clear and more intelligible. To design an effective VC method, two key points must be considered: 1) the amount of training data may be limited (because speaking for a long time is usually difficult for postoperative patients); 2) rapid conversion is desirable (for better communication). Methods: We propose a novel joint dictionary learning based non-negative matrix factorization (JD-NMF) algorithm. Compared to conventional VC techniques, JD-NMF can perform VC efficiently and effectively with only a small amount of training data. Results: The experimental results demonstrate that the proposed JD-NMF method not only achieves notably higher short-time objective intelligibility (STOI) scores (a standardized objective intelligibility evaluation metric) than those obtained using the original unconverted speech but is also significantly more efficient and effective than a conventional exemplar-based NMF VC method. Conclusion: The proposed JD-NMF method may outperform the state-of-the-art exemplar-based NMF VC method in terms of STOI scores under the desired scenario. Significance: We confirmed the advantages of the proposed joint training criterion for the NMF-based VC. Moreover, we verified that the proposed JD-NMF can effectively improve the speech intelligibility scores of oral surgery patients. Objective: This paper focuses on machine learning based voice conversion (VC) techniques for improving the speech intelligibility of surgical patients who have had parts of their articulators removed. Because of the removal of parts of the articulator, a patient's speech may be distorted and difficult to understand. To overcome this problem, VC methods can be applied to convert the distorted speech such that it is clear and more intelligible. To design an effective VC method, two key points must be considered: 1) the amount of training data may be limited (because speaking for a long time is usually difficult for postoperative patients); 2) rapid conversion is desirable (for better communication). Methods: We propose a novel joint dictionary learning based non-negative matrix factorization (JD-NMF) algorithm. Compared to conventional VC techniques, JD-NMF can perform VC efficiently and effectively with only a small amount of training data. Results: The experimental results demonstrate that the proposed JD-NMF method not only achieves notably higher short-time objective intelligibility (STOI) scores (a standardized objective intelligibility evaluation metric) than those obtained using the original unconverted speech but is also significantly more efficient and effective than a conventional exemplar-based NMF VC method. Conclusion: The proposed JD-NMF method may outperform the state-of-the-art exemplar-based NMF VC method in terms of STOI scores under the desired scenario. Significance: We confirmed the advantages of the proposed joint training criterion for the NMF-based VC. Moreover, we verified that the proposed JD-NMF can effectively improve the speech intelligibility scores of oral surgery patients.
On the relationship between auditory cognition and speech intelligibility in cochlear implant users: An ERP study.

PubMed

Finke, Mareike; Büchner, Andreas; Ruigendijk, Esther; Meyer, Martin; Sandmann, Pascale

2016-07-01

There is a high degree of variability in speech intelligibility outcomes across cochlear-implant (CI) users. To better understand how auditory cognition affects speech intelligibility with the CI, we performed an electroencephalography study in which we examined the relationship between central auditory processing, cognitive abilities, and speech intelligibility. Postlingually deafened CI users (N=13) and matched normal-hearing (NH) listeners (N=13) performed an oddball task with words presented in different background conditions (quiet, stationary noise, modulated noise). Participants had to categorize words as living (targets) or non-living entities (standards). We also assessed participants' working memory (WM) capacity and verbal abilities. For the oddball task, we found lower hit rates and prolonged response times in CI users when compared with NH listeners. Noise-related prolongation of the N1 amplitude was found for all participants. Further, we observed group-specific modulation effects of event-related potentials (ERPs) as a function of background noise. While NH listeners showed stronger noise-related modulation of the N1 latency, CI users revealed enhanced modulation effects of the N2/N4 latency. In general, higher-order processing (N2/N4, P3) was prolonged in CI users in all background conditions when compared with NH listeners. Longer N2/N4 latency in CI users suggests that these individuals have difficulties to map acoustic-phonetic features to lexical representations. These difficulties seem to be increased for speech-in-noise conditions when compared with speech in quiet background. Correlation analyses showed that shorter ERP latencies were related to enhanced speech intelligibility (N1, N2/N4), better lexical fluency (N1), and lower ratings of listening effort (N2/N4) in CI users. In sum, our findings suggest that CI users and NH listeners differ with regards to both the sensory and the higher-order processing of speech in quiet as well as in noisy background conditions. Our results also revealed that verbal abilities are related to speech processing and speech intelligibility in CI users, confirming the view that auditory cognition plays an important role for CI outcome. We conclude that differences in auditory-cognitive processing contribute to the variability in speech performance outcomes observed in CI users. Copyright © 2016 Elsevier Ltd. All rights reserved.
Longitudinal follow-up to evaluate speech disorders in early-treated patients with infantile-onset Pompe disease.

PubMed

Zeng, Yin-Ting; Hwu, Wuh-Liang; Torng, Pao-Chuan; Lee, Ni-Chung; Shieh, Jeng-Yi; Lu, Lu; Chien, Yin-Hsiu

2017-05-01

Patients with infantile-onset Pompe disease (IOPD) can be treated by recombinant human acid alpha glucosidase (rhGAA) replacement beginning at birth with excellent survival rates, but they still commonly present with speech disorders. This study investigated the progress of speech disorders in these early-treated patients and ascertained the relationship with treatments. Speech disorders, including hypernasal resonance, articulation disorders, and speech intelligibility, were scored by speech-language pathologists using auditory perception in seven early-treated patients over a period of 6 years. Statistical analysis of the first and last evaluations of the patients was performed with the Wilcoxon signed-rank test. A total of 29 speech samples were analyzed. All the patients suffered from hypernasality, articulation disorder, and impairment in speech intelligibility at the age of 3 years. The conditions were stable, and 2 patients developed normal or near normal speech during follow-up. Speech therapy and a high dose of rhGAA appeared to improve articulation in 6 of the 7 patients (86%, p = 0.028) by decreasing the omission of consonants, which consequently increased speech intelligibility (p = 0.041). Severity of hypernasality greatly reduced only in 2 patients (29%, p = 0.131). Speech disorders were common even in early and successfully treated patients with IOPD; however, aggressive speech therapy and high-dose rhGAA could improve their speech disorders. Copyright © 2016 European Paediatric Neurology Society. Published by Elsevier Ltd. All rights reserved.
Performance Evaluation of Intelligent Systems at the National Institute of Standards and Technology (NIST)

DTIC Science & Technology

2011-03-01

past few years, including performance evaluation of emergency response robots , sensor systems on unmanned ground vehicles, speech-to-speech translation...emergency response robots ; intelligent systems; mixed palletizing, testing, simulation; robotic vehicle perception systems; search and rescue robots ...ranging from autonomous vehicles to urban search and rescue robots to speech translation and manufacturing systems. The evaluations have occurred in
An algorithm to improve speech recognition in noise for hearing-impaired listeners

PubMed Central

Healy, Eric W.; Yoho, Sarah E.; Wang, Yuxuan; Wang, DeLiang

2013-01-01

Despite considerable effort, monaural (single-microphone) algorithms capable of increasing the intelligibility of speech in noise have remained elusive. Successful development of such an algorithm is especially important for hearing-impaired (HI) listeners, given their particular difficulty in noisy backgrounds. In the current study, an algorithm based on binary masking was developed to separate speech from noise. Unlike the ideal binary mask, which requires prior knowledge of the premixed signals, the masks used to segregate speech from noise in the current study were estimated by training the algorithm on speech not used during testing. Sentences were mixed with speech-shaped noise and with babble at various signal-to-noise ratios (SNRs). Testing using normal-hearing and HI listeners indicated that intelligibility increased following processing in all conditions. These increases were larger for HI listeners, for the modulated background, and for the least-favorable SNRs. They were also often substantial, allowing several HI listeners to improve intelligibility from scores near zero to values above 70%. PMID:24116438
Horizontal localization and speech intelligibility with bilateral and unilateral hearing aid amplification.

PubMed

Köbler, S; Rosenhall, U

2002-10-01

Speech intelligibility and horizontal localization of 19 subjects with mild-to-moderate hearing loss were studied in order to evaluate the advantages and disadvantages of bilateral and unilateral hearing aid (HA) fittings. Eight loudspeakers were arranged in a circular array covering the horizontal plane around the subjects. Speech signals of a sentence test were delivered by one, randomly chosen, loudspeaker. At the same time, the other seven loudspeakers emitted noise with the same long-term average spectrum as the speech signals. The subjects were asked to repeat the speech signal and to point out the corresponding loudspeaker. Speech intelligibility was significantly improved by HAs, bilateral amplification being superior to unilateral. Horizontal localization could not be improved by HA amplification. However, bilateral HAs preserved the subjects' horizontal localization, whereas unilateral amplification decreased their horizontal localization abilities. Front-back confusions were common in the horizontal localization test. The results indicate that bilateral HA amplification has advantages compared with unilateral amplification.
Intelligent interfaces for expert systems

NASA Technical Reports Server (NTRS)

Villarreal, James A.; Wang, Lui

1988-01-01

Vital to the success of an expert system is an interface to the user which performs intelligently. A generic intelligent interface is being developed for expert systems. This intelligent interface was developed around the in-house developed Expert System for the Flight Analysis System (ESFAS). The Flight Analysis System (FAS) is comprised of 84 configuration controlled FORTRAN subroutines that are used in the preflight analysis of the space shuttle. In order to use FAS proficiently, a person must be knowledgeable in the areas of flight mechanics, the procedures involved in deploying a certain payload, and an overall understanding of the FAS. ESFAS, still in its developmental stage, is taking into account much of this knowledge. The generic intelligent interface involves the integration of a speech recognizer and synthesizer, a preparser, and a natural language parser to ESFAS. The speech recognizer being used is capable of recognizing 1000 words of connected speech. The natural language parser is a commercial software package which uses caseframe instantiation in processing the streams of words from the speech recognizer or the keyboard. The systems configuration is described along with capabilities and drawbacks.
Intelligibility of Clear Speech: Effect of Instruction

ERIC Educational Resources Information Center

Lam, Jennifer; Tjaden, Kris

2013-01-01

Purpose: The authors investigated how clear speech instructions influence sentence intelligibility. Method: Twelve speakers produced sentences in habitual, clear, hearing impaired, and overenunciate conditions. Stimuli were amplitude normalized and mixed with multitalker babble for orthographic transcription by 40 listeners. The main analysis…
Speech Intelligibility

NASA Astrophysics Data System (ADS)

Brand, Thomas

Speech intelligibility (SI) is important for different fields of research, engineering and diagnostics in order to quantify very different phenomena like the quality of recordings, communication and playback devices, the reverberation of auditoria, characteristics of hearing impairment, benefit using hearing aids or combinations of these things.
Joint Service Aircrew Mask (JSAM) - Tactical Aircraft (TA) A/P22P-14A Respirator Assembly (V)3: Noise Attenuation and Speech Intelligibility Performance with Double Hearing Protection, HGU-55A/P JHMCS Flight Helmet

DTIC Science & Technology

2017-03-01

in an environment 71-115 dB Sound Pressure Level (SPL) background pink noise. The speech intelligibility tests shall result in a Modified Rhyme... Test (MRT) score as listed below. Speech intelligibility testing shall be measured per ANSI S3.2 for each background pink noise level using a...minimum of ten talkers and of ten listeners. The test shall be conducted wearing the JSAM-TA using appropriate communication amplification. Test must
Auditory “bubbles”: Efficient classification of the spectrotemporal modulations essential for speech intelligibility

PubMed Central

Venezia, Jonathan H.; Hickok, Gregory; Richards, Virginia M.

2016-01-01

Speech intelligibility depends on the integrity of spectrotemporal patterns in the signal. The current study is concerned with the speech modulation power spectrum (MPS), which is a two-dimensional representation of energy at different combinations of temporal and spectral (i.e., spectrotemporal) modulation rates. A psychophysical procedure was developed to identify the regions of the MPS that contribute to successful reception of auditory sentences. The procedure, based on the two-dimensional image classification technique known as “bubbles” (Gosselin and Schyns (2001). Vision Res. 41, 2261–2271), involves filtering (i.e., degrading) the speech signal by removing parts of the MPS at random, and relating filter patterns to observer performance (keywords identified) over a number of trials. The result is a classification image (CImg) or “perceptual map” that emphasizes regions of the MPS essential for speech intelligibility. This procedure was tested using normal-rate and 2×-time-compressed sentences. The results indicated: (a) CImgs could be reliably estimated in individual listeners in relatively few trials, (b) CImgs tracked changes in spectrotemporal modulation energy induced by time compression, though not completely, indicating that “perceptual maps” deviated from physical stimulus energy, and (c) the bubbles method captured variance in intelligibility not reflected in a common modulation-based intelligibility metric (spectrotemporal modulation index or STMI). PMID:27586738
The role of accent imitation in sensorimotor integration during processing of intelligible speech

PubMed Central

Adank, Patti; Rueschemeyer, Shirley-Ann; Bekkering, Harold

2013-01-01

Recent theories on how listeners maintain perceptual invariance despite variation in the speech signal allocate a prominent role to imitation mechanisms. Notably, these simulation accounts propose that motor mechanisms support perception of ambiguous or noisy signals. Indeed, imitation of ambiguous signals, e.g., accented speech, has been found to aid effective speech comprehension. Here, we explored the possibility that imitation in speech benefits perception by increasing activation in speech perception and production areas. Participants rated the intelligibility of sentences spoken in an unfamiliar accent of Dutch in a functional Magnetic Resonance Imaging experiment. Next, participants in one group repeated the sentences in their own accent, while a second group vocally imitated the accent. Finally, both groups rated the intelligibility of accented sentences in a post-test. The neuroimaging results showed an interaction between type of training and pre- and post-test sessions in left Inferior Frontal Gyrus, Supplementary Motor Area, and left Superior Temporal Sulcus. Although alternative explanations such as task engagement and fatigue need to be considered as well, the results suggest that imitation may aid effective speech comprehension by supporting sensorimotor integration. PMID:24109447

Alignment of classification paradigms for communication abilities in children with cerebral palsy

PubMed Central

Hustad, Katherine C.; Oakes, Ashley; McFadd, Emily; Allison, Kristen M.

2015-01-01

Aim We examined three communication ability classification paradigms for children with cerebral palsy (CP): the Communication Function Classification System (CFCS), the Viking Speech Scale (VSS), and the Speech Language Profile Groups (SLPG). Questions addressed inter-judge reliability, whether the VSS and the CFCS captured impairments in speech and language, and whether there were differences in speech intelligibility among levels within each classification paradigm. Method 80 children (42 males) with a range of types and severity levels of CP participated (mean age, 60 months; SD 4.8 months). Two speech-language pathologists classified each child via parent-child interaction samples and previous experience with the children for the CFCS and VSS, and uisng quantitative speech and language assessment data for the SLPG. Intelligibility scores were obtained using standard clinical intelligibility measurement. Results Kappa values were .67 (95% CI [.55, .79]) for the CFCS, .82 (95% CI [.72, .92]), for the VSS, .95 (95% CI [.72, .92]) for the SLPG. Descriptively, reliability within levels of each paradigm varied, with the lowest agreement occurring within the CFCS at levels II (42%), III (40%), and IV (61%). Neither the CFCS nor the VSS were sensitive to language impairments captured by the SLPG. Significant differences in speech intelligibility were found among levels for all classification paradigms. Interpretation Multiple tools are necessary to understand speech, language, and communication profiles in children with CP. Characterization of abilities at all levels of the ICF will advance our understanding of the ways that speech, language, and communication abilities present in children with CP. PMID:26521844
The effect of bone conduction microphone placement on intensity and spectrum of transmitted speech items.

PubMed

Tran, Phuong K; Letowski, Tomasz R; McBride, Maranda E

2013-06-01

Speech signals can be converted into electrical audio signals using either conventional air conduction (AC) microphone or a contact bone conduction (BC) microphone. The goal of this study was to investigate the effects of the location of a BC microphone on the intensity and frequency spectrum of the recorded speech. Twelve locations, 11 on the talker's head and 1 on the collar bone, were investigated. The speech sounds were three vowels (/u/, /a/, /i/) and two consonants (/m/, /∫/). The sounds were produced by 12 talkers. Each sound was recorded simultaneously with two BC microphones and an AC microphone. Analyzed spectral data showed that the BC recordings made at the forehead of the talker were the most similar to the AC recordings, whereas the collar bone recordings were most different. Comparison of the spectral data with speech intelligibility data collected in another study revealed a strong negative relationship between BC speech intelligibility and the degree of deviation of the BC speech spectrum from the AC spectrum. In addition, the head locations that resulted in the highest speech intelligibility were associated with the lowest output signals among all tested locations. Implications of these findings for BC communication are discussed.
Surgical improvement of speech disorder caused by amyotrophic lateral sclerosis.

PubMed

Saigusa, Hideto; Yamaguchi, Satoshi; Nakamura, Tsuyoshi; Komachi, Taro; Kadosono, Osamu; Ito, Hiroyuki; Saigusa, Makoto; Niimi, Seiji

2012-12-01

Amyotrophic lateral sclerosis (ALS) is a progressive debilitating neurological disease. ALS disturbs the quality of life by affecting speech, swallowing and free mobility of the arms without affecting intellectual function. It is therefore of significance to improve intelligibility and quality of speech sounds, especially for ALS patients with slowly progressive courses. Currently, however, there is no effective or established approach to improve speech disorder caused by ALS. We investigated a surgical procedure to improve speech disorder for some patients with neuromuscular diseases with velopharyngeal closure incompetence. In this study, we performed the surgical procedure for two patients suffering from severe speech disorder caused by slowly progressing ALS. The patients suffered from speech disorder with hypernasality and imprecise and weak articulation during a 6-year course (patient 1) and a 3-year course (patient 2) of slowly progressing ALS. We narrowed bilateral lateral palatopharyngeal wall at velopharyngeal port, and performed this surgery under general anesthesia without muscle relaxant for the two patients. Postoperatively, intelligibility and quality of their speech sounds were greatly improved within one month without any speech therapy. The patients were also able to generate longer speech phrases after the surgery. Importantly, there was no serious complication during or after the surgery. In summary, we performed bilateral narrowing of lateral palatopharyngeal wall as a speech surgery for two patients suffering from severe speech disorder associated with ALS. With this technique, improved intelligibility and quality of speech can be maintained for longer duration for the patients with slowly progressing ALS.
Speech enhancement based on neural networks improves speech intelligibility in noise for cochlear implant users.

PubMed

Goehring, Tobias; Bolner, Federico; Monaghan, Jessica J M; van Dijk, Bas; Zarowski, Andrzej; Bleeck, Stefan

2017-02-01

Speech understanding in noisy environments is still one of the major challenges for cochlear implant (CI) users in everyday life. We evaluated a speech enhancement algorithm based on neural networks (NNSE) for improving speech intelligibility in noise for CI users. The algorithm decomposes the noisy speech signal into time-frequency units, extracts a set of auditory-inspired features and feeds them to the neural network to produce an estimation of which frequency channels contain more perceptually important information (higher signal-to-noise ratio, SNR). This estimate is used to attenuate noise-dominated and retain speech-dominated CI channels for electrical stimulation, as in traditional n-of-m CI coding strategies. The proposed algorithm was evaluated by measuring the speech-in-noise performance of 14 CI users using three types of background noise. Two NNSE algorithms were compared: a speaker-dependent algorithm, that was trained on the target speaker used for testing, and a speaker-independent algorithm, that was trained on different speakers. Significant improvements in the intelligibility of speech in stationary and fluctuating noises were found relative to the unprocessed condition for the speaker-dependent algorithm in all noise types and for the speaker-independent algorithm in 2 out of 3 noise types. The NNSE algorithms used noise-specific neural networks that generalized to novel segments of the same noise type and worked over a range of SNRs. The proposed algorithm has the potential to improve the intelligibility of speech in noise for CI users while meeting the requirements of low computational complexity and processing delay for application in CI devices. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments

PubMed Central

Colburn, H. Steven

2016-01-01

Spatially separating speech maskers from target speech often leads to a large intelligibility improvement. Modeling this phenomenon has long been of interest to binaural-hearing researchers for uncovering brain mechanisms and for improving signal-processing algorithms in hearing-assistive devices. Much of the previous binaural modeling work focused on the unmasking enabled by binaural cues at the periphery, and little quantitative modeling has been directed toward the grouping or source-separation benefits of binaural processing. In this article, we propose a binaural model that focuses on grouping, specifically on the selection of time-frequency units that are dominated by signals from the direction of the target. The proposed model uses Equalization-Cancellation (EC) processing with a binary decision rule to estimate a time-frequency binary mask. EC processing is carried out to cancel the target signal and the energy change between the EC input and output is used as a feature that reflects target dominance in each time-frequency unit. The processing in the proposed model requires little computational resources and is straightforward to implement. In combination with the Coherence-based Speech Intelligibility Index, the model is applied to predict the speech intelligibility data measured by Marrone et al. The predicted speech reception threshold matches the pattern of the measured data well, even though the predicted intelligibility improvements relative to the colocated condition are larger than some of the measured data, which may reflect the lack of internal noise in this initial version of the model. PMID:27698261
A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments.

PubMed

Mi, Jing; Colburn, H Steven

2016-10-03

Spatially separating speech maskers from target speech often leads to a large intelligibility improvement. Modeling this phenomenon has long been of interest to binaural-hearing researchers for uncovering brain mechanisms and for improving signal-processing algorithms in hearing-assistive devices. Much of the previous binaural modeling work focused on the unmasking enabled by binaural cues at the periphery, and little quantitative modeling has been directed toward the grouping or source-separation benefits of binaural processing. In this article, we propose a binaural model that focuses on grouping, specifically on the selection of time-frequency units that are dominated by signals from the direction of the target. The proposed model uses Equalization-Cancellation (EC) processing with a binary decision rule to estimate a time-frequency binary mask. EC processing is carried out to cancel the target signal and the energy change between the EC input and output is used as a feature that reflects target dominance in each time-frequency unit. The processing in the proposed model requires little computational resources and is straightforward to implement. In combination with the Coherence-based Speech Intelligibility Index, the model is applied to predict the speech intelligibility data measured by Marrone et al. The predicted speech reception threshold matches the pattern of the measured data well, even though the predicted intelligibility improvements relative to the colocated condition are larger than some of the measured data, which may reflect the lack of internal noise in this initial version of the model. © The Author(s) 2016.
A Cross-Language Study of Acoustic Predictors of Speech Intelligibility in Individuals With Parkinson's Disease

PubMed Central

Choi, Yaelin

2017-01-01

Purpose The present study aimed to compare acoustic models of speech intelligibility in individuals with the same disease (Parkinson's disease [PD]) and presumably similar underlying neuropathologies but with different native languages (American English [AE] and Korean). Method A total of 48 speakers from the 4 speaker groups (AE speakers with PD, Korean speakers with PD, healthy English speakers, and healthy Korean speakers) were asked to read a paragraph in their native languages. Four acoustic variables were analyzed: acoustic vowel space, voice onset time contrast scores, normalized pairwise variability index, and articulation rate. Speech intelligibility scores were obtained from scaled estimates of sentences extracted from the paragraph. Results The findings indicated that the multiple regression models of speech intelligibility were different in Korean and AE, even with the same set of predictor variables and with speakers matched on speech intelligibility across languages. Analysis of the descriptive data for the acoustic variables showed the expected compression of the vowel space in speakers with PD in both languages, lower normalized pairwise variability index scores in Korean compared with AE, and no differences within or across language in articulation rate. Conclusions The results indicate that the basis of an intelligibility deficit in dysarthria is likely to depend on the native language of the speaker and listener. Additional research is required to explore other potential predictor variables, as well as additional language comparisons to pursue cross-linguistic considerations in classification and diagnosis of dysarthria types. PMID:28821018
Impact of tongue reduction on overall speech intelligibility, articulation and oromyofunctional behavior in 4 children with Beckwith-Wiedemann syndrome.

PubMed

Van Lierde, K; Galiwango, G; Hodges, A; Bettens, K; Luyten, A; Vermeersch, H

2012-01-01

The purpose of this study was to determine the impact of partial glossectomy (using the keyhole technique) on speech intelligibility, articulation, resonance and oromyofunctional behavior. A partial glossectomy was performed in 4 children with Beckwith- Wiedemann syndrome between the ages of 0.5 and 3.1 years. An ENT assessment, a phonetic inventory, a phonemic and phonological analysis and a consensus perceptual evaluation of speech intelligibility, resonance and oromyofunctional behavior were performed. It was not possible in this study to separate the effects of the surgery from the typical developmental progress of speech sound mastery. Improved speech intelligibility, a more complete phonetic inventory, an increase in phonological skills, normal resonance and increased motor-oriented oral behavior were found in the postsurgical condition. The presence of phonetic distortions, lip incompetence and interdental tongue position were still present in the postsurgical condition. Speech therapy should be focused on correct phonetic placement and a motor-oriented approach to increase lip competence, and on functional tongue exercises and tongue lifting during the production of alveolars. Detailed analyses in a larger number of subjects with and without Beckwith-Wiedemann syndrome may help further illustrate the long-term impact of partial glossectomy. Copyright © 2011 S. Karger AG, Basel.
Listening with a foreign-accent: The interlanguage speech intelligibility benefit in Mandarin speakers of English

PubMed Central

Xie, Xin; Fowler, Carol A.

2013-01-01

This study examined the intelligibility of native and Mandarin-accented English speech for native English and native Mandarin listeners. In the latter group, it also examined the role of the language environment and English proficiency. Three groups of listeners were tested: native English listeners (NE), Mandarin-speaking Chinese listeners in the US (M-US) and Mandarin listeners in Beijing, China (M-BJ). As a group, M-US and M-BJ listeners were matched on English proficiency and age of acquisition. A nonword transcription task was used. Identification accuracy for word-final stops in the nonwords established two independent interlanguage intelligibility effects. An interlanguage speech intelligibility benefit for listeners (ISIB-L) was manifest by both groups of Mandarin listeners outperforming native English listeners in identification of Mandarin-accented speech. In the benefit for talkers (ISIB-T), only M-BJ listeners were more accurate identifying Mandarin-accented speech than native English speech. Thus, both Mandarin groups demonstrated an ISIB-L while only the M-BJ group overall demonstrated an ISIB-T. The English proficiency of listeners was found to modulate the magnitude of the ISIB-T in both groups. Regression analyses also suggested that the listener groups differ in their use of acoustic information to identify voicing in stop consonants. PMID:24293741
Comparing the information conveyed by envelope modulation for speech intelligibility, speech quality, and music quality.

PubMed

Kates, James M; Arehart, Kathryn H

2015-10-01

This paper uses mutual information to quantify the relationship between envelope modulation fidelity and perceptual responses. Data from several previous experiments that measured speech intelligibility, speech quality, and music quality are evaluated for normal-hearing and hearing-impaired listeners. A model of the auditory periphery is used to generate envelope signals, and envelope modulation fidelity is calculated using the normalized cross-covariance of the degraded signal envelope with that of a reference signal. Two procedures are used to describe the envelope modulation: (1) modulation within each auditory frequency band and (2) spectro-temporal processing that analyzes the modulation of spectral ripple components fit to successive short-time spectra. The results indicate that low modulation rates provide the highest information for intelligibility, while high modulation rates provide the highest information for speech and music quality. The low-to-mid auditory frequencies are most important for intelligibility, while mid frequencies are most important for speech quality and high frequencies are most important for music quality. Differences between the spectral ripple components used for the spectro-temporal analysis were not significant in five of the six experimental conditions evaluated. The results indicate that different modulation-rate and auditory-frequency weights may be appropriate for indices designed to predict different types of perceptual relationships.
Comparing the information conveyed by envelope modulation for speech intelligibility, speech quality, and music quality

PubMed Central

Kates, James M.; Arehart, Kathryn H.

2015-01-01

This paper uses mutual information to quantify the relationship between envelope modulation fidelity and perceptual responses. Data from several previous experiments that measured speech intelligibility, speech quality, and music quality are evaluated for normal-hearing and hearing-impaired listeners. A model of the auditory periphery is used to generate envelope signals, and envelope modulation fidelity is calculated using the normalized cross-covariance of the degraded signal envelope with that of a reference signal. Two procedures are used to describe the envelope modulation: (1) modulation within each auditory frequency band and (2) spectro-temporal processing that analyzes the modulation of spectral ripple components fit to successive short-time spectra. The results indicate that low modulation rates provide the highest information for intelligibility, while high modulation rates provide the highest information for speech and music quality. The low-to-mid auditory frequencies are most important for intelligibility, while mid frequencies are most important for speech quality and high frequencies are most important for music quality. Differences between the spectral ripple components used for the spectro-temporal analysis were not significant in five of the six experimental conditions evaluated. The results indicate that different modulation-rate and auditory-frequency weights may be appropriate for indices designed to predict different types of perceptual relationships. PMID:26520329
Experimental comparison between speech transmission index, rapid speech transmission index, and speech intelligibility index.

PubMed

Larm, Petra; Hongisto, Valtteri

2006-02-01

During the acoustical design of, e.g., auditoria or open-plan offices, it is important to know how speech can be perceived in various parts of the room. Different objective methods have been developed to measure and predict speech intelligibility, and these have been extensively used in various spaces. In this study, two such methods were compared, the speech transmission index (STI) and the speech intelligibility index (SII). Also the simplification of the STI, the room acoustics speech transmission index (RASTI), was considered. These quantities are all based on determining an apparent speech-to-noise ratio on selected frequency bands and summing them using a specific weighting. For comparison, some data were needed on the possible differences of these methods resulting from the calculation scheme and also measuring equipment. Their prediction accuracy was also of interest. Measurements were made in a laboratory having adjustable noise level and absorption, and in a real auditorium. It was found that the measurement equipment, especially the selection of the loudspeaker, can greatly affect the accuracy of the results. The prediction accuracy of the RASTI was found acceptable, if the input values for the prediction are accurately known, even though the studied space was not ideally diffuse.
Glossectomy: a case report.

PubMed

Dworkin, J P

1982-04-01

A 27-year-old man, a law student, underwent partial glossectomy, right hemimandibulectomy and radical neck dissection due to recurrent carcinoma of the oral cavity. These surgical procedures resulted in severe swallowing and speech difficulties for which he was treated by tube feeding and speech therapy, respectively. Emphasis in therapy was placed on compensatory articulator techniques for the improvement of speech intelligibility. Those adaptive tongue stump, labial, and palato-pharyngeal compensations which were employed are discussed. After. 9 months of speech therapy, he was judged to have achieved fair-to good speech intelligibility, and was able to continue law school. At the time of this writing, he was practicing law.
The interaction of acoustic and linguistic grouping cues in auditory object formation

NASA Astrophysics Data System (ADS)

Shapley, Kathy; Carrell, Thomas

2005-09-01

One of the earliest explanations for good speech intelligibility in poor listening situations was context [Miller et al., J. Exp. Psychol. 41 (1951)]. Context presumably allows listeners to group and predict speech appropriately and is known as a top-down listening strategy. Amplitude comodulation is another mechanism that has been shown to improve sentence intelligibility. Amplitude comodulation provides acoustic grouping information without changing the linguistic content of the desired signal [Carrell and Opie, Percept. Psychophys. 52 (1992); Hu and Wang, Proceedings of ICASSP-02 (2002)] and is considered a bottom-up process. The present experiment investigated how amplitude comodulation and semantic information combined to improve speech intelligibility. Sentences with high- and low-predictability word sequences [Boothroyd and Nittrouer, J. Acoust. Soc. Am. 84 (1988)] were constructed in two different formats: time-varying sinusoidal sentences (TVS) and reduced-channel sentences (RC). The stimuli were chosen because they minimally represent the traditionally defined speech cues and therefore emphasized the importance of the high-level context effects and low-level acoustic grouping cues. Results indicated that semantic information did not influence intelligibility levels of TVS and RC sentences. In addition amplitude modulation aided listeners' intelligibility scores in the TVS condition but hindered listeners' intelligibility scores in the RC condition.
Optimizing Classroom Acoustics Using Computer Model Studies.

ERIC Educational Resources Information Center

Reich, Rebecca; Bradley, John

1998-01-01

Investigates conditions relating to the maximum useful-to-detrimental sound ratios present in classrooms and determining the optimum conditions for speech intelligibility. Reveals that speech intelligibility is more strongly influenced by ambient noise levels and that the optimal location for sound absorbing material is on a classroom's upper…
Communication Effectiveness of Individuals with Amyotrophic Lateral Sclerosis

ERIC Educational Resources Information Center

Ball, Laura J.; Beukelman, David R.; Pattee, Gary L.

2004-01-01

The purpose of this study was to examine the relationships among speech intelligibility and communication effectiveness as rated by speakers and their listeners. Participants completed procedures to measure (a) speech intelligibility, (b) self-perceptions of communication effectiveness, and (c) listener (spouse or family member) perceptions of…
Comparative intelligibility investigation of single-channel noise-reduction algorithms for Chinese, Japanese, and English.

PubMed

Li, Junfeng; Yang, Lin; Zhang, Jianping; Yan, Yonghong; Hu, Yi; Akagi, Masato; Loizou, Philipos C

2011-05-01

A large number of single-channel noise-reduction algorithms have been proposed based largely on mathematical principles. Most of these algorithms, however, have been evaluated with English speech. Given the different perceptual cues used by native listeners of different languages including tonal languages, it is of interest to examine whether there are any language effects when the same noise-reduction algorithm is used to process noisy speech in different languages. A comparative evaluation and investigation is taken in this study of various single-channel noise-reduction algorithms applied to noisy speech taken from three languages: Chinese, Japanese, and English. Clean speech signals (Chinese words and Japanese words) were first corrupted by three types of noise at two signal-to-noise ratios and then processed by five single-channel noise-reduction algorithms. The processed signals were finally presented to normal-hearing listeners for recognition. Intelligibility evaluation showed that the majority of noise-reduction algorithms did not improve speech intelligibility. Consistent with a previous study with the English language, the Wiener filtering algorithm produced small, but statistically significant, improvements in intelligibility for car and white noise conditions. Significant differences between the performances of noise-reduction algorithms across the three languages were observed.
Intelligibility of foreign-accented speech: Effects of listening condition, listener age, and listener hearing status

NASA Astrophysics Data System (ADS)

Ferguson, Sarah Hargus

2005-09-01

It is well known that, for listeners with normal hearing, speech produced by non-native speakers of the listener's first language is less intelligible than speech produced by native speakers. Intelligibility is well correlated with listener's ratings of talker comprehensibility and accentedness, which have been shown to be related to several talker factors, including age of second language acquisition and level of similarity between the talker's native and second language phoneme inventories. Relatively few studies have focused on factors extrinsic to the talker. The current project explored the effects of listener and environmental factors on the intelligibility of foreign-accented speech. Specifically, monosyllabic English words previously recorded from two talkers, one a native speaker of American English and the other a native speaker of Spanish, were presented to three groups of listeners (young listeners with normal hearing, elderly listeners with normal hearing, and elderly listeners with hearing impairment; n=20 each) in three different listening conditions (undistorted words in quiet, undistorted words in 12-talker babble, and filtered words in quiet). Data analysis will focus on interactions between talker accent, listener age, listener hearing status, and listening condition. [Project supported by American Speech-Language-Hearing Association AARC Award.
Dysfluencies in the speech of adults with intellectual disabilities and reported speech difficulties.

PubMed

Coppens-Hofman, Marjolein C; Terband, Hayo R; Maassen, Ben A M; van Schrojenstein Lantman-De Valk, Henny M J; van Zaalen-op't Hof, Yvonne; Snik, Ad F M

2013-01-01

In individuals with an intellectual disability, speech dysfluencies are more common than in the general population. In clinical practice, these fluency disorders are generally diagnosed and treated as stuttering rather than cluttering. To characterise the type of dysfluencies in adults with intellectual disabilities and reported speech difficulties with an emphasis on manifestations of stuttering and cluttering, which distinction is to help optimise treatment aimed at improving fluency and intelligibility. The dysfluencies in the spontaneous speech of 28 adults (18-40 years; 16 men) with mild and moderate intellectual disabilities (IQs 40-70), who were characterised as poorly intelligible by their caregivers, were analysed using the speech norms for typically developing adults and children. The speakers were subsequently assigned to different diagnostic categories by relating their resulting dysfluency profiles to mean articulatory rate and articulatory rate variability. Twenty-two (75%) of the participants showed clinically significant dysfluencies, of which 21% were classified as cluttering, 29% as cluttering-stuttering and 25% as clear cluttering at normal articulatory rate. The characteristic pattern of stuttering did not occur. The dysfluencies in the speech of adults with intellectual disabilities and poor intelligibility show patterns that are specific for this population. Together, the results suggest that in this specific group of dysfluent speakers interventions should be aimed at cluttering rather than stuttering. The reader will be able to (1) describe patterns of dysfluencies in the speech of adults with intellectual disabilities that are specific for this group of people, (2) explain that a high rate of dysfluencies in speech is potentially a major determiner of poor intelligibility in adults with ID and (3) describe suggestions for intervention focusing on cluttering rather than stuttering in dysfluent speakers with ID. Copyright © 2013 Elsevier Inc. All rights reserved.
Alignment of classification paradigms for communication abilities in children with cerebral palsy.

PubMed

Hustad, Katherine C; Oakes, Ashley; McFadd, Emily; Allison, Kristen M

2016-06-01

We examined three communication ability classification paradigms for children with cerebral palsy (CP): the Communication Function Classification System (CFCS), the Viking Speech Scale (VSS), and the Speech Language Profile Groups (SLPG). Questions addressed interjudge reliability, whether the VSS and the CFCS captured impairments in speech and language, and whether there were differences in speech intelligibility among levels within each classification paradigm. Eighty children (42 males, 38 females) with a range of types and severity levels of CP participated (mean age 60mo, range 50-72mo [SD 5mo]). Two speech-language pathologists classified each child via parent-child interaction samples and previous experience with the children for the CFCS and VSS, and using quantitative speech and language assessment data for the SLPG. Intelligibility scores were obtained using standard clinical intelligibility measurement. Kappa values were 0.67 (95% confidence interval [CI] 0.55-0.79) for the CFCS, 0.82 (95% CI 0.72-0.92) for the VSS, and 0.95 (95% CI 0.72-0.92) for the SLPG. Descriptively, reliability within levels of each paradigm varied, with the lowest agreement occurring within the CFCS at levels II (42%), III (40%), and IV (61%). Neither the CFCS nor the VSS were sensitive to language impairments captured by the SLPG. Significant differences in speech intelligibility were found among levels for all classification paradigms. Multiple tools are necessary to understand speech, language, and communication profiles in children with CP. Characterization of abilities at all levels of the International Classification of Functioning, Disability and Health will advance our understanding of the ways that speech, language, and communication abilities present in children with CP. © 2015 Mac Keith Press.

Effects of subthalamic stimulation on speech of consecutive patients with Parkinson disease

PubMed Central

Zrinzo, L.; Martinez-Torres, I.; Frost, E.; Pinto, S.; Foltynie, T.; Holl, E.; Petersen, E.; Roughton, M.; Hariz, M.I.; Limousin, P.

2011-01-01

Objective: Subthalamic nucleus deep brain stimulation (STN-DBS) is an effective treatment for advanced Parkinson disease (PD). Following STN-DBS, speech intelligibility can deteriorate, limiting its beneficial effect. Here we prospectively examined the short- and long-term speech response to STN-DBS in a consecutive series of patients to identify clinical and surgical factors associated with speech change. Methods: Thirty-two consecutive patients were assessed before surgery, then 1 month, 6 months, and 1 year after STN-DBS in 4 conditions on- and off-medication with on- and off-stimulation using established and validated speech and movement scales. Fifteen of these patients were followed up for 3 years. A control group of 12 patients with PD were followed up for 1 year. Results: Within the surgical group, speech intelligibility significantly deteriorated by an average of 14.2% ± 20.15% off-medication and 16.9% ± 21.8% on-medication 1 year after STN-DBS. The medical group deteriorated by 3.6% ± 5.5% and 4.5% ± 8.8%, respectively. Seven patients showed speech amelioration after surgery. Loudness increased significantly in all tasks with stimulation. A less severe preoperative on-medication motor score was associated with a more favorable speech response to STN-DBS after 1 year. Medially located electrodes on the left STN were associated with a significantly higher risk of speech deterioration than electrodes within the nucleus. There was a strong relationship between high voltage in the left electrode and poor speech outcome at 1 year. Conclusion: The effect of STN-DBS on speech is variable and multifactorial, with most patients exhibiting decline of speech intelligibility. Both medical and surgical issues contribute to deterioration of speech in STN-DBS patients. Classification of evidence: This study provides Class III evidence that STN-DBS for PD results in deterioration in speech intelligibility in all combinations of medication and stimulation states at 1 month, 6 months, and 1 year compared to baseline and to control subjects treated with best medical therapy. PMID:21068426
Sensory Intelligence for Extraction of an Abstract Auditory Rule: A Cross-Linguistic Study.

PubMed

Guo, Xiao-Tao; Wang, Xiao-Dong; Liang, Xiu-Yuan; Wang, Ming; Chen, Lin

2018-02-21

In a complex linguistic environment, while speech sounds can greatly vary, some shared features are often invariant. These invariant features constitute so-called abstract auditory rules. Our previous study has shown that with auditory sensory intelligence, the human brain can automatically extract the abstract auditory rules in the speech sound stream, presumably serving as the neural basis for speech comprehension. However, whether the sensory intelligence for extraction of abstract auditory rules in speech is inherent or experience-dependent remains unclear. To address this issue, we constructed a complex speech sound stream using auditory materials in Mandarin Chinese, in which syllables had a flat lexical tone but differed in other acoustic features to form an abstract auditory rule. This rule was occasionally and randomly violated by the syllables with the rising, dipping or falling tone. We found that both Chinese and foreign speakers detected the violations of the abstract auditory rule in the speech sound stream at a pre-attentive stage, as revealed by the whole-head recordings of mismatch negativity (MMN) in a passive paradigm. However, MMNs peaked earlier in Chinese speakers than in foreign speakers. Furthermore, Chinese speakers showed different MMN peak latencies for the three deviant types, which paralleled recognition points. These findings indicate that the sensory intelligence for extraction of abstract auditory rules in speech sounds is innate but shaped by language experience. Copyright © 2018 IBRO. Published by Elsevier Ltd. All rights reserved.
Automatic intelligibility classification of sentence-level pathological speech

PubMed Central

Kim, Jangwon; Kumar, Naveen; Tsiartas, Andreas; Li, Ming; Narayanan, Shrikanth S.

2014-01-01

Pathological speech usually refers to the condition of speech distortion resulting from atypicalities in voice and/or in the articulatory mechanisms owing to disease, illness or other physical or biological insult to the production system. Although automatic evaluation of speech intelligibility and quality could come in handy in these scenarios to assist experts in diagnosis and treatment design, the many sources and types of variability often make it a very challenging computational processing problem. In this work we propose novel sentence-level features to capture abnormal variation in the prosodic, voice quality and pronunciation aspects in pathological speech. In addition, we propose a post-classification posterior smoothing scheme which refines the posterior of a test sample based on the posteriors of other test samples. Finally, we perform feature-level fusions and subsystem decision fusion for arriving at a final intelligibility decision. The performances are tested on two pathological speech datasets, the NKI CCRT Speech Corpus (advanced head and neck cancer) and the TORGO database (cerebral palsy or amyotrophic lateral sclerosis), by evaluating classification accuracy without overlapping subjects’ data among training and test partitions. Results show that the feature sets of each of the voice quality subsystem, prosodic subsystem, and pronunciation subsystem, offer significant discriminating power for binary intelligibility classification. We observe that the proposed posterior smoothing in the acoustic space can further reduce classification errors. The smoothed posterior score fusion of subsystems shows the best classification performance (73.5% for unweighted, and 72.8% for weighted, average recalls of the binary classes). PMID:25414544
Children's Attitudes Toward Peers With Unintelligible Speech Associated With Cleft Lip and/or Palate.

PubMed

Lee, Alice; Gibbon, Fiona E; Spivey, Kimberley

2017-05-01

The objective of this study was to investigate whether reduced speech intelligibility in children with cleft palate affects social and personal attribute judgments made by typically developing children of different ages. The study (1) measured the correlation between intelligibility scores of speech samples from children with cleft palate and social and personal attribute judgments made by typically developing children based on these samples and (2) compared the attitude judgments made by children of different ages. Participants A total of 90 typically developing children, 30 in each of three age groups (7 to 8 years, 9 to 10 years, and 11 to 12 years). Speech intelligibility scores and typically developing children's attitudes were measured using eight social and personal attributes on a three-point rating scale. There was a significant correlation between the speech intelligibility scores and attitude judgments for a number of traits: "sick-healthy" as rated by the children aged 7 to 8 years, "no friends-friends" by the children aged 9 to 10 years, and "ugly-good looking" and "no friends-friends" by the children aged 11 to 12 years. Children aged 7 to 8 years gave significantly lower ratings for "mean-kind" but higher ratings for "shy-outgoing" when compared with the other two groups. Typically developing children tended to make negative social and personal attribute judgments about children with cleft palate based solely on the intelligibility of their speech. Society, educators, and health professionals should work together to ensure that children with cleft palate are not stigmatized by their peers.
Memory performance on the Auditory Inference Span Test is independent of background noise type for young adults with normal hearing at high speech intelligibility

PubMed Central

Rönnberg, Niklas; Rudner, Mary; Lunner, Thomas; Stenfelt, Stefan

2014-01-01

Listening in noise is often perceived to be effortful. This is partly because cognitive resources are engaged in separating the target signal from background noise, leaving fewer resources for storage and processing of the content of the message in working memory. The Auditory Inference Span Test (AIST) is designed to assess listening effort by measuring the ability to maintain and process heard information. The aim of this study was to use AIST to investigate the effect of background noise types and signal-to-noise ratio (SNR) on listening effort, as a function of working memory capacity (WMC) and updating ability (UA). The AIST was administered in three types of background noise: steady-state speech-shaped noise, amplitude modulated speech-shaped noise, and unintelligible speech. Three SNRs targeting 90% speech intelligibility or better were used in each of the three noise types, giving nine different conditions. The reading span test assessed WMC, while UA was assessed with the letter memory test. Twenty young adults with normal hearing participated in the study. Results showed that AIST performance was not influenced by noise type at the same intelligibility level, but became worse with worse SNR when background noise was speech-like. Performance on AIST also decreased with increasing memory load level. Correlations between AIST performance and the cognitive measurements suggested that WMC is of more importance for listening when SNRs are worse, while UA is of more importance for listening in easier SNRs. The results indicated that in young adults with normal hearing, the effort involved in listening in noise at high intelligibility levels is independent of the noise type. However, when noise is speech-like and intelligibility decreases, listening effort increases, probably due to extra demands on cognitive resources added by the informational masking created by the speech fragments and vocal sounds in the background noise. PMID:25566159
Memory performance on the Auditory Inference Span Test is independent of background noise type for young adults with normal hearing at high speech intelligibility.

PubMed

Rönnberg, Niklas; Rudner, Mary; Lunner, Thomas; Stenfelt, Stefan

2014-01-01

Listening in noise is often perceived to be effortful. This is partly because cognitive resources are engaged in separating the target signal from background noise, leaving fewer resources for storage and processing of the content of the message in working memory. The Auditory Inference Span Test (AIST) is designed to assess listening effort by measuring the ability to maintain and process heard information. The aim of this study was to use AIST to investigate the effect of background noise types and signal-to-noise ratio (SNR) on listening effort, as a function of working memory capacity (WMC) and updating ability (UA). The AIST was administered in three types of background noise: steady-state speech-shaped noise, amplitude modulated speech-shaped noise, and unintelligible speech. Three SNRs targeting 90% speech intelligibility or better were used in each of the three noise types, giving nine different conditions. The reading span test assessed WMC, while UA was assessed with the letter memory test. Twenty young adults with normal hearing participated in the study. Results showed that AIST performance was not influenced by noise type at the same intelligibility level, but became worse with worse SNR when background noise was speech-like. Performance on AIST also decreased with increasing memory load level. Correlations between AIST performance and the cognitive measurements suggested that WMC is of more importance for listening when SNRs are worse, while UA is of more importance for listening in easier SNRs. The results indicated that in young adults with normal hearing, the effort involved in listening in noise at high intelligibility levels is independent of the noise type. However, when noise is speech-like and intelligibility decreases, listening effort increases, probably due to extra demands on cognitive resources added by the informational masking created by the speech fragments and vocal sounds in the background noise.
Eyes and ears: Using eye tracking and pupillometry to understand challenges to speech recognition.

PubMed

Van Engen, Kristin J; McLaughlin, Drew J

2018-05-04

Although human speech recognition is often experienced as relatively effortless, a number of common challenges can render the task more difficult. Such challenges may originate in talkers (e.g., unfamiliar accents, varying speech styles), the environment (e.g. noise), or in listeners themselves (e.g., hearing loss, aging, different native language backgrounds). Each of these challenges can reduce the intelligibility of spoken language, but even when intelligibility remains high, they can place greater processing demands on listeners. Noisy conditions, for example, can lead to poorer recall for speech, even when it has been correctly understood. Speech intelligibility measures, memory tasks, and subjective reports of listener difficulty all provide critical information about the effects of such challenges on speech recognition. Eye tracking and pupillometry complement these methods by providing objective physiological measures of online cognitive processing during listening. Eye tracking records the moment-to-moment direction of listeners' visual attention, which is closely time-locked to unfolding speech signals, and pupillometry measures the moment-to-moment size of listeners' pupils, which dilate in response to increased cognitive load. In this paper, we review the uses of these two methods for studying challenges to speech recognition. Copyright © 2018. Published by Elsevier B.V.
Recognition of speech in noise after application of time-frequency masks: Dependence on frequency and threshold parameters

PubMed Central

Sinex, Donal G.

2013-01-01

Binary time-frequency (TF) masks can be applied to separate speech from noise. Previous studies have shown that with appropriate parameters, ideal TF masks can extract highly intelligible speech even at very low speech-to-noise ratios (SNRs). Two psychophysical experiments provided additional information about the dependence of intelligibility on the frequency resolution and threshold criteria that define the ideal TF mask. Listeners identified AzBio Sentences in noise, before and after application of TF masks. Masks generated with 8 or 16 frequency bands per octave supported nearly-perfect identification. Word recognition accuracy was slightly lower and more variable with 4 bands per octave. When TF masks were generated with a local threshold criterion of 0 dB SNR, the mean speech reception threshold was −9.5 dB SNR, compared to −5.7 dB for unprocessed sentences in noise. Speech reception thresholds decreased by about 1 dB per dB of additional decrease in the local threshold criterion. Information reported here about the dependence of speech intelligibility on frequency and level parameters has relevance for the development of non-ideal TF masks for clinical applications such as speech processing for hearing aids. PMID:23556604
Age-related changes to spectral voice characteristics affect judgments of prosodic, segmental, and talker attributes for child and adult speech.

PubMed

Dilley, Laura C; Wieland, Elizabeth A; Gamache, Jessica L; McAuley, J Devin; Redford, Melissa A

2013-02-01

As children mature, changes in voice spectral characteristics co-vary with changes in speech, language, and behavior. In this study, spectral characteristics were manipulated to alter the perceived ages of talkers' voices while leaving critical acoustic-prosodic correlates intact, to determine whether perceived age differences were associated with differences in judgments of prosodic, segmental, and talker attributes. Speech was modified by lowering formants and fundamental frequency, for 5-year-old children's utterances, or raising them, for adult caregivers' utterances. Next, participants differing in awareness of the manipulation (Experiment 1A) or amount of speech-language training (Experiment 1B) made judgments of prosodic, segmental, and talker attributes. Experiment 2 investigated the effects of spectral modification on intelligibility. Finally, in Experiment 3, trained analysts used formal prosody coding to assess prosodic characteristics of spectrally modified and unmodified speech. Differences in perceived age were associated with differences in ratings of speech rate, fluency, intelligibility, likeability, anxiety, cognitive impairment, and speech-language disorder/delay; effects of training and awareness of the manipulation on ratings were limited. There were no significant effects of the manipulation on intelligibility or formally coded prosody judgments. Age-related voice characteristics can greatly affect judgments of speech and talker characteristics, raising cautionary notes for developmental research and clinical work.
Listener Perception of Monopitch, Naturalness, and Intelligibility for Speakers with Parkinson's Disease

ERIC Educational Resources Information Center

Anand, Supraja; Stepp, Cara E.

2015-01-01

Purpose: Given the potential significance of speech naturalness to functional and social rehabilitation outcomes, the objective of this study was to examine the effect of listener perceptions of monopitch on speech naturalness and intelligibility in individuals with Parkinson's disease (PD). Method: Two short utterances were extracted from…
Speech Intelligibility and Marital Communication in Amyotrophic Lateral Sclerosis: An Exploratory Study

ERIC Educational Resources Information Center

Joubert, Karin; Bornman, Juan; Alant, Erna

2011-01-01

Amyotrophic lateral sclerosis (ALS), a rapidly progressive neuromuscular disease, has a devastating impact not only on individuals diagnosed with ALS but also their spouses. Speech intelligibility, often compromised as a result of dysarthria, affects the couple's ability to maintain effective, intimate communication. The purpose of this…
Perceptual analysis of speech following traumatic brain injury in childhood.

PubMed

Cahill, Louise M; Murdoch, Bruce E; Theodoros, Deborah G

2002-05-01

To investigate perceptually the speech dimensions, oromotor function, and speech intelligibility of a group of individuals with traumatic brain injury (TBI) acquired in childhood. The speech of 24 children with TBI was analysed perceptually and compared with that of a group of non-neurologically impaired children matched for age and sex. The 16 dysarthric TBI subjects were significantly less intelligible than the control subjects, and demonstrated significant impairment in 12 of the 33 speech dimensions rated. In addition, the eight non-dysarthric TBI subjects were significantly impaired in many areas of oromotor function on the Frenchay Dysarthria Assessment, indicating some degree of pre-clinical speech impairment. The results of the perceptual analysis are discussed in terms of the possible underlying pathophysiological bases of the deviant speech features identified, and the need for a comprehensive instrumental assessment, to more accurately determine the level of breakdown in the speech production mechanism in children following TBI.
Alternative Speech Communication System for Persons with Severe Speech Disorders

NASA Astrophysics Data System (ADS)

Selouani, Sid-Ahmed; Sidi Yakoub, Mohammed; O'Shaughnessy, Douglas

2009-12-01

Assistive speech-enabled systems are proposed to help both French and English speaking persons with various speech disorders. The proposed assistive systems use automatic speech recognition (ASR) and speech synthesis in order to enhance the quality of communication. These systems aim at improving the intelligibility of pathologic speech making it as natural as possible and close to the original voice of the speaker. The resynthesized utterances use new basic units, a new concatenating algorithm and a grafting technique to correct the poorly pronounced phonemes. The ASR responses are uttered by the new speech synthesis system in order to convey an intelligible message to listeners. Experiments involving four American speakers with severe dysarthria and two Acadian French speakers with sound substitution disorders (SSDs) are carried out to demonstrate the efficiency of the proposed methods. An improvement of the Perceptual Evaluation of the Speech Quality (PESQ) value of 5% and more than 20% is achieved by the speech synthesis systems that deal with SSD and dysarthria, respectively.
A survey of acoustic conditions in semi-open plan classrooms in the United Kingdom.

PubMed

Greenland, Emma E; Shield, Bridget M

2011-09-01

This paper reports the results of a large scale, detailed acoustic survey of 42 open plan classrooms of varying design in the UK each of which contained between 2 and 14 teaching areas or classbases. The objective survey procedure, which was designed specifically for use in open plan classrooms, is described. The acoustic measurements relating to speech intelligibility within a classbase, including ambient noise level, intrusive noise level, speech to noise ratio, speech transmission index, and reverberation time, are presented. The effects on speech intelligibility of critical physical design variables, such as the number of classbases within an open plan unit and the selection of acoustic finishes for control of reverberation, are examined. This analysis enables limitations of open plan classrooms to be discussed and acoustic design guidelines to be developed to ensure good listening conditions. The types of teaching activity to provide adequate acoustic conditions, plus the speech intelligibility requirements of younger children, are also discussed. © 2011 Acoustical Society of America
Speech intelligibility and speech quality of modified loudspeaker announcements examined in a simulated aircraft cabin.

PubMed

Pennig, Sibylle; Quehl, Julia; Wittkowski, Martin

2014-01-01

Acoustic modifications of loudspeaker announcements were investigated in a simulated aircraft cabin to improve passengers' speech intelligibility and quality of communication in this specific setting. Four experiments with 278 participants in total were conducted in an acoustic laboratory using a standardised speech test and subjective rating scales. In experiments 1 and 2 the sound pressure level (SPL) of the announcements was varied (ranging from 70 to 85 dB(A)). Experiments 3 and 4 focused on frequency modification (octave bands) of the announcements. All studies used a background noise with the same SPL (74 dB(A)), but recorded at different seat positions in the aircraft cabin (front, rear). The results quantify speech intelligibility improvements with increasing signal-to-noise ratio and amplification of particular octave bands, especially the 2 kHz and the 4 kHz band. Thus, loudspeaker power in an aircraft cabin can be reduced by using appropriate filter settings in the loudspeaker system.
The effect of audiovisual and binaural listening on the acceptable noise level (ANL): establishing an ANL conceptual model.

PubMed

Wu, Yu-Hsiang; Stangl, Elizabeth; Pang, Carol; Zhang, Xuyang

2014-02-01

Little is known regarding the acoustic features of a stimulus used by listeners to determine the acceptable noise level (ANL). Features suggested by previous research include speech intelligibility (noise is unacceptable when it degrades speech intelligibility to a certain degree; the intelligibility hypothesis) and loudness (noise is unacceptable when the speech-to-noise loudness ratio is poorer than a certain level; the loudness hypothesis). The purpose of the study was to investigate if speech intelligibility or loudness is the criterion feature that determines ANL. To achieve this, test conditions were chosen so that the intelligibility and loudness hypotheses would predict different results. In Experiment 1, the effect of audiovisual (AV) and binaural listening on ANL was investigated; in Experiment 2, the effect of interaural correlation (ρ) on ANL was examined. A single-blinded, repeated-measures design was used. Thirty-two and twenty-five younger adults with normal hearing participated in Experiments 1 and 2, respectively. In Experiment 1, both ANL and speech recognition performance were measured using the AV version of the Connected Speech Test (CST) in three conditions: AV-binaural, auditory only (AO)-binaural, and AO-monaural. Lipreading skill was assessed using the Utley lipreading test. In Experiment 2, ANL and speech recognition performance were measured using the Hearing in Noise Test (HINT) in three binaural conditions, wherein the interaural correlation of noise was varied: ρ = 1 (N(o)S(o) [a listening condition wherein both speech and noise signals are identical across two ears]), -1 (NπS(o) [a listening condition wherein speech signals are identical across two ears whereas the noise signals of two ears are 180 degrees out of phase]), and 0 (N(u)S(o) [a listening condition wherein speech signals are identical across two ears whereas noise signals are uncorrelated across ears]). The results were compared to the predictions made based on the intelligibility and loudness hypotheses. The results of the AV and AO conditions appeared to support the intelligibility hypothesis due to the significant correlation between visual benefit in ANL (AV re: AO ANL) and (1) visual benefit in CST performance (AV re: AO CST) and (2) lipreading skill. The results of the N(o)S(o), NπS(o), and N(u)S(o) conditions negated the intelligibility hypothesis because binaural processing benefit (NπS(o) re: N(o)S(o), and N(u)S(o) re: N(o)S(o)) in ANL was not correlated to that in HINT performance. Instead, the results somewhat supported the loudness hypothesis because the pattern of ANL results across the three conditions (N(o)S(o) ≈ NπS(o) ≈ N(u)S(o) ANL) was more consistent with what was predicted by the loudness hypothesis (N(o)S(o) ≈ NπS(o) < N(u)S(o) ANL) than by the intelligibility hypothesis (NπS(o) < N(u)S(o) < N(o)S(o) ANL). The results of the binaural and monaural conditions supported neither hypothesis because (1) binaural benefit (binaural re: monaural) in ANL was not correlated to that in speech recognition performance, and (2) the pattern of ANL results across conditions (binaural < monaural ANL) was not consistent with the prediction made based on previous binaural loudness summation research (binaural ≥ monaural ANL). The study suggests that listeners may use multiple acoustic features to make ANL judgments. The binaural/monaural results showing that neither hypothesis was supported further indicate that factors other than speech intelligibility and loudness, such as psychological factors, may affect ANL. The weightings of different acoustic features in ANL judgments may vary widely across individuals and listening conditions. American Academy of Audiology.
Frontal top-down signals increase coupling of auditory low-frequency oscillations to continuous speech in human listeners.

PubMed

Park, Hyojin; Ince, Robin A A; Schyns, Philippe G; Thut, Gregor; Gross, Joachim

2015-06-15

Humans show a remarkable ability to understand continuous speech even under adverse listening conditions. This ability critically relies on dynamically updated predictions of incoming sensory information, but exactly how top-down predictions improve speech processing is still unclear. Brain oscillations are a likely mechanism for these top-down predictions [1, 2]. Quasi-rhythmic components in speech are known to entrain low-frequency oscillations in auditory areas [3, 4], and this entrainment increases with intelligibility [5]. We hypothesize that top-down signals from frontal brain areas causally modulate the phase of brain oscillations in auditory cortex. We use magnetoencephalography (MEG) to monitor brain oscillations in 22 participants during continuous speech perception. We characterize prominent spectral components of speech-brain coupling in auditory cortex and use causal connectivity analysis (transfer entropy) to identify the top-down signals driving this coupling more strongly during intelligible speech than during unintelligible speech. We report three main findings. First, frontal and motor cortices significantly modulate the phase of speech-coupled low-frequency oscillations in auditory cortex, and this effect depends on intelligibility of speech. Second, top-down signals are significantly stronger for left auditory cortex than for right auditory cortex. Third, speech-auditory cortex coupling is enhanced as a function of stronger top-down signals. Together, our results suggest that low-frequency brain oscillations play a role in implementing predictive top-down control during continuous speech perception and that top-down control is largely directed at left auditory cortex. This suggests a close relationship between (left-lateralized) speech production areas and the implementation of top-down control in continuous speech perception. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.
Frontal Top-Down Signals Increase Coupling of Auditory Low-Frequency Oscillations to Continuous Speech in Human Listeners

PubMed Central

Park, Hyojin; Ince, Robin A.A.; Schyns, Philippe G.; Thut, Gregor; Gross, Joachim

2015-01-01

Summary Humans show a remarkable ability to understand continuous speech even under adverse listening conditions. This ability critically relies on dynamically updated predictions of incoming sensory information, but exactly how top-down predictions improve speech processing is still unclear. Brain oscillations are a likely mechanism for these top-down predictions [1, 2]. Quasi-rhythmic components in speech are known to entrain low-frequency oscillations in auditory areas [3, 4], and this entrainment increases with intelligibility [5]. We hypothesize that top-down signals from frontal brain areas causally modulate the phase of brain oscillations in auditory cortex. We use magnetoencephalography (MEG) to monitor brain oscillations in 22 participants during continuous speech perception. We characterize prominent spectral components of speech-brain coupling in auditory cortex and use causal connectivity analysis (transfer entropy) to identify the top-down signals driving this coupling more strongly during intelligible speech than during unintelligible speech. We report three main findings. First, frontal and motor cortices significantly modulate the phase of speech-coupled low-frequency oscillations in auditory cortex, and this effect depends on intelligibility of speech. Second, top-down signals are significantly stronger for left auditory cortex than for right auditory cortex. Third, speech-auditory cortex coupling is enhanced as a function of stronger top-down signals. Together, our results suggest that low-frequency brain oscillations play a role in implementing predictive top-down control during continuous speech perception and that top-down control is largely directed at left auditory cortex. This suggests a close relationship between (left-lateralized) speech production areas and the implementation of top-down control in continuous speech perception. PMID:26028433
Listeners Experience Linguistic Masking Release in Noise-Vocoded Speech-in-Speech Recognition

ERIC Educational Resources Information Center

Viswanathan, Navin; Kokkinakis, Kostas; Williams, Brittany T.

2018-01-01

Purpose: The purpose of this study was to evaluate whether listeners with normal hearing perceiving noise-vocoded speech-in-speech demonstrate better intelligibility of target speech when the background speech was mismatched in language (linguistic release from masking [LRM]) and/or location (spatial release from masking [SRM]) relative to the…
Acoustic changes in the speech of children with cerebral palsy following an intensive program of dysarthria therapy.

PubMed

Pennington, Lindsay; Lombardo, Eftychia; Steen, Nick; Miller, Nick

2018-01-01

The speech intelligibility of children with dysarthria and cerebral palsy has been observed to increase following therapy focusing on respiration and phonation. To determine if speech intelligibility change following intervention is associated with change in acoustic measures of voice. We recorded 16 young people with cerebral palsy and dysarthria (nine girls; mean age 14 years, SD = 2; nine spastic type, two dyskinetic, four mixed; one Worster-Drought) producing speech in two conditions (single words, connected speech) twice before and twice after therapy focusing on respiration, phonation and rate. In both single-word and connected speech we measured vocal intensity (root mean square-RMS), period-to-period variability (Shimmer APQ, Jitter RAP and PPQ) and harmonics-to-noise ratio (HNR). In connected speech we also measured mean fundamental frequency, utterance duration in seconds and speech and articulation rate (syllables/s with and without pauses respectively). All acoustic measures were made using Praat. Intelligibility was calculated in previous research. In single words statistically significant but very small reductions were observed in period-to-period variability following therapy: Shimmer APQ -0.15 (95% CI = -0.21 to -0.09); Jitter RAP -0.08 (95% CI = -0.14 to -0.01); Jitter PPQ -0.08 (95% CI = -0.15 to -0.01). No changes in period-to-period perturbation across phrases in connected speech were detected. However, changes in connected speech were observed in phrase length, rate and intensity. Following therapy, mean utterance duration increased by 1.11 s (95% CI = 0.37-1.86) when measured with pauses and by 1.13 s (95% CI = 0.40-1.85) when measured without pauses. Articulation rate increased by 0.07 syllables/s (95% CI = 0.02-0.13); speech rate increased by 0.06 syllables/s (95% CI = < 0.01-0.12); and intensity increased by 0.03 Pascals (95% CI = 0.02-0.04). There was a gradual reduction in mean fundamental frequency across all time points (-11.85 Hz, 95% CI = -19.84 to -3.86). Only increases in the intensity of single words (0.37 Pascals, 95% CI = 0.10-0.65) and reductions in fundamental frequency (-0.11 Hz, 95% CI = -0.21 to -0.02) in connected speech were associated with gains in intelligibility. Mean reductions in impairment in vocal function following therapy observed were small and most are unlikely to be clinically significant. Changes in vocal control did not explain improved intelligibility. © 2017 Royal College of Speech and Language Therapists.

Direct magnitude estimates of speech intelligibility in dysarthria: effects of a chosen standard.

PubMed

Weismer, Gary; Laures, Jacqueline S

2002-06-01

Direct magnitude estimation (DME) has been used frequently as a perceptual scaling technique in studies of the speech intelligibility of persons with speech disorders. The technique is typically used with a standard, or reference stimulus, chosen as a good exemplar of "midrange" intelligibility. In several published studies, the standard has been chosen subjectively, usually on the basis of the expertise of the investigators. The current experiment demonstrates that a fixed set of sentence-level utterances, obtained from 4 individuals with dysarthria (2 with Parkinson disease, 2 with traumatic brain injury) as well as 3 neurologically normal speakers, is scaled differently depending on the identity of the standard. Four different standards were used in the main experiment, three of which were judged qualitatively in two independent evaluations to be good exemplars of midrange intelligibility. Acoustic analyses did not reveal obvious differences between these four standards but suggested that the standard with the worst-scaled intelligibility had much poorer voice source characteristics compared to the other three standards. Results are discussed in terms of possible standardization of midrange intelligibility exemplars for DME experiments.
The impact of reverberant self-masking and overlap-masking effects on speech intelligibility by cochlear implant listeners (L).

PubMed

Kokkinakis, Kostas; Loizou, Philipos C

2011-09-01

The purpose of this study is to determine the relative impact of reverberant self-masking and overlap-masking effects on speech intelligibility by cochlear implant listeners. Sentences were presented in two conditions wherein reverberant consonant segments were replaced with clean consonants, and in another condition wherein reverberant vowel segments were replaced with clean vowels. The underlying assumption is that self-masking effects would dominate in the first condition, whereas overlap-masking effects would dominate in the second condition. Results indicated that the degradation of speech intelligibility in reverberant conditions is caused primarily by self-masking effects that give rise to flattened formant transitions. © 2011 Acoustical Society of America
Integrating the acoustics of running speech into the pure tone audiogram: a step from audibility to intelligibility and disability.

PubMed

Corthals, Paul

2008-01-01

The aim of the present study is to construct a simple method for visualizing and quantifying the audibility of speech on the audiogram and to predict speech intelligibility. The proposed method involves a series of indices on the audiogram form reflecting the sound pressure level distribution of running speech. The indices that coincide with a patient's pure tone thresholds reflect speech audibility and give evidence of residual functional hearing capacity. Two validation studies were conducted among sensorineurally hearing-impaired participants (n = 56 and n = 37, respectively) to investigate the relation with speech recognition ability and hearing disability. The potential of the new audibility indices as predictors for speech reception thresholds is comparable to the predictive potential of the ANSI 1968 articulation index and the ANSI 1997 speech intelligibility index. The sum of indices or a weighted combination can explain considerable proportions of variance in speech reception results for sentences in quiet free field conditions. The proportions of variance that can be explained in questionnaire results on hearing disability are less, presumably because the threshold indices almost exclusively reflect message audibility and much less the psychosocial consequences of hearing deficits. The outcomes underpin the validity of the new audibility indexing system, even though the proposed method may be better suited for predicting relative performance across a set of conditions than for predicting absolute speech recognition performance. (c) 2007 S. Karger AG, Basel
Cued Speech Transliteration: Effects of Accuracy and Lag Time on Message Intelligibility

ERIC Educational Resources Information Center

Krause, Jean C.; Lopez, Katherine A.

2017-01-01

This paper is the second in a series concerned with the level of access afforded to students who use educational interpreters. The first paper (Krause & Tessler, 2016) focused on factors affecting accuracy of messages produced by Cued Speech (CS) transliterators (expression). In this study, factors affecting intelligibility (reception by deaf…
Measuring the Effects of Reverberation and Noise on Sentence Intelligibility for Hearing-Impaired Listeners

ERIC Educational Resources Information Center

George, Erwin L. J.; Goverts, S. Theo; Festen, Joost M.; Houtgast, Tammo

2010-01-01

Purpose: The Speech Transmission Index (STI; Houtgast, Steeneken, & Plomp, 1980; Steeneken & Houtgast, 1980) is commonly used to quantify the adverse effects of reverberation and stationary noise on speech intelligibility for normal-hearing listeners. Duquesnoy and Plomp (1980) showed that the STI can be applied for presbycusic listeners, relating…
The Listener: No Longer the Silent Partner in Reduced Intelligibility

ERIC Educational Resources Information Center

Zielinski, Beth W.

2008-01-01

In this study I investigate the impact of different characteristics of the L2 speech signal on the intelligibility of L2 speakers of English to native listeners. Three native listeners were observed and questioned as they orthographically transcribed utterances taken from connected conversational speech produced by three L2 speakers from different…
Speech characteristics in a Ugandan child with a rare paramedian craniofacial cleft: a case report.

PubMed

Van Lierde, K M; Bettens, K; Luyten, A; De Ley, S; Tungotyo, M; Balumukad, D; Galiwango, G; Bauters, W; Vermeersch, H; Hodges, A

2013-03-01

The purpose of this study is to describe the speech characteristics in an English-speaking Ugandan boy of 4.5 years who has a rare paramedian craniofacial cleft (unilateral lip, alveolar, palatal, nasal and maxillary cleft, and associated hypertelorism). Closure of the lip together with the closure of the hard and soft palate (one-stage palatal closure) was performed at the age of 5 months. Objective as well as subjective speech assessment techniques were used. The speech samples were perceptually judged for articulation, intelligibility and nasality. The Nasometer was used for the objective measurement of the nasalance values. The most striking communication problems in this child with the rare craniofacial cleft are an incomplete phonetic inventory, a severely impaired speech intelligibility with the presence of very severe hypernasality, mild nasal emission, phonetic disorders (omission of several consonants, decreased intraoral pressure in explosives, insufficient frication of fricatives and the use of a middorsum palatal stop) and phonological disorders (deletion of initial and final consonants and consonant clusters). The increased objective nasalance values are in agreement with the presence of the audible nasality disorders. The results revealed that several phonetic and phonological articulation disorders together with a decreased speech intelligibility and resonance disorders are present in the child with a rare craniofacial cleft. To what extent a secondary surgery for velopharyngeal insufficiency, combined with speech therapy, will improve speech intelligibility, articulation and resonance characteristics is a subject for further research. The results of such analyses may ultimately serve as a starting point for specific surgical and logopedic treatment that addresses the specific needs of children with rare facial clefts. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Predictions of Speech Chimaera Intelligibility Using Auditory Nerve Mean-Rate and Spike-Timing Neural Cues.

PubMed

Wirtzfeld, Michael R; Ibrahim, Rasha A; Bruce, Ian C

2017-10-01

Perceptual studies of speech intelligibility have shown that slow variations of acoustic envelope (ENV) in a small set of frequency bands provides adequate information for good perceptual performance in quiet, whereas acoustic temporal fine-structure (TFS) cues play a supporting role in background noise. However, the implications for neural coding are prone to misinterpretation because the mean-rate neural representation can contain recovered ENV cues from cochlear filtering of TFS. We investigated ENV recovery and spike-time TFS coding using objective measures of simulated mean-rate and spike-timing neural representations of chimaeric speech, in which either the ENV or the TFS is replaced by another signal. We (a) evaluated the levels of mean-rate and spike-timing neural information for two categories of chimaeric speech, one retaining ENV cues and the other TFS; (b) examined the level of recovered ENV from cochlear filtering of TFS speech; (c) examined and quantified the contribution to recovered ENV from spike-timing cues using a lateral inhibition network (LIN); and (d) constructed linear regression models with objective measures of mean-rate and spike-timing neural cues and subjective phoneme perception scores from normal-hearing listeners. The mean-rate neural cues from the original ENV and recovered ENV partially accounted for perceptual score variability, with additional variability explained by the recovered ENV from the LIN-processed TFS speech. The best model predictions of chimaeric speech intelligibility were found when both the mean-rate and spike-timing neural cues were included, providing further evidence that spike-time coding of TFS cues is important for intelligibility when the speech envelope is degraded.
Long-Term Follow-Up Study of Young Adults Treated for Unilateral Complete Cleft Lip, Alveolus, and Palate by a Treatment Protocol Including Two-Stage Palatoplasty: Speech Outcomes

PubMed Central

Bittermann, Dirk; Janssen, Laura; Bittermann, Gerhard Koendert Pieter; Boonacker, Chantal; Haverkamp, Sarah; de Wilde, Hester; Van Der Heul, Marise; Specken, Tom FJMC; Koole, Ron; Kon, Moshe; Breugem, Corstiaan Cornelis; Mink van der Molen, Aebele Barber

2017-01-01

Background No consensus exists on the optimal treatment protocol for orofacial clefts or the optimal timing of cleft palate closure. This study investigated factors influencing speech outcomes after two-stage palate repair in adults with a non-syndromal complete unilateral cleft lip and palate (UCLP). Methods This was a retrospective analysis of adult patients with a UCLP who underwent two-stage palate closure and were treated at our tertiary cleft centre. Patients ≥17 years of age were invited for a final speech assessment. Their medical history was obtained from their medical files, and speech outcomes were assessed by a speech pathologist during the follow-up consultation. Results Forty-eight patients were included in the analysis, with a mean age of 21 years (standard deviation, 3.4 years). Their mean age at the time of hard and soft palate closure was 3 years and 8.0 months, respectively. In 40% of the patients, a pharyngoplasty was performed. On a 5-point intelligibility scale, 84.4% received a score of 1 or 2; meaning that their speech was intelligible. We observed a significant correlation between intelligibility scores and the incidence of articulation errors (P<0.001). In total, 36% showed mild to moderate hypernasality during the speech assessment, and 11%–17% of the patients exhibited increased nasalance scores, assessed through nasometry. Conclusions The present study describes long-term speech outcomes after two-stage palatoplasty with hard palate closure at a mean age of 3 years old. We observed moderate long-term intelligibility scores, a relatively high incidence of persistent hypernasality, and a high pharyngoplasty incidence. PMID:28573094
Multi-time resolution analysis of speech: evidence from psychophysics

PubMed Central

Chait, Maria; Greenberg, Steven; Arai, Takayuki; Simon, Jonathan Z.; Poeppel, David

2015-01-01

How speech signals are analyzed and represented remains a foundational challenge both for cognitive science and neuroscience. A growing body of research, employing various behavioral and neurobiological experimental techniques, now points to the perceptual relevance of both phoneme-sized (10–40 Hz modulation frequency) and syllable-sized (2–10 Hz modulation frequency) units in speech processing. However, it is not clear how information associated with such different time scales interacts in a manner relevant for speech perception. We report behavioral experiments on speech intelligibility employing a stimulus that allows us to investigate how distinct temporal modulations in speech are treated separately and whether they are combined. We created sentences in which the slow (~4 Hz; Slow) and rapid (~33 Hz; Shigh) modulations—corresponding to ~250 and ~30 ms, the average duration of syllables and certain phonetic properties, respectively—were selectively extracted. Although Slow and Shigh have low intelligibility when presented separately, dichotic presentation of Shigh with Slow results in supra-additive performance, suggesting a synergistic relationship between low- and high-modulation frequencies. A second experiment desynchronized presentation of the Slow and Shigh signals. Desynchronizing signals relative to one another had no impact on intelligibility when delays were less than ~45 ms. Longer delays resulted in a steep intelligibility decline, providing further evidence of integration or binding of information within restricted temporal windows. Our data suggest that human speech perception uses multi-time resolution processing. Signals are concurrently analyzed on at least two separate time scales, the intermediate representations of these analyses are integrated, and the resulting bound percept has significant consequences for speech intelligibility—a view compatible with recent insights from neuroscience implicating multi-timescale auditory processing. PMID:26136650
Formant-Frequency Variation and Informational Masking of Speech by Extraneous Formants: Evidence Against Dynamic and Speech-Specific Acoustical Constraints

PubMed Central

2014-01-01

How speech is separated perceptually from other speech remains poorly understood. Recent research indicates that the ability of an extraneous formant to impair intelligibility depends on the variation of its frequency contour. This study explored the effects of manipulating the depth and pattern of that variation. Three formants (F1+F2+F3) constituting synthetic analogues of natural sentences were distributed across the 2 ears, together with a competitor for F2 (F2C) that listeners must reject to optimize recognition (left = F1+F2C; right = F2+F3). The frequency contours of F1 − F3 were each scaled to 50% of their natural depth, with little effect on intelligibility. Competitors were created either by inverting the frequency contour of F2 about its geometric mean (a plausibly speech-like pattern) or using a regular and arbitrary frequency contour (triangle wave, not plausibly speech-like) matched to the average rate and depth of variation for the inverted F2C. Adding a competitor typically reduced intelligibility; this reduction depended on the depth of F2C variation, being greatest for 100%-depth, intermediate for 50%-depth, and least for 0%-depth (constant) F2Cs. This suggests that competitor impact depends on overall depth of frequency variation, not depth relative to that for the target formants. The absence of tuning (i.e., no minimum in intelligibility for the 50% case) suggests that the ability to reject an extraneous formant does not depend on similarity in the depth of formant-frequency variation. Furthermore, triangle-wave competitors were as effective as their more speech-like counterparts, suggesting that the selection of formants from the ensemble also does not depend on speech-specific constraints. PMID:24842068
The eye as a window to the listening brain: neural correlates of pupil size as a measure of cognitive listening load.

PubMed

Zekveld, Adriana A; Heslenfeld, Dirk J; Johnsrude, Ingrid S; Versfeld, Niek J; Kramer, Sophia E

2014-11-01

An important aspect of hearing is the degree to which listeners have to deploy effort to understand speech. One promising measure of listening effort is task-evoked pupil dilation. Here, we use functional magnetic resonance imaging (fMRI) to identify the neural correlates of pupil dilation during comprehension of degraded spoken sentences in 17 normal-hearing listeners. Subjects listened to sentences degraded in three different ways: the target female speech was masked by fluctuating noise, by speech from a single male speaker, or the target speech was noise-vocoded. The degree of degradation was individually adapted such that 50% or 84% of the sentences were intelligible. Control conditions included clear speech in quiet, and silent trials. The peak pupil dilation was larger for the 50% compared to the 84% intelligibility condition, and largest for speech masked by the single-talker masker, followed by speech masked by fluctuating noise, and smallest for noise-vocoded speech. Activation in the bilateral superior temporal gyrus (STG) showed the same pattern, with most extensive activation for speech masked by the single-talker masker. Larger peak pupil dilation was associated with more activation in the bilateral STG, bilateral ventral and dorsal anterior cingulate cortex and several frontal brain areas. A subset of the temporal region sensitive to pupil dilation was also sensitive to speech intelligibility and degradation type. These results show that pupil dilation during speech perception in challenging conditions reflects both auditory and cognitive processes that are recruited to cope with degraded speech and the need to segregate target speech from interfering sounds. Copyright © 2014 Elsevier Inc. All rights reserved.
A clinical assessment of cochlear implant recipient performance: implications for individualized map settings in specific environments.

PubMed

Hey, Matthias; Hocke, Thomas; Mauger, Stefan; Müller-Deile, Joachim

2016-11-01

Individual speech intelligibility was measured in quiet and noise for cochlear Implant recipients upgrading from the Freedom to the CP900 series sound processor. The postlingually deafened participants (n = 23) used either Nucleus CI24RE or CI512 cochlear implant, and currently wore a Freedom sound processor. A significant group mean improvement in speech intelligibility was found in quiet (Freiburg monosyllabic words at 50 dB SPL ) and in noise (adaptive Oldenburger sentences in noise) for the two CP900 series SmartSound programs compared to the Freedom program. Further analysis was carried out on individual's speech intelligibility outcomes in quiet and in noise. Results showed a significant improvement or decrement for some recipients when upgrading to the new programs. To further increase speech intelligibility outcomes when upgrading, an enhanced upgrade procedure is proposed that includes additional testing with different signal-processing schemes. Implications of this research are that future automated scene analysis and switching technologies could provide additional performance improvements by introducing individualized scene-dependent settings.
Evolution of the speech intelligibility of prelinguistically deaf children who received a cochlear implant

NASA Astrophysics Data System (ADS)

Bouchard, Marie-Eve; Cohen, Henri; Lenormand, Marie-Therese

2005-04-01

The 2 main objectives of this investigation are (1) to assess the evolution of the speech intelligibility of 12 prelinguistically deaf children implanted between 25 and 78 months of age and (2) to clarify the influence of the age at implantation on the intelligibility. Speech productions videorecorded at 6, 18 and 36 months following surgery during a standardized free play session. Selected syllables were then presented to 40 adults listeners who were asked to identify the vowels or the consonants they heard and to judge the quality of the segments. Perceived vowels were then located in the vocalic space whereas consonants were classified according to voicing, manner and place of articulation. 3 (Groups) ×3 (Times) ANOVA with repeated measures revealed a clear influence of time as well as age at implantation on the acquisition patterns. Speech intelligibility of these implanted children tended to improve as their experience with the device increased. Based on these results, it is proposed that sensory restoration following cochlear implant served as a probe to develop articulatory strategies allowing them to reach the intended acoustico-perceptual target.
The Effectiveness of Clear Speech as a Masker

ERIC Educational Resources Information Center

Calandruccio, Lauren; Van Engen, Kristin; Dhar, Sumitrajit; Bradlow, Ann R.

2010-01-01

Purpose: It is established that speaking clearly is an effective means of enhancing intelligibility. Because any signal-processing scheme modeled after known acoustic-phonetic features of clear speech will likely affect both target and competing speech, it is important to understand how speech recognition is affected when a competing speech signal…
Neural and Behavioral Mechanisms of Clear Speech

ERIC Educational Resources Information Center

Luque, Jenna Silver

2017-01-01

Clear speech is a speaking style that has been shown to improve intelligibility in adverse listening conditions, for various listener and talker populations. Clear-speech phonetic enhancements include a slowed speech rate, expanded vowel space, and expanded pitch range. Although clear-speech phonetic enhancements have been demonstrated across a…
Sentence intelligibility during segmental interruption and masking by speech-modulated noise: Effects of age and hearing loss

PubMed Central

Fogerty, Daniel; Ahlstrom, Jayne B.; Bologna, William J.; Dubno, Judy R.

2015-01-01

This study investigated how single-talker modulated noise impacts consonant and vowel cues to sentence intelligibility. Younger normal-hearing, older normal-hearing, and older hearing-impaired listeners completed speech recognition tests. All listeners received spectrally shaped speech matched to their individual audiometric thresholds to ensure sufficient audibility with the exception of a second younger listener group who received spectral shaping that matched the mean audiogram of the hearing-impaired listeners. Results demonstrated minimal declines in intelligibility for older listeners with normal hearing and more evident declines for older hearing-impaired listeners, possibly related to impaired temporal processing. A correlational analysis suggests a common underlying ability to process information during vowels that is predictive of speech-in-modulated noise abilities. Whereas, the ability to use consonant cues appears specific to the particular characteristics of the noise and interruption. Performance declines for older listeners were mostly confined to consonant conditions. Spectral shaping accounted for the primary contributions of audibility. However, comparison with the young spectral controls who received identical spectral shaping suggests that this procedure may reduce wideband temporal modulation cues due to frequency-specific amplification that affected high-frequency consonants more than low-frequency vowels. These spectral changes may impact speech intelligibility in certain modulation masking conditions. PMID:26093436
Evaluation of model-based versus non-parametric monaural noise-reduction approaches for hearing aids.

PubMed

Harlander, Niklas; Rosenkranz, Tobias; Hohmann, Volker

2012-08-01

Single channel noise reduction has been well investigated and seems to have reached its limits in terms of speech intelligibility improvement, however, the quality of such schemes can still be advanced. This study tests to what extent novel model-based processing schemes might improve performance in particular for non-stationary noise conditions. Two prototype model-based algorithms, a speech-model-based, and a auditory-model-based algorithm were compared to a state-of-the-art non-parametric minimum statistics algorithm. A speech intelligibility test, preference rating, and listening effort scaling were performed. Additionally, three objective quality measures for the signal, background, and overall distortions were applied. For a better comparison of all algorithms, particular attention was given to the usage of the similar Wiener-based gain rule. The perceptual investigation was performed with fourteen hearing-impaired subjects. The results revealed that the non-parametric algorithm and the auditory model-based algorithm did not affect speech intelligibility, whereas the speech-model-based algorithm slightly decreased intelligibility. In terms of subjective quality, both model-based algorithms perform better than the unprocessed condition and the reference in particular for highly non-stationary noise environments. Data support the hypothesis that model-based algorithms are promising for improving performance in non-stationary noise conditions.
A physiologically-inspired model reproducing the speech intelligibility benefit in cochlear implant listeners with residual acoustic hearing.

PubMed

Zamaninezhad, Ladan; Hohmann, Volker; Büchner, Andreas; Schädler, Marc René; Jürgens, Tim

2017-02-01

This study introduces a speech intelligibility model for cochlear implant users with ipsilateral preserved acoustic hearing that aims at simulating the observed speech-in-noise intelligibility benefit when receiving simultaneous electric and acoustic stimulation (EA-benefit). The model simulates the auditory nerve spiking in response to electric and/or acoustic stimulation. The temporally and spatially integrated spiking patterns were used as the final internal representation of noisy speech. Speech reception thresholds (SRTs) in stationary noise were predicted for a sentence test using an automatic speech recognition framework. The model was employed to systematically investigate the effect of three physiologically relevant model factors on simulated SRTs: (1) the spatial spread of the electric field which co-varies with the number of electrically stimulated auditory nerves, (2) the "internal" noise simulating the deprivation of auditory system, and (3) the upper bound frequency limit of acoustic hearing. The model results show that the simulated SRTs increase monotonically with increasing spatial spread for fixed internal noise, and also increase with increasing the internal noise strength for a fixed spatial spread. The predicted EA-benefit does not follow such a systematic trend and depends on the specific combination of the model parameters. Beyond 300 Hz, the upper bound limit for preserved acoustic hearing is less influential on speech intelligibility of EA-listeners in stationary noise. The proposed model-predicted EA-benefits are within the range of EA-benefits shown by 18 out of 21 actual cochlear implant listeners with preserved acoustic hearing. Copyright © 2016 Elsevier B.V. All rights reserved.
Reference-Free Assessment of Speech Intelligibility Using Bispectrum of an Auditory Neurogram.

PubMed

Hossain, Mohammad E; Jassim, Wissam A; Zilany, Muhammad S A

2016-01-01

Sensorineural hearing loss occurs due to damage to the inner and outer hair cells of the peripheral auditory system. Hearing loss can cause decreases in audibility, dynamic range, frequency and temporal resolution of the auditory system, and all of these effects are known to affect speech intelligibility. In this study, a new reference-free speech intelligibility metric is proposed using 2-D neurograms constructed from the output of a computational model of the auditory periphery. The responses of the auditory-nerve fibers with a wide range of characteristic frequencies were simulated to construct neurograms. The features of the neurograms were extracted using third-order statistics referred to as bispectrum. The phase coupling of neurogram bispectrum provides a unique insight for the presence (or deficit) of supra-threshold nonlinearities beyond audibility for listeners with normal hearing (or hearing loss). The speech intelligibility scores predicted by the proposed method were compared to the behavioral scores for listeners with normal hearing and hearing loss both in quiet and under noisy background conditions. The results were also compared to the performance of some existing methods. The predicted results showed a good fit with a small error suggesting that the subjective scores can be estimated reliably using the proposed neural-response-based metric. The proposed metric also had a wide dynamic range, and the predicted scores were well-separated as a function of hearing loss. The proposed metric successfully captures the effects of hearing loss and supra-threshold nonlinearities on speech intelligibility. This metric could be applied to evaluate the performance of various speech-processing algorithms designed for hearing aids and cochlear implants.

Reference-Free Assessment of Speech Intelligibility Using Bispectrum of an Auditory Neurogram

PubMed Central

Hossain, Mohammad E.; Jassim, Wissam A.; Zilany, Muhammad S. A.

2016-01-01

Sensorineural hearing loss occurs due to damage to the inner and outer hair cells of the peripheral auditory system. Hearing loss can cause decreases in audibility, dynamic range, frequency and temporal resolution of the auditory system, and all of these effects are known to affect speech intelligibility. In this study, a new reference-free speech intelligibility metric is proposed using 2-D neurograms constructed from the output of a computational model of the auditory periphery. The responses of the auditory-nerve fibers with a wide range of characteristic frequencies were simulated to construct neurograms. The features of the neurograms were extracted using third-order statistics referred to as bispectrum. The phase coupling of neurogram bispectrum provides a unique insight for the presence (or deficit) of supra-threshold nonlinearities beyond audibility for listeners with normal hearing (or hearing loss). The speech intelligibility scores predicted by the proposed method were compared to the behavioral scores for listeners with normal hearing and hearing loss both in quiet and under noisy background conditions. The results were also compared to the performance of some existing methods. The predicted results showed a good fit with a small error suggesting that the subjective scores can be estimated reliably using the proposed neural-response-based metric. The proposed metric also had a wide dynamic range, and the predicted scores were well-separated as a function of hearing loss. The proposed metric successfully captures the effects of hearing loss and supra-threshold nonlinearities on speech intelligibility. This metric could be applied to evaluate the performance of various speech-processing algorithms designed for hearing aids and cochlear implants. PMID:26967160
Assessment of the Speech Intelligibility Performance of Post Lingual Cochlear Implant Users at Different Signal-to-Noise Ratios Using the Turkish Matrix Test.

PubMed

Polat, Zahra; Bulut, Erdoğan; Ataş, Ahmet

2016-09-01

Spoken word recognition and speech perception tests in quiet are being used as a routine in assessment of the benefit which children and adult cochlear implant users receive from their devices. Cochlear implant users generally demonstrate high level performances in these test materials as they are able to achieve high level speech perception ability in quiet situations. Although these test materials provide valuable information regarding Cochlear Implant (CI) users' performances in optimal listening conditions, they do not give realistic information regarding performances in adverse listening conditions, which is the case in the everyday environment. The aim of this study was to assess the speech intelligibility performance of post lingual CI users in the presence of noise at different signal-to-noise ratio with the Matrix Test developed for Turkish language. Cross-sectional study. The thirty post lingual implant user adult subjects, who had been using implants for a minimum of one year, were evaluated with Turkish Matrix test. Subjects' speech intelligibility was measured using the adaptive and non-adaptive Matrix Test in quiet and noisy environments. The results of the study show a correlation between Pure Tone Average (PTA) values of the subjects and Matrix test Speech Reception Threshold (SRT) values in the quiet. Hence, it is possible to asses PTA values of CI users using the Matrix Test also. However, no correlations were found between Matrix SRT values in the quiet and Matrix SRT values in noise. Similarly, the correlation between PTA values and intelligibility scores in noise was also not significant. Therefore, it may not be possible to assess the intelligibility performance of CI users using test batteries performed in quiet conditions. The Matrix Test can be used to assess the benefit of CI users from their systems in everyday life, since it is possible to perform intelligibility test with the Matrix test using a material that CI users experience in their everyday life and it is possible to assess their difficulty in speech discrimination in noisy conditions they have to cope with.
Intelligibility in Context Scale: Normative and Validation Data for English-Speaking Preschoolers.

PubMed

McLeod, Sharynne; Crowe, Kathryn; Shahaeian, Ameneh

2015-07-01

The purpose of this study was to describe normative and validation data on the Intelligibility in Context Scale (ICS; McLeod, Harrison, & McCormack, 2012c) for English-speaking children. The ICS is a 7-item, parent-report measure of children's speech intelligibility with a range of communicative partners. Data were collected from the parents of 803 Australian English-speaking children ranging in age from 4;0 (years;months) to 5;5 (37.0% were multilingual). The mean ICS score was 4.4 (SD = 0.7) out of a possible total score of 5. Children's speech was reported to be most intelligible to their parents, followed by their immediate family, friends, and teachers; children's speech was least intelligible to strangers. The ICS had high internal consistency (α = .94). Significant differences in scores were identified on the basis of sex and age but not on the basis of socioeconomic status or the number of languages spoken. There were significant differences in scores between children whose parents had concerns about their child's speech (M = 3.9) and those who did not (M = 4.6). A sensitivity of .82 and a specificity of .58 were established as the optimal cutoff. Test-retest reliability and criterion validity were established for 184 children with a speech sound disorder. There was a significant low correlation between the ICS mean score and percentage of phonemes correct (r = .30), percentage of consonants correct (r = .24), and percentage of vowels correct (r = .30) on the Diagnostic Evaluation of Articulation and Phonology (Dodd, Hua, Crosbie, Holm, & Ozanne, 2002). Thirty-one parents completed the ICS related to English and another language spoken by their child with a speech sound disorder. The significant correlations between the scores suggest that the ICS may be robust between languages. This article provides normative ICS data for English-speaking children and additional validation of the psychometric properties of the ICS. The robustness of the ICS was suggested because mean ICS scores were not affected by socioeconomic status, number of languages spoken, or whether the ICS was completed in relation to English or another language. The ICS is recommended as a screening measure of children's speech intelligibility.
Speech intelligibility in complex acoustic environments in young children

NASA Astrophysics Data System (ADS)

Litovsky, Ruth

2003-04-01

While the auditory system undergoes tremendous maturation during the first few years of life, it has become clear that in complex scenarios when multiple sounds occur and when echoes are present, children's performance is significantly worse than their adult counterparts. The ability of children (3-7 years of age) to understand speech in a simulated multi-talker environment and to benefit from spatial separation of the target and competing sounds was investigated. In these studies, competing sources vary in number, location, and content (speech, modulated or unmodulated speech-shaped noise and time-reversed speech). The acoustic spaces were also varied in size and amount of reverberation. Finally, children with chronic otitis media who received binaural training were tested pre- and post-training on a subset of conditions. Results indicated the following. (1) Children experienced significantly more masking than adults, even in the simplest conditions tested. (2) When the target and competing sounds were spatially separated speech intelligibility improved, but the amount varied with age, type of competing sound, and number of competitors. (3) In a large reverberant classroom there was no benefit of spatial separation. (4) Binaural training improved speech intelligibility performance in children with otitis media. Future work includes similar studies in children with unilateral and bilateral cochlear implants. [Work supported by NIDCD, DRF, and NOHR.
Speech intelligibility in cerebral palsy children attending an art therapy program.

PubMed

Wilk, Magdalena; Pachalska, Maria; Lipowska, Małgorzata; Herman-Sucharska, Izabela; Makarowski, Ryszard; Mirski, Andrzej; Jastrzebowska, Grazyna

2010-05-01

Dysarthia is a common sequela of cerebral palsy (CP), directly affecting both the intelligibility of speech and the child's psycho-social adjustment. Speech therapy focused exclusively on the articulatory organs does not always help CP children to speak more intelligibly. The program of art therapy described here has proven to be helpful for these children. From among all the CP children enrolled in our art therapy program from 2005 to 2009, we selected a group of 14 boys and girls (average age 15.3) with severe dysarthria at baseline but no other language or cognitive disturbances. Our retrospective study was based on results from the Auditory Dysarthria Scale and neuropsychological tests for fluency, administered routinely over the 4 months of art therapy. All 14 children in the study group showed some degree of improvement after art therapy in all tested parameters. On the Auditory Dysarthia Scale, highly significant improvements were noted in overall intelligibility (p<0.0001), with significant improvement (p<0.001) in volume, tempo, and control of pauses. The least improvement was noted in the most purely motor parameters. All 14 children also exhibited significant improvement in fluency. Art therapy improves the intelligibility of speech in children with cerebral palsy, even when language functions are not as such the object of therapeutic intervention.
Measures to Evaluate the Effects of DBS on Speech Production

PubMed Central

Weismer, Gary; Yunusova, Yana; Bunton, Kate

2011-01-01

The purpose of this paper is to review and evaluate measures of speech production that could be used to document effects of Deep Brain Stimulation (DBS) on speech performance, especially in persons with Parkinson disease (PD). A small set of evaluative criteria for these measures is presented first, followed by consideration of several speech physiology and speech acoustic measures that have been studied frequently and reported on in the literature on normal speech production, and speech production affected by neuromotor disorders (dysarthria). Each measure is reviewed and evaluated against the evaluative criteria. Embedded within this review and evaluation is a presentation of new data relating speech motions to speech intelligibility measures in speakers with PD, amyotrophic lateral sclerosis (ALS), and control speakers (CS). These data are used to support the conclusion that at the present time the slope of second formant transitions (F2 slope), an acoustic measure, is well suited to make inferences to speech motion and to predict speech intelligibility. The use of other measures should not be ruled out, however, and we encourage further development of evaluative criteria for speech measures designed to probe the effects of DBS or any treatment with potential effects on speech production and communication skills. PMID:24932066
Age-related changes to spectral voice characteristics affect judgments of prosodic, segmental, and talker attributes for child and adult speech

PubMed Central

Dilley, Laura C.; Wieland, Elizabeth A.; Gamache, Jessica L.; McAuley, J. Devin; Redford, Melissa A.

2013-01-01

Purpose As children mature, changes in voice spectral characteristics covary with changes in speech, language, and behavior. Spectral characteristics were manipulated to alter the perceived ages of talkers’ voices while leaving critical acoustic-prosodic correlates intact, to determine whether perceived age differences were associated with differences in judgments of prosodic, segmental, and talker attributes. Method Speech was modified by lowering formants and fundamental frequency, for 5-year-old children’s utterances, or raising them, for adult caregivers’ utterances. Next, participants differing in awareness of the manipulation (Exp. 1a) or amount of speech-language training (Exp. 1b) made judgments of prosodic, segmental, and talker attributes. Exp. 2 investigated the effects of spectral modification on intelligibility. Finally, in Exp. 3 trained analysts used formal prosody coding to assess prosodic characteristics of spectrally-modified and unmodified speech. Results Differences in perceived age were associated with differences in ratings of speech rate, fluency, intelligibility, likeability, anxiety, cognitive impairment, and speech-language disorder/delay; effects of training and awareness of the manipulation on ratings were limited. There were no significant effects of the manipulation on intelligibility or formally coded prosody judgments. Conclusions Age-related voice characteristics can greatly affect judgments of speech and talker characteristics, raising cautionary notes for developmental research and clinical work. PMID:23275414
Speech-Message Extraction from Interference Introduced by External Distributed Sources

NASA Astrophysics Data System (ADS)

Kanakov, V. A.; Mironov, N. A.

2017-08-01

The problem of this study involves the extraction of a speech signal originating from a certain spatial point and calculation of the intelligibility of the extracted voice message. It is solved by the method of decreasing the influence of interference from the speech-message sources on the extracted signal. This method is based on introducing the time delays, which depend on the spatial coordinates, to the recording channels. Audio records of the voices of eight different people were used as test objects during the studies. It is proved that an increase in the number of microphones improves intelligibility of the speech message which is extracted from interference.
A simulation study of harmonics regeneration in noise reduction for electric and acoustic stimulation.

PubMed

Hu, Yi

2010-05-01

Recent research results show that combined electric and acoustic stimulation (EAS) significantly improves speech recognition in noise, and it is generally established that access to the improved F0 representation of target speech, along with the glimpse cues, provide the EAS benefits. Under noisy listening conditions, noise signals degrade these important cues by introducing undesired temporal-frequency components and corrupting harmonics structure. In this study, the potential of combining noise reduction and harmonics regeneration techniques was investigated to further improve speech intelligibility in noise by providing improved beneficial cues for EAS. Three hypotheses were tested: (1) noise reduction methods can improve speech intelligibility in noise for EAS; (2) harmonics regeneration after noise reduction can further improve speech intelligibility in noise for EAS; and (3) harmonics sideband constraints in frequency domain (or equivalently, amplitude modulation in temporal domain), even deterministic ones, can provide additional benefits. Test results demonstrate that combining noise reduction and harmonics regeneration can significantly improve speech recognition in noise for EAS, and it is also beneficial to preserve the harmonics sidebands under adverse listening conditions. This finding warrants further work into the development of algorithms that regenerate harmonics and the related sidebands for EAS processing under noisy conditions.
Intelligibility Evaluation of Pathological Speech through Multigranularity Feature Extraction and Optimization.

PubMed

Fang, Chunying; Li, Haifeng; Ma, Lin; Zhang, Mancai

2017-01-01

Pathological speech usually refers to speech distortion resulting from illness or other biological insults. The assessment of pathological speech plays an important role in assisting the experts, while automatic evaluation of speech intelligibility is difficult because it is usually nonstationary and mutational. In this paper, we carry out an independent innovation of feature extraction and reduction, and we describe a multigranularity combined feature scheme which is optimized by the hierarchical visual method. A novel method of generating feature set based on S -transform and chaotic analysis is proposed. There are BAFS (430, basic acoustics feature), local spectral characteristics MSCC (84, Mel S -transform cepstrum coefficients), and chaotic features (12). Finally, radar chart and F -score are proposed to optimize the features by the hierarchical visual fusion. The feature set could be optimized from 526 to 96 dimensions based on NKI-CCRT corpus and 104 dimensions based on SVD corpus. The experimental results denote that new features by support vector machine (SVM) have the best performance, with a recognition rate of 84.4% on NKI-CCRT corpus and 78.7% on SVD corpus. The proposed method is thus approved to be effective and reliable for pathological speech intelligibility evaluation.
Improving the speech intelligibility in classrooms

NASA Astrophysics Data System (ADS)

Lam, Choi Ling Coriolanus

One of the major acoustical concerns in classrooms is the establishment of effective verbal communication between teachers and students. Non-optimal acoustical conditions, resulting in reduced verbal communication, can cause two main problems. First, they can lead to reduce learning efficiency. Second, they can also cause fatigue, stress, vocal strain and health problems, such as headaches and sore throats, among teachers who are forced to compensate for poor acoustical conditions by raising their voices. Besides, inadequate acoustical conditions can induce the usage of public address system. Improper usage of such amplifiers or loudspeakers can lead to impairment of students' hearing systems. The social costs of poor classroom acoustics will be large to impair the learning of children. This invisible problem has far reaching implications for learning, but is easily solved. Many researches have been carried out that they have accurately and concisely summarized the research findings on classrooms acoustics. Though, there is still a number of challenging questions remaining unanswered. Most objective indices for speech intelligibility are essentially based on studies of western languages. Even several studies of tonal languages as Mandarin have been conducted, there is much less on Cantonese. In this research, measurements have been done in unoccupied rooms to investigate the acoustical parameters and characteristics of the classrooms. The speech intelligibility tests, which based on English, Mandarin and Cantonese, and the survey were carried out on students aged from 5 years old to 22 years old. It aims to investigate the differences in intelligibility between English, Mandarin and Cantonese of the classrooms in Hong Kong. The significance on speech transmission index (STI) related to Phonetically Balanced (PB) word scores will further be developed. Together with developed empirical relationship between the speech intelligibility in classrooms with the variations of the reverberation time, the indoor ambient noise (or background noise level), the signal-to-noise ratio, and the speech transmission index, it aims to establish a guideline for improving the speech intelligibility in classrooms for any countries and any environmental conditions. The study showed that the acoustical conditions of most of the measured classrooms in Hong Kong are unsatisfactory. The selection of materials inside a classroom is important for improving speech intelligibility at design stage, especially the acoustics ceiling, to shorten the reverberation time inside the classroom. The signal-to-noise should be higher than 11dB(A) for over 70% of speech perception, either tonal or non-tonal languages, without the usage of address system. The unexpected results bring out a call to revise the standard design and to devise acceptable standards for classrooms in Hong Kong. It is also demonstrated a method for assessment on the classroom in other cities with similar environmental conditions.
Signal Processing Methods for Removing the Effects of Whole Body Vibration upon Speech

NASA Technical Reports Server (NTRS)

Bitner, Rachel M.; Begault, Durand R.

2014-01-01

Humans may be exposed to whole-body vibration in environments where clear speech communications are crucial, particularly during the launch phases of space flight and in high-performance aircraft. Prior research has shown that high levels of vibration cause a decrease in speech intelligibility. However, the effects of whole-body vibration upon speech are not well understood, and no attempt has been made to restore speech distorted by whole-body vibration. In this paper, a model for speech under whole-body vibration is proposed and a method to remove its effect is described. The method described reduces the perceptual effects of vibration, yields higher ASR accuracy scores, and may significantly improve intelligibility. Possible applications include incorporation within communication systems to improve radio-communication systems in environments such a spaceflight, aviation, or off-road vehicle operations.
Infant Perception of Atypical Speech Signals

ERIC Educational Resources Information Center

Vouloumanos, Athena; Gelfand, Hanna M.

2013-01-01

The ability to decode atypical and degraded speech signals as intelligible is a hallmark of speech perception. Human adults can perceive sounds as speech even when they are generated by a variety of nonhuman sources including computers and parrots. We examined how infants perceive the speech-like vocalizations of a parrot. Further, we examined how…
Intensive Speech and Language Therapy for Older Children with Cerebral Palsy: A Systems Approach

ERIC Educational Resources Information Center

Pennington, Lindsay; Miller, Nick; Robson, Sheila; Steen, Nick

2010-01-01

Aim: To investigate whether speech therapy using a speech systems approach to controlling breath support, phonation, and speech rate can increase the speech intelligibility of children with dysarthria and cerebral palsy (CP). Method: Sixteen children with dysarthria and CP participated in a modified time series design. Group characteristics were…
Hybridizing Conversational and Clear Speech to Investigate the Source of Increased Intelligibility in Speakers with Parkinson's Disease

ERIC Educational Resources Information Center

Tjaden, Kris; Kain, Alexander; Lam, Jennifer

2014-01-01

Purpose: A speech analysis-resynthesis paradigm was used to investigate segmental and suprasegmental acoustic variables explaining intelligibility variation for 2 speakers with Parkinson's disease (PD). Method: Sentences were read in conversational and clear styles. Acoustic characteristics from clear sentences were extracted and applied to…
Intelligibility and Clarity of Reverberant Speech: Effects of Wide Dynamic Range Compression Release Time and Working Memory

ERIC Educational Resources Information Center

Reinhart, Paul N.; Souza, Pamela E.

2016-01-01

Purpose: The purpose of this study was to examine the effects of varying wide dynamic range compression (WDRC) release time on intelligibility and clarity of reverberant speech. The study also considered the role of individual working memory. Method: Thirty older listeners with mild to moderately-severe sloping sensorineural hearing loss…
A Cross-Language Study of Acoustic Predictors of Speech Intelligibility in Individuals with Parkinson's Disease

ERIC Educational Resources Information Center

Kim, Yunjung; Choi, Yaelin

2017-01-01

Purpose: The present study aimed to compare acoustic models of speech intelligibility in individuals with the same disease (Parkinson's disease [PD]) and presumably similar underlying neuropathologies but with different native languages (American English [AE] and Korean). Method: A total of 48 speakers from the 4 speaker groups (AE speakers with…
The Use of Artificial Neural Networks to Estimate Speech Intelligibility from Acoustic Variables: A Preliminary Analysis.

ERIC Educational Resources Information Center

Metz, Dale Evan; And Others

1992-01-01

A preliminary scheme for estimating the speech intelligibility of hearing-impaired speakers from acoustic parameters, using a computerized artificial neural network to process mathematically the acoustic input variables, is outlined. Tests with 60 hearing-impaired speakers found the scheme to be highly accurate in identifying speakers separated by…
Effects of Visual Information on Intelligibility of Open and Closed Class Words in Predictable Sentences Produced by Speakers with Dysarthria

ERIC Educational Resources Information Center

Hustad, Katherine C.; Dardis, Caitlin M.; Mccourt, Kelly A.

2007-01-01

This study examined the independent and interactive effects of visual information and linguistic class of words on intelligibility of dysarthric speech. Seven speakers with dysarthria participated in the study, along with 224 listeners who transcribed speech samples in audiovisual (AV) or audio-only (AO) listening conditions. Orthographic…
An Evaluation of Articulatory Working Space Area in Vowel Production of Adults with Down Syndrome

ERIC Educational Resources Information Center

Bunton, Kate; Leddy, Mark

2011-01-01

Many adolescents and adults with Down syndrome have reduced speech intelligibility. Reasons for this reduction may relate to differences in anatomy and physiology, both of which are important for creating an intelligible speech signal. The purpose of this study was to document acoustic vowel space and articulatory working space for two adult…

Acoustics of Clear Speech: Effect of Instruction

ERIC Educational Resources Information Center

Lam, Jennifer; Tjaden, Kris; Wilding, Greg

2012-01-01

Purpose: This study investigated how different instructions for eliciting clear speech affected selected acoustic measures of speech. Method: Twelve speakers were audio-recorded reading 18 different sentences from the Assessment of Intelligibility of Dysarthric Speech (Yorkston & Beukelman, 1984). Sentences were produced in habitual, clear,…
Got EQ?: Increasing Cultural and Clinical Competence through Emotional Intelligence

ERIC Educational Resources Information Center

Robertson, Shari A.

2007-01-01

Cultural intelligence has been described across three parameters of human behavior: cognitive intelligence, emotional intelligence (EQ), and physical intelligence. Each contributes a unique and important perspective to the ability of speech-language pathologists and audiologists to provide benefits to their clients regardless of cultural…
Preliminary evaluation of synthetic speech

DOT National Transportation Integrated Search

1972-08-01

The report briefly discusses the methods for storing and generating synthetic speech and a preliminary evaluation of the intelligibility of a speech synthesizer having a 75-word vocabulary selected for air traffic control messages. A program is sugge...
The intelligibility in Context Scale: validity and reliability of a subjective rating measure.

PubMed

McLeod, Sharynne; Harrison, Linda J; McCormack, Jane

2012-04-01

To describe a new measure of functional intelligibility, the Intelligibility in Context Scale (ICS), and evaluate its validity, reliability, and sensitivity using 3 clinical measures of severity of speech sound disorder: (a) percentage of phonemes correct (PPC), (b) percentage of consonants correct (PCC), and (c) percentage of vowels correct (PVC). Speech skills of 120 preschool children (109 with parent-/teacher-identified concern about how they talked and made speech sounds and 11 with no identified concern) were assessed with the Diagnostic Evaluation of Articulation and Phonology (Dodd, Hua, Crosbie, Holm, & Ozanne, 2002). Parents completed the 7-item ICS, which rates the degree to which children's speech is understood by different communication partners (parents, immediate family, extended family, friends, acquaintances, teachers, and strangers) on a 5-point scale. Parents' ratings showed that most children were always (5) or usually (4) understood by parents, immediate family, and teachers, but only sometimes (3) by strangers. Factor analysis confirmed the internal consistency of the ICS items; therefore, ratings were averaged to form an overall intelligibility score. The ICS had high internal reliability (α = .93), sensitivity, and construct validity. Criterion validity was established through significant correlations between the ICS and PPC (r = .54), PCC (r = .54), and PVC (r = .36). The ICS is a promising new measure of functional intelligibility. These data provide initial support for the ICS as an easily administered, valid, and reliable estimate of preschool children's intelligibility when speaking with people of varying levels of familiarity and authority.
Audiovisual Asynchrony Detection in Human Speech

ERIC Educational Resources Information Center

Maier, Joost X.; Di Luca, Massimiliano; Noppeney, Uta

2011-01-01

Combining information from the visual and auditory senses can greatly enhance intelligibility of natural speech. Integration of audiovisual speech signals is robust even when temporal offsets are present between the component signals. In the present study, we characterized the temporal integration window for speech and nonspeech stimuli with…
Cognitive Functions in Childhood Apraxia of Speech

ERIC Educational Resources Information Center

Nijland, Lian; Terband, Hayo; Maassen, Ben

2015-01-01

Purpose: Childhood apraxia of speech (CAS) is diagnosed on the basis of specific speech characteristics, in the absence of problems in hearing, intelligence, and language comprehension. This does not preclude the possibility that children with this speech disorder might demonstrate additional problems. Method: Cognitive functions were investigated…
The effect of reduced vowel working space on speech intelligibility in Mandarin-speaking young adults with cerebral palsy

NASA Astrophysics Data System (ADS)

Liu, Huei-Mei; Tsao, Feng-Ming; Kuhl, Patricia K.

2005-06-01

The purpose of this study was to examine the effect of reduced vowel working space on dysarthric talkers' speech intelligibility using both acoustic and perceptual approaches. In experiment 1, the acoustic-perceptual relationship between vowel working space area and speech intelligibility was examined in Mandarin-speaking young adults with cerebral palsy. Subjects read aloud 18 bisyllabic words containing the vowels /eye/, /aye/, and /you/ using their normal speaking rate. Each talker's words were identified by three normal listeners. The percentage of correct vowel and word identification were calculated as vowel intelligibility and word intelligibility, respectively. Results revealed that talkers with cerebral palsy exhibited smaller vowel working space areas compared to ten age-matched controls. The vowel working space area was significantly correlated with vowel intelligibility (r=0.632, p<0.005) and with word intelligibility (r=0.684, p<0.005). Experiment 2 examined whether tokens of expanded vowel working spaces were perceived as better vowel exemplars and represented with greater perceptual spaces than tokens of reduced vowel working spaces. The results of the perceptual experiment support this prediction. The distorted vowels of talkers with cerebral palsy compose a smaller acoustic space that results in shrunken intervowel perceptual distances for listeners. .
Acoustic richness modulates the neural networks supporting intelligible speech processing.

PubMed

Lee, Yune-Sang; Min, Nam Eun; Wingfield, Arthur; Grossman, Murray; Peelle, Jonathan E

2016-03-01

The information contained in a sensory signal plays a critical role in determining what neural processes are engaged. Here we used interleaved silent steady-state (ISSS) functional magnetic resonance imaging (fMRI) to explore how human listeners cope with different degrees of acoustic richness during auditory sentence comprehension. Twenty-six healthy young adults underwent scanning while hearing sentences that varied in acoustic richness (high vs. low spectral detail) and syntactic complexity (subject-relative vs. object-relative center-embedded clause structures). We manipulated acoustic richness by presenting the stimuli as unprocessed full-spectrum speech, or noise-vocoded with 24 channels. Importantly, although the vocoded sentences were spectrally impoverished, all sentences were highly intelligible. These manipulations allowed us to test how intelligible speech processing was affected by orthogonal linguistic and acoustic demands. Acoustically rich speech showed stronger activation than acoustically less-detailed speech in a bilateral temporoparietal network with more pronounced activity in the right hemisphere. By contrast, listening to sentences with greater syntactic complexity resulted in increased activation of a left-lateralized network including left posterior lateral temporal cortex, left inferior frontal gyrus, and left dorsolateral prefrontal cortex. Significant interactions between acoustic richness and syntactic complexity occurred in left supramarginal gyrus, right superior temporal gyrus, and right inferior frontal gyrus, indicating that the regions recruited for syntactic challenge differed as a function of acoustic properties of the speech. Our findings suggest that the neural systems involved in speech perception are finely tuned to the type of information available, and that reducing the richness of the acoustic signal dramatically alters the brain's response to spoken language, even when intelligibility is high. Copyright © 2015 Elsevier B.V. All rights reserved.
The Influence of Cochlear Mechanical Dysfunction, Temporal Processing Deficits, and Age on the Intelligibility of Audible Speech in Noise for Hearing-Impaired Listeners

PubMed Central

Johannesen, Peter T.; Pérez-González, Patricia; Kalluri, Sridhar; Blanco, José L.

2016-01-01

The aim of this study was to assess the relative importance of cochlear mechanical dysfunction, temporal processing deficits, and age on the ability of hearing-impaired listeners to understand speech in noisy backgrounds. Sixty-eight listeners took part in the study. They were provided with linear, frequency-specific amplification to compensate for their audiometric losses, and intelligibility was assessed for speech-shaped noise (SSN) and a time-reversed two-talker masker (R2TM). Behavioral estimates of cochlear gain loss and residual compression were available from a previous study and were used as indicators of cochlear mechanical dysfunction. Temporal processing abilities were assessed using frequency modulation detection thresholds. Age, audiometric thresholds, and the difference between audiometric threshold and cochlear gain loss were also included in the analyses. Stepwise multiple linear regression models were used to assess the relative importance of the various factors for intelligibility. Results showed that (a) cochlear gain loss was unrelated to intelligibility, (b) residual cochlear compression was related to intelligibility in SSN but not in a R2TM, (c) temporal processing was strongly related to intelligibility in a R2TM and much less so in SSN, and (d) age per se impaired intelligibility. In summary, all factors affected intelligibility, but their relative importance varied across maskers. PMID:27604779
STANFORD ARTIFICIAL INTELLIGENCE PROJECT.

DTIC Science & Technology

ARTIFICIAL INTELLIGENCE , GAME THEORY, DECISION MAKING, BIONICS, AUTOMATA, SPEECH RECOGNITION, GEOMETRIC FORMS, LEARNING MACHINES, MATHEMATICAL MODELS, PATTERN RECOGNITION, SERVOMECHANISMS, SIMULATION, BIBLIOGRAPHIES.
Discrepant visual speech facilitates covert selective listening in "cocktail party" conditions.

PubMed

Williams, Jason A

2012-06-01

The presence of congruent visual speech information facilitates the identification of auditory speech, while the addition of incongruent visual speech information often impairs accuracy. This latter arrangement occurs naturally when one is being directly addressed in conversation but listens to a different speaker. Under these conditions, performance may diminish since: (a) one is bereft of the facilitative effects of the corresponding lip motion and (b) one becomes subject to visual distortion by incongruent visual speech; by contrast, speech intelligibility may be improved due to (c) bimodal localization of the central unattended stimulus. Participants were exposed to centrally presented visual and auditory speech while attending to a peripheral speech stream. In some trials, the lip movements of the central visual stimulus matched the unattended speech stream; in others, the lip movements matched the attended peripheral speech. Accuracy for the peripheral stimulus was nearly one standard deviation greater with incongruent visual information, compared to the congruent condition which provided bimodal pattern recognition cues. Likely, the bimodal localization of the central stimulus further differentiated the stimuli and thus facilitated intelligibility. Results are discussed with regard to similar findings in an investigation of the ventriloquist effect, and the relative strength of localization and speech cues in covert listening.
Rasch Analysis of Word Identification and Magnitude Estimation Scaling Responses in Measuring Naive Listeners' Judgments of Speech Intelligibility of Children with Severe-to-Profound Hearing Impairments

ERIC Educational Resources Information Center

Beltyukova, Svetlana A.; Stone, Gregory M.; Ellis, Lee W.

2008-01-01

Purpose: Speech intelligibility research typically relies on traditional evidence of reliability and validity. This investigation used Rasch analysis to enhance understanding of the functioning and meaning of scores obtained with 2 commonly used procedures: word identification (WI) and magnitude estimation scaling (MES). Method: Narrative samples…
The Speech Intelligibility Index and the Pure-Tone Average as Predictors of Lexical Ability in Children Fit with Hearing Aids

ERIC Educational Resources Information Center

Stiles, Derek J.; Bentler, Ruth A.; McGregor, Karla K.

2012-01-01

Purpose: To determine whether a clinically obtainable measure of audibility, the aided Speech Intelligibility Index (SII; American National Standards Institute, 2007), is more sensitive than the pure-tone average (PTA) at predicting the lexical abilities of children who wear hearing aids (CHA). Method: School-age CHA and age-matched children with…
[Relevance of psychosocial factors in speech rehabilitation after laryngectomy].

PubMed

Singer, S; Fuchs, M; Dietz, A; Klemm, E; Kienast, U; Meyer, A; Oeken, J; Täschner, R; Wulke, C; Schwarz, R

2007-12-01

Often it is assumed that psychosocial and sociodemographic factors cause the success of voice rehabilitation after laryngectomy. Aim of this study was to analyze the association between these parameters. Based on tumor registries of six ENT-clinics all patients were surveyed, who were laryngectomized in the years before (N = 190). Success of voice rehabilitation has been assessed as speech intelligibility measured with the postlaryngectomy-telephone-intelligibility-test. For the assessment of the psychosocial parameters validated and standardized instruments were used if possible. Statistical analysis was done by multiple logistic regression analysis. Low speech intelligibility is associated with reduced conversations (OR 0.970) and social activity (OR 1.049). Patients are more likely to talk with esophageal voice when their motivation for learning the new voice was high (OR 7.835) and when they assessed their speech therapist as important for their motivation (OR 4.794). The risk to communicate merely by whispering is higher when patients live together with a partner (OR 5.293), when they talk seldomly (OR 1.017) and when they are not very active in social contexts (OR 0.966). Psychosocial factors can only partly explain how voice rehabilitation after laryngectomy becomes a success. Speech intelligibility is associated with active communication behaviour, whereas the use of an esophageal voice is correlated with motivation. It seems that the gaining of tracheoesophageal puncture voice is independent of psychosocial factors.
Comparing Binaural Pre-processing Strategies II: Speech Intelligibility of Bilateral Cochlear Implant Users.

PubMed

Baumgärtel, Regina M; Hu, Hongmei; Krawczyk-Becker, Martin; Marquardt, Daniel; Herzke, Tobias; Coleman, Graham; Adiloğlu, Kamil; Bomke, Katrin; Plotz, Karsten; Gerkmann, Timo; Doclo, Simon; Kollmeier, Birger; Hohmann, Volker; Dietz, Mathias

2015-12-30

Several binaural audio signal enhancement algorithms were evaluated with respect to their potential to improve speech intelligibility in noise for users of bilateral cochlear implants (CIs). 50% speech reception thresholds (SRT50) were assessed using an adaptive procedure in three distinct, realistic noise scenarios. All scenarios were highly nonstationary, complex, and included a significant amount of reverberation. Other aspects, such as the perfectly frontal target position, were idealized laboratory settings, allowing the algorithms to perform better than in corresponding real-world conditions. Eight bilaterally implanted CI users, wearing devices from three manufacturers, participated in the study. In all noise conditions, a substantial improvement in SRT50 compared to the unprocessed signal was observed for most of the algorithms tested, with the largest improvements generally provided by binaural minimum variance distortionless response (MVDR) beamforming algorithms. The largest overall improvement in speech intelligibility was achieved by an adaptive binaural MVDR in a spatially separated, single competing talker noise scenario. A no-pre-processing condition and adaptive differential microphones without a binaural link served as the two baseline conditions. SRT50 improvements provided by the binaural MVDR beamformers surpassed the performance of the adaptive differential microphones in most cases. Speech intelligibility improvements predicted by instrumental measures were shown to account for some but not all aspects of the perceptually obtained SRT50 improvements measured in bilaterally implanted CI users. © The Author(s) 2015.
Prosodic Features and Speech Naturalness in Individuals with Dysarthria

ERIC Educational Resources Information Center

Klopfenstein, Marie I.

2012-01-01

Despite the importance of speech naturalness to treatment outcomes, little research has been done on what constitutes speech naturalness and how to best maximize naturalness in relationship to other treatment goals like intelligibility. In addition, previous literature alludes to the relationship between prosodic aspects of speech and speech…
Speech Perception in Individuals with Auditory Neuropathy

ERIC Educational Resources Information Center

Zeng, Fan-Gang; Liu, Sheng

2006-01-01

Purpose: Speech perception in participants with auditory neuropathy (AN) was systematically studied to answer the following 2 questions: Does noise present a particular problem for people with AN: Can clear speech and cochlear implants alleviate this problem? Method: The researchers evaluated the advantage in intelligibility of clear speech over…
Speech Intelligibility and Prosody Production in Children with Cochlear Implants

PubMed Central

Chin, Steven B.; Bergeson, Tonya R.; Phan, Jennifer

2012-01-01

Objectives The purpose of the current study was to examine the relation between speech intelligibility and prosody production in children who use cochlear implants. Methods The Beginner's Intelligibility Test (BIT) and Prosodic Utterance Production (PUP) task were administered to 15 children who use cochlear implants and 10 children with normal hearing. Adult listeners with normal hearing judged the intelligibility of the words in the BIT sentences, identified the PUP sentences as one of four grammatical or emotional moods (i.e., declarative, interrogative, happy, or sad), and rated the PUP sentences according to how well they thought the child conveyed the designated mood. Results Percent correct scores were higher for intelligibility than for prosody and higher for children with normal hearing than for children with cochlear implants. Declarative sentences were most readily identified and received the highest ratings by adult listeners; interrogative sentences were least readily identified and received the lowest ratings. Correlations between intelligibility and all mood identification and rating scores except declarative were not significant. Discussion The findings suggest that the development of speech intelligibility progresses ahead of prosody in both children with cochlear implants and children with normal hearing; however, children with normal hearing still perform better than children with cochlear implants on measures of intelligibility and prosody even after accounting for hearing age. Problems with interrogative intonation may be related to more general restrictions on rising intonation, and the correlation results indicate that intelligibility and sentence intonation may be relatively dissociated at these ages. PMID:22717120
Using on-line altered auditory feedback treating Parkinsonian speech

NASA Astrophysics Data System (ADS)

Wang, Emily; Verhagen, Leo; de Vries, Meinou H.

2005-09-01

Patients with advanced Parkinson's disease tend to have dysarthric speech that is hesitant, accelerated, and repetitive, and that is often resistant to behavior speech therapy. In this pilot study, the speech disturbances were treated using on-line altered feedbacks (AF) provided by SpeechEasy (SE), an in-the-ear device registered with the FDA for use in humans to treat chronic stuttering. Eight PD patients participated in the study. All had moderate to severe speech disturbances. In addition, two patients had moderate recurring stuttering at the onset of PD after long remission since adolescence, two had bilateral STN DBS, and two bilateral pallidal DBS. An effective combination of delayed auditory feedback and frequency-altered feedback was selected for each subject and provided via SE worn in one ear. All subjects produced speech samples (structured-monologue and reading) under three conditions: baseline, with SE without, and with feedbacks. The speech samples were randomly presented and rated for speech intelligibility goodness using UPDRS-III item 18 and the speaking rate. The results indicted that SpeechEasy is well tolerated and AF can improve speech intelligibility in spontaneous speech. Further investigational use of this device for treating speech disorders in PD is warranted [Work partially supported by Janus Dev. Group, Inc.].
[The Russian-language version of the matrix test (RUMatrix) in free field in patients after cochlear implantation in the long term].

PubMed

Goykhburg, M V; Bakhshinyan, V V; Petrova, I P; Wazybok, A; Kollmeier, B; Tavartkiladze, G A

The deterioration of speech intelligibility in the patients using cochlear implantation (CI) systems is especially well apparent in the noisy environment. It explains why phrasal speech tests, such as a Matrix sentence test, have become increasingly more popular in the speech audiometry during rehabilitation after CI. The Matrix test allows to estimate speech perception by the patients in a real life situation. The objective of this study was to assess the effectiveness of audiological rehabilitation of CI patients using the Russian-language version of the matrix test (RUMatrix) in free field in the noisy environment. 33 patients aged from 5 to 40 years with a more than 3 year experience of using cochlear implants inserted at the National Research Center for Audiology and Hearing Rehabilitation were included in our study. Five of these patients were implanted bilaterally. The results of our study showed a statistically significant improvement of speech intelligibility in the noisy environment after the speech processor adjustment; dynamics of the signal-to-noise ratio changes was -1.7 dB (p<0.001). The RUMatrix test is a highly efficient method for the estimation of speech intelligibility in the patients undergoing clinical investigations in the noisy environment. The high degree of comparability of the RUMatrix test with the Matrix tests in other languages makes possible its application in international multicenter studies.

The speech intelligibility at the opera singing.

PubMed

Novák, A; Vokrál, J

2000-01-01

The authors investigated the speech intelligibility at opera singing. They analysed several arias sung by soprano voices from Czech opera "Rusalka" (The water Nymph), from Puccini's opera "La Boheme" and several parts of arias from "Il barbiere di Siviglia" sung by baritone, tenor, bass and soprano. The sonographic pictures of selected arias were compared with a subjective evaluation. The difference between both authors was about 70%. The opinion is, that the singer's formant is not the only one problem having its role in the speech intelligibility of opera singing. The important role is played by the ability to change the shape of vocal tract and the ability of rapid and exact articulatory movements. This ability influences the shape of transients that are important at the normal speaking and also in the singing.
Speech evaluation after palatal augmentation in patients undergoing glossectomy.

PubMed

de Carvalho-Teles, Viviane; Sennes, Luiz Ubirajara; Gielow, Ingrid

2008-10-01

To assess, in patients undergoing glossectomy, the influence of the palatal augmentation prosthesis on the speech intelligibility and acoustic spectrographic characteristics of the formants of oral vowels in Brazilian Portuguese, specifically the first 3 formants (F1 [/a,e,u/], F2 [/o,ó,u/], and F3 [/a,ó/]). Speech evaluation with and without a palatal augmentation prosthesis using blinded randomized listener judgments. Tertiary referral center. Thirty-six patients (33 men and 3 women) aged 30 to 80 (mean [SD], 53.9 [10.5]) years underwent glossectomy (14, total glossectomy; 12, total glossectomy and partial mandibulectomy; 6, hemiglossectomy; and 4, subtotal glossectomy) with use of the augmentation prosthesis for at least 3 months before inclusion in the study. Spontaneous speech intelligibility (assessed by expert listeners using a 4-category scale) and spectrographic formants assessment. We found a statistically significant improvement of spontaneous speech intelligibility and the average number of correctly identified syllables with the use of the prosthesis (P < .05). Statistically significant differences occurred for the F1 values of the vowels /a,e,u/; for F2 values, there was a significant difference of the vowels /o,ó,u/; and for F3 values, there was a significant difference of the vowels /a,ó/ (P < .001). The palatal augmentation prosthesis improved the intelligibility of spontaneous speech and syllables for patients who underwent glossectomy. It also increased the F2 and F3 values for all vowels and the F1 values for the vowels /o,ó,u/. This effect brought the values of many vowel formants closer to normal.
Developing the Alphabetic Principle to Aid Text-Based Augmentative and Alternative Communication Use by Adults With Low Speech Intelligibility and Intellectual Disabilities.

PubMed

Schmidt-Naylor, Anna C; Saunders, Kathryn J; Brady, Nancy C

2017-05-17

We explored alphabet supplementation as an augmentative and alternative communication strategy for adults with minimal literacy. Study 1's goal was to teach onset-letter selection with spoken words and assess generalization to untaught words, demonstrating the alphabetic principle. Study 2 incorporated alphabet supplementation within a naming task and then assessed effects on speech intelligibility. Three men with intellectual disabilities (ID) and low speech intelligibility participated. Study 1 used a multiple-probe design, across three 20-word sets, to show that our computer-based training improved onset-letter selection. We also probed generalization to untrained words. Study 2 taught onset-letter selection for 30 new words chosen for functionality. Five listeners transcribed speech samples of the 30 words in 2 conditions: speech only and speech with alphabet supplementation. Across studies 1 and 2, participants demonstrated onset-letter selection for at least 90 words. Study 1 showed evidence of the alphabetic principle for some but not all word sets. In study 2, participants readily used alphabet supplementation, enabling listeners to understand twice as many words. This is the first demonstration of alphabet supplementation in individuals with ID and minimal literacy. The large number of words learned holds promise both for improving communication and providing a foundation for improved literacy.
Automatic Speech Recognition Predicts Speech Intelligibility and Comprehension for Listeners With Simulated Age-Related Hearing Loss.

PubMed

Fontan, Lionel; Ferrané, Isabelle; Farinas, Jérôme; Pinquier, Julien; Tardieu, Julien; Magnen, Cynthia; Gaillard, Pascal; Aumont, Xavier; Füllgrabe, Christian

2017-09-18

The purpose of this article is to assess speech processing for listeners with simulated age-related hearing loss (ARHL) and to investigate whether the observed performance can be replicated using an automatic speech recognition (ASR) system. The long-term goal of this research is to develop a system that will assist audiologists/hearing-aid dispensers in the fine-tuning of hearing aids. Sixty young participants with normal hearing listened to speech materials mimicking the perceptual consequences of ARHL at different levels of severity. Two intelligibility tests (repetition of words and sentences) and 1 comprehension test (responding to oral commands by moving virtual objects) were administered. Several language models were developed and used by the ASR system in order to fit human performances. Strong significant positive correlations were observed between human and ASR scores, with coefficients up to .99. However, the spectral smearing used to simulate losses in frequency selectivity caused larger declines in ASR performance than in human performance. Both intelligibility and comprehension scores for listeners with simulated ARHL are highly correlated with the performances of an ASR-based system. In the future, it needs to be determined if the ASR system is similarly successful in predicting speech processing in noise and by older people with ARHL.
Phase-Locked Responses to Speech in Human Auditory Cortex are Enhanced During Comprehension

PubMed Central

Peelle, Jonathan E.; Gross, Joachim; Davis, Matthew H.

2013-01-01

A growing body of evidence shows that ongoing oscillations in auditory cortex modulate their phase to match the rhythm of temporally regular acoustic stimuli, increasing sensitivity to relevant environmental cues and improving detection accuracy. In the current study, we test the hypothesis that nonsensory information provided by linguistic content enhances phase-locked responses to intelligible speech in the human brain. Sixteen adults listened to meaningful sentences while we recorded neural activity using magnetoencephalography. Stimuli were processed using a noise-vocoding technique to vary intelligibility while keeping the temporal acoustic envelope consistent. We show that the acoustic envelopes of sentences contain most power between 4 and 7 Hz and that it is in this frequency band that phase locking between neural activity and envelopes is strongest. Bilateral oscillatory neural activity phase-locked to unintelligible speech, but this cerebro-acoustic phase locking was enhanced when speech was intelligible. This enhanced phase locking was left lateralized and localized to left temporal cortex. Together, our results demonstrate that entrainment to connected speech does not only depend on acoustic characteristics, but is also affected by listeners’ ability to extract linguistic information. This suggests a biological framework for speech comprehension in which acoustic and linguistic cues reciprocally aid in stimulus prediction. PMID:22610394
Phase-locked responses to speech in human auditory cortex are enhanced during comprehension.

PubMed

Peelle, Jonathan E; Gross, Joachim; Davis, Matthew H

2013-06-01

A growing body of evidence shows that ongoing oscillations in auditory cortex modulate their phase to match the rhythm of temporally regular acoustic stimuli, increasing sensitivity to relevant environmental cues and improving detection accuracy. In the current study, we test the hypothesis that nonsensory information provided by linguistic content enhances phase-locked responses to intelligible speech in the human brain. Sixteen adults listened to meaningful sentences while we recorded neural activity using magnetoencephalography. Stimuli were processed using a noise-vocoding technique to vary intelligibility while keeping the temporal acoustic envelope consistent. We show that the acoustic envelopes of sentences contain most power between 4 and 7 Hz and that it is in this frequency band that phase locking between neural activity and envelopes is strongest. Bilateral oscillatory neural activity phase-locked to unintelligible speech, but this cerebro-acoustic phase locking was enhanced when speech was intelligible. This enhanced phase locking was left lateralized and localized to left temporal cortex. Together, our results demonstrate that entrainment to connected speech does not only depend on acoustic characteristics, but is also affected by listeners' ability to extract linguistic information. This suggests a biological framework for speech comprehension in which acoustic and linguistic cues reciprocally aid in stimulus prediction.
Ongoing slow oscillatory phase modulates speech intelligibility in cooperation with motor cortical activity.

PubMed

Onojima, Takayuki; Kitajo, Keiichi; Mizuhara, Hiroaki

2017-01-01

Neural oscillation is attracting attention as an underlying mechanism for speech recognition. Speech intelligibility is enhanced by the synchronization of speech rhythms and slow neural oscillation, which is typically observed as human scalp electroencephalography (EEG). In addition to the effect of neural oscillation, it has been proposed that speech recognition is enhanced by the identification of a speaker's motor signals, which are used for speech production. To verify the relationship between the effect of neural oscillation and motor cortical activity, we measured scalp EEG, and simultaneous EEG and functional magnetic resonance imaging (fMRI) during a speech recognition task in which participants were required to recognize spoken words embedded in noise sound. We proposed an index to quantitatively evaluate the EEG phase effect on behavioral performance. The results showed that the delta and theta EEG phase before speech inputs modulated the participant's response time when conducting speech recognition tasks. The simultaneous EEG-fMRI experiment showed that slow EEG activity was correlated with motor cortical activity. These results suggested that the effect of the slow oscillatory phase was associated with the activity of the motor cortex during speech recognition.
Prosodic Stress, Information, and Intelligibility of Speech in Noise

DTIC Science & Technology

2009-02-28

across periods during which acoustic information has been suppressed. 15. SUBJECT TERMS Robust speech intelligibility Computational model of...Research Fellow at the Department of Computer Science at the University of Southern California). This research involved superimposing acoustic and...presented at an invitational-only session of the Acoustical Society of America’s and European Acoustic Association’s joint meeting in 2008. In summary, the
Deep Brain Stimulation of the Subthalamic Nucleus Parameter Optimization for Vowel Acoustics and Speech Intelligibility in Parkinson's Disease

ERIC Educational Resources Information Center

Knowles, Thea; Adams, Scott; Abeyesekera, Anita; Mancinelli, Cynthia; Gilmore, Greydon; Jog, Mandar

2018-01-01

Purpose: The settings of 3 electrical stimulation parameters were adjusted in 12 speakers with Parkinson's disease (PD) with deep brain stimulation of the subthalamic nucleus (STN-DBS) to examine their effects on vowel acoustics and speech intelligibility. Method: Participants were tested under permutations of low, mid, and high STN-DBS frequency,…
STI: An objective measure for the performance of voice communication systems

NASA Astrophysics Data System (ADS)

Houtgast, T.; Steeneken, H. J. M.

1981-06-01

A measuring device was developed for determining the quality of speech communication systems. It comprises two parts, a signal source which replaces the talker, producing an artificial speech-like signal, and an analysis part which replaces the listener, by which the signal at the receiving end of the system under test is evaluated. Each single measurement results in an index (ranging from 0-100%) which indicates the effect of that communication system on speech intelligibility. The index is called STI (Speech Transmission Index). A careful design of the characteristics of the test signal and of the type of signal analysis makes the present approach widely applicable. It was verified experimentally that a given STI implies a given effect on speech intelligibility, irrespective of the nature of the actual disturbance (noise interference, band-pass limiting, peak clipping, etc.).
Effect of Fundamental Frequency on Judgments of Electrolaryngeal Speech

ERIC Educational Resources Information Center

Nagle, Kathy F.; Eadie, Tanya L.; Wright, Derek R.; Sumida, Yumi A.

2012-01-01

Purpose: To determine (a) the effect of fundamental frequency (f0) on speech intelligibility, acceptability, and perceived gender in electrolaryngeal (EL) speakers, and (b) the effect of known gender on speech acceptability in EL speakers. Method: A 2-part study was conducted. In Part 1, 34 healthy adults provided speech recordings using…
Children with Comorbid Speech Sound Disorder and Specific Language Impairment Are at Increased Risk for Attention-Deficit/Hyperactivity Disorder

ERIC Educational Resources Information Center

McGrath, Lauren M.; Hutaff-Lee, Christa; Scott, Ashley; Boada, Richard; Shriberg, Lawrence D.; Pennington, Bruce F.

2008-01-01

This study focuses on the comorbidity between attention-deficit/hyperactivity disorder (ADHD) symptoms and speech sound disorder (SSD). SSD is a developmental disorder characterized by speech production errors that impact intelligibility. Previous research addressing this comorbidity has typically used heterogeneous groups of speech-language…
Between-Word Simplification Patterns in the Continuous Speech of Children with Speech Sound Disorders

ERIC Educational Resources Information Center

Klein, Harriet B.; Liu-Shea, May

2009-01-01

Purpose: This study was designed to identify and describe between-word simplification patterns in the continuous speech of children with speech sound disorders. It was hypothesized that word combinations would reveal phonological changes that were unobserved with single words, possibly accounting for discrepancies between the intelligibility of…
Dramatic Effects of Speech Task on Motor and Linguistic Planning in Severely Dysfluent Parkinsonian Speech

ERIC Educational Resources Information Center

Van Lancker Sidtis, Diana; Cameron, Krista; Sidtis, John J.

2012-01-01

In motor speech disorders, dysarthric features impacting intelligibility, articulation, fluency and voice emerge more saliently in conversation than in repetition, reading or singing. A role of the basal ganglia in these task discrepancies has been identified. Further, more recent studies of naturalistic speech in basal ganglia dysfunction have…
Dysarthria in Mandarin-Speaking Children with Cerebral Palsy: Speech Subsystem Profiles

ERIC Educational Resources Information Center

Chen, Li-Mei; Hustad, Katherine C.; Kent, Ray D.; Lin, Yu Ching

2018-01-01

Purpose: This study explored the speech characteristics of Mandarin-speaking children with cerebral palsy (CP) and typically developing (TD) children to determine (a) how children in the 2 groups may differ in their speech patterns and (b) the variables correlated with speech intelligibility for words and sentences. Method: Data from 6 children…
Development of coffee maker service robot using speech and face recognition systems using POMDP

NASA Astrophysics Data System (ADS)

Budiharto, Widodo; Meiliana; Santoso Gunawan, Alexander Agung

2016-07-01

There are many development of intelligent service robot in order to interact with user naturally. This purpose can be done by embedding speech and face recognition ability on specific tasks to the robot. In this research, we would like to propose Intelligent Coffee Maker Robot which the speech recognition is based on Indonesian language and powered by statistical dialogue systems. This kind of robot can be used in the office, supermarket or restaurant. In our scenario, robot will recognize user's face and then accept commands from the user to do an action, specifically in making a coffee. Based on our previous work, the accuracy for speech recognition is about 86% and face recognition is about 93% in laboratory experiments. The main problem in here is to know the intention of user about how sweetness of the coffee. The intelligent coffee maker robot should conclude the user intention through conversation under unreliable automatic speech in noisy environment. In this paper, this spoken dialog problem is treated as a partially observable Markov decision process (POMDP). We describe how this formulation establish a promising framework by empirical results. The dialog simulations are presented which demonstrate significant quantitative outcome.
The effect of compression and attention allocation on speech intelligibility

NASA Astrophysics Data System (ADS)

Choi, Sangsook; Carrell, Thomas

2003-10-01

Research investigating the effects of amplitude compression on speech intelligibility for individuals with sensorineural hearing loss has demonstrated contradictory results [Souza and Turner (1999)]. Because percent-correct measures may not be the best indicator of compression effectiveness, a speech intelligibility and motor coordination task was developed to provide data that may more thoroughly explain the perception of compressed speech signals. In the present study, a pursuit rotor task [Dlhopolsky (2000)] was employed along with word identification task to measure the amount of attention required to perceive compressed and non-compressed words in noise. Monosyllabic words were mixed with speech-shaped noise at a fixed signal-to-noise ratio and compressed using a wide dynamic range compression scheme. Participants with normal hearing identified each word with or without a simultaneous pursuit-rotor task. Also, participants completed the pursuit-rotor task without simultaneous word presentation. It was expected that the performance on the additional motor task would reflect effect of the compression better than simple word-accuracy measures. Results were complex. For example, in some conditions an irrelevant task actually improved performance on a simultaneous listening task. This suggests there might be an optimal level of attention required for recognition of monosyllabic words.
The design of a device for hearer and feeler differentiation, part A. [speech modulated hearing device

NASA Technical Reports Server (NTRS)

Creecy, R.

1974-01-01

A speech modulated white noise device is reported that gives the rhythmic characteristics of a speech signal for intelligible reception by deaf persons. The signal is composed of random amplitudes and frequencies as modulated by the speech envelope characteristics of rhythm and stress. Time intensity parameters of speech are conveyed through the vibro-tactile sensation stimuli.
Voice technology and BBN

NASA Technical Reports Server (NTRS)

Wolf, Jared J.

1977-01-01

The following research was discussed: (1) speech signal processing; (2) automatic speech recognition; (3) continuous speech understanding; (4) speaker recognition; (5) speech compression; (6) subjective and objective evaluation of speech communication system; (7) measurement of the intelligibility and quality of speech when degraded by noise or other masking stimuli; (8) speech synthesis; (9) instructional aids for second-language learning and for training of the deaf; and (10) investigation of speech correlates of psychological stress. Experimental psychology, control systems, and human factors engineering, which are often relevant to the proper design and operation of speech systems are described.
Cognitive load during speech perception in noise: the influence of age, hearing loss, and cognition on the pupil response.

PubMed

Zekveld, Adriana A; Kramer, Sophia E; Festen, Joost M

2011-01-01

The aim of the present study was to evaluate the influence of age, hearing loss, and cognitive ability on the cognitive processing load during listening to speech presented in noise. Cognitive load was assessed by means of pupillometry (i.e., examination of pupil dilation), supplemented with subjective ratings. Two groups of subjects participated: 38 middle-aged participants (mean age = 55 yrs) with normal hearing and 36 middle-aged participants (mean age = 61 yrs) with hearing loss. Using three Speech Reception Threshold (SRT) in stationary noise tests, we estimated the speech-to-noise ratios (SNRs) required for the correct repetition of 50%, 71%, or 84% of the sentences (SRT50%, SRT71%, and SRT84%, respectively). We examined the pupil response during listening: the peak amplitude, the peak latency, the mean dilation, and the pupil response duration. For each condition, participants rated the experienced listening effort and estimated their performance level. Participants also performed the Text Reception Threshold (TRT) test, a test of processing speed, and a word vocabulary test. Data were compared with previously published data from young participants with normal hearing. Hearing loss was related to relatively poor SRTs, and higher speech intelligibility was associated with lower effort and higher performance ratings. For listeners with normal hearing, increasing age was associated with poorer TRTs and slower processing speed but with larger word vocabulary. A multivariate repeated-measures analysis of variance indicated main effects of group and SNR and an interaction effect between these factors on the pupil response. The peak latency was relatively short and the mean dilation was relatively small at low intelligibility levels for the middle-aged groups, whereas the reverse was observed for high intelligibility levels. The decrease in the pupil response as a function of increasing SNR was relatively small for the listeners with hearing loss. Spearman correlation coefficients indicated that the cognitive load was larger in listeners with better TRT performances as reflected by a longer peak latency (normal-hearing participants, SRT50% condition) and a larger peak amplitude and longer response duration (hearing-impaired participants, SRT50% and SRT84% conditions). Also, a larger word vocabulary was related to longer response duration in the SRT84% condition for the participants with normal hearing. The pupil response systematically increased with decreasing speech intelligibility. Ageing and hearing loss were related to less release from effort when increasing the intelligibility of speech in noise. In difficult listening conditions, these factors may induce cognitive overload relatively early or they may be associated with relatively shallow speech processing. More research is needed to elucidate the underlying mechanisms explaining these results. Better TRTs and larger word vocabulary were related to higher mental processing load across speech intelligibility levels. This indicates that utilizing linguistic ability to improve speech perception is associated with increased listening load.

Lingual–Alveolar Contact Pressure During Speech in Amyotrophic Lateral Sclerosis: Preliminary Findings

PubMed Central

Knollhoff, Stephanie; Barohn, Richard J.

2017-01-01

Purpose This preliminary study on lingual–alveolar contact pressures (LACP) in people with amyotrophic lateral sclerosis (ALS) had several aims: (a) to evaluate whether the protocol induced fatigue, (b) to compare LACP during speech (LACP-Sp) and during maximum isometric pressing (LACP-Max) in people with ALS (PALS) versus healthy controls, (c) to compare the percentage of LACP-Max utilized during speech (%Max) for PALS versus controls, and (d) to evaluate relationships between LACP-Sp and LACP-Max with word intelligibility. Method Thirteen PALS and 12 healthy volunteers produced /t, d, s, z, l, n/ sounds while LACP-Sp was recorded. LACP-Max was obtained before and after the speech protocol. Word intelligibility was obtained from auditory–perceptual judgments. Results LACP-Max values measured before and after completion of the speech protocol did not differ. LACP-Sp and LACP-Max were statistically lower in the ALS bulbar group compared with controls and PALS with only spinal symptoms. There was no statistical difference between groups for %Max. LACP-Sp and LACP-Max were correlated with word intelligibility. Conclusions It was feasible to obtain LACP-Sp measures without inducing fatigue. Reductions in LACP-Sp and LACP-Max for bulbar speakers might reflect tongue weakness. Although confirmation of results is needed, the data indicate that individuals with high word intelligibility maintained LACP-Sp at or above 2 kPa and LACP-Max at or above 50 kPa. PMID:28335033
Intra-oral pressure-based voicing control of electrolaryngeal speech with intra-oral vibrator.

PubMed

Takahashi, Hirokazu; Nakao, Masayuki; Kikuchi, Yataro; Kaga, Kimitaka

2008-07-01

In normal speech, coordinated activities of intrinsic laryngeal muscles suspend a glottal sound at utterance of voiceless consonants, automatically realizing a voicing control. In electrolaryngeal speech, however, the lack of voicing control is one of the causes of unclear voice, voiceless consonants tending to be misheard as the corresponding voiced consonants. In the present work, we developed an intra-oral vibrator with an intra-oral pressure sensor that detected utterance of voiceless phonemes during the intra-oral electrolaryngeal speech, and demonstrated that an intra-oral pressure-based voicing control could improve the intelligibility of the speech. The test voices were obtained from one electrolaryngeal speaker and one normal speaker. We first investigated on the speech analysis software how a voice onset time (VOT) and first formant (F1) transition of the test consonant-vowel syllables contributed to voiceless/voiced contrasts, and developed an adequate voicing control strategy. We then compared the intelligibility of consonant-vowel syllables among the intra-oral electrolaryngeal speech with and without online voicing control. The increase of intra-oral pressure, typically with a peak ranging from 10 to 50 gf/cm2, could reliably identify utterance of voiceless consonants. The speech analysis and intelligibility test then demonstrated that a short VOT caused the misidentification of the voiced consonants due to a clear F1 transition. Finally, taking these results together, the online voicing control, which suspended the prosthetic tone while the intra-oral pressure exceeded 2.5 gf/cm2 and during the 35 milliseconds that followed, proved efficient to improve the voiceless/voiced contrast.
Call sign intelligibility improvement using a spatial auditory display

NASA Technical Reports Server (NTRS)

Begault, Durand R.

1993-01-01

A spatial auditory display was used to convolve speech stimuli, consisting of 130 different call signs used in the communications protocol of NASA's John F. Kennedy Space Center, to different virtual auditory positions. An adaptive staircase method was used to determine intelligibility levels of the signal against diotic speech babble, with spatial positions at 30 deg azimuth increments. Non-individualized, minimum-phase approximations of head-related transfer functions were used. The results showed a maximal intelligibility improvement of about 6 dB when the signal was spatialized to 60 deg or 90 deg azimuth positions.
Perceptual Learning of Interrupted Speech

PubMed Central

Benard, Michel Ruben; Başkent, Deniz

2013-01-01

The intelligibility of periodically interrupted speech improves once the silent gaps are filled with noise bursts. This improvement has been attributed to phonemic restoration, a top-down repair mechanism that helps intelligibility of degraded speech in daily life. Two hypotheses were investigated using perceptual learning of interrupted speech. If different cognitive processes played a role in restoring interrupted speech with and without filler noise, the two forms of speech would be learned at different rates and with different perceived mental effort. If the restoration benefit were an artificial outcome of using the ecologically invalid stimulus of speech with silent gaps, this benefit would diminish with training. Two groups of normal-hearing listeners were trained, one with interrupted sentences with the filler noise, and the other without. Feedback was provided with the auditory playback of the unprocessed and processed sentences, as well as the visual display of the sentence text. Training increased the overall performance significantly, however restoration benefit did not diminish. The increase in intelligibility and the decrease in perceived mental effort were relatively similar between the groups, implying similar cognitive mechanisms for the restoration of the two types of interruptions. Training effects were generalizable, as both groups improved their performance also with the other form of speech than that they were trained with, and retainable. Due to null results and relatively small number of participants (10 per group), further research is needed to more confidently draw conclusions. Nevertheless, training with interrupted speech seems to be effective, stimulating participants to more actively and efficiently use the top-down restoration. This finding further implies the potential of this training approach as a rehabilitative tool for hearing-impaired/elderly populations. PMID:23469266
Parental and Spousal Self-Efficacy of Young Adults Who Are Deaf or Hard of Hearing: Relationship to Speech Intelligibility

ERIC Educational Resources Information Center

Adi-Bensaid, Limor; Michael, Rinat; Most, Tova; Gali-Cinamon, Rachel

2012-01-01

This study examined the parental and spousal self-efficacy (SE) of adults who are deaf and who are hard of hearing (d/hh) in relation to their speech intelligibility. Forty individuals with hearing loss completed self-report measures: Spousal SE in a relationship with a spouse who was hearing/deaf, parental SE to a child who was hearing/deaf, and…
Age-related Effects on Word Recognition: Reliance on Cognitive Control Systems with Structural Declines in Speech-responsive Cortex

PubMed Central

Walczak, Adam; Ahlstrom, Jayne; Denslow, Stewart; Horwitz, Amy; Dubno, Judy R.

2008-01-01

Speech recognition can be difficult and effortful for older adults, even for those with normal hearing. Declining frontal lobe cognitive control has been hypothesized to cause age-related speech recognition problems. This study examined age-related changes in frontal lobe function for 15 clinically normal hearing adults (21–75 years) when they performed a word recognition task that was made challenging by decreasing word intelligibility. Although there were no age-related changes in word recognition, there were age-related changes in the degree of activity within left middle frontal gyrus (MFG) and anterior cingulate (ACC) regions during word recognition. Older adults engaged left MFG and ACC regions when words were most intelligible compared to younger adults who engaged these regions when words were least intelligible. Declining gray matter volume within temporal lobe regions responsive to word intelligibility significantly predicted left MFG activity, even after controlling for total gray matter volume, suggesting that declining structural integrity of brain regions responsive to speech leads to the recruitment of frontal regions when words are easily understood. Electronic supplementary material The online version of this article (doi:10.1007/s10162-008-0113-3) contains supplementary material, which is available to authorized users. PMID:18274825
Early Sign Language Exposure and Cochlear Implantation Benefits.

PubMed

Geers, Ann E; Mitchell, Christine M; Warner-Czyz, Andrea; Wang, Nae-Yuh; Eisenberg, Laurie S

2017-07-01

Most children with hearing loss who receive cochlear implants (CI) learn spoken language, and parents must choose early on whether to use sign language to accompany speech at home. We address whether parents' use of sign language before and after CI positively influences auditory-only speech recognition, speech intelligibility, spoken language, and reading outcomes. Three groups of children with CIs from a nationwide database who differed in the duration of early sign language exposure provided in their homes were compared in their progress through elementary grades. The groups did not differ in demographic, auditory, or linguistic characteristics before implantation. Children without early sign language exposure achieved better speech recognition skills over the first 3 years postimplant and exhibited a statistically significant advantage in spoken language and reading near the end of elementary grades over children exposed to sign language. Over 70% of children without sign language exposure achieved age-appropriate spoken language compared with only 39% of those exposed for 3 or more years. Early speech perception predicted speech intelligibility in middle elementary grades. Children without sign language exposure produced speech that was more intelligible (mean = 70%) than those exposed to sign language (mean = 51%). This study provides the most compelling support yet available in CI literature for the benefits of spoken language input for promoting verbal development in children implanted by 3 years of age. Contrary to earlier published assertions, there was no advantage to parents' use of sign language either before or after CI. Copyright © 2017 by the American Academy of Pediatrics.
Acoustic assessment of speech privacy curtains in two nursing units

PubMed Central

Pope, Diana S.; Miller-Klein, Erik T.

2016-01-01

Hospitals have complex soundscapes that create challenges to patient care. Extraneous noise and high reverberation rates impair speech intelligibility, which leads to raised voices. In an unintended spiral, the increasing noise may result in diminished speech privacy, as people speak loudly to be heard over the din. The products available to improve hospital soundscapes include construction materials that absorb sound (acoustic ceiling tiles, carpet, wall insulation) and reduce reverberation rates. Enhanced privacy curtains are now available and offer potential for a relatively simple way to improve speech privacy and speech intelligibility by absorbing sound at the hospital patient's bedside. Acoustic assessments were performed over 2 days on two nursing units with a similar design in the same hospital. One unit was built with the 1970s’ standard hospital construction and the other was newly refurbished (2013) with sound-absorbing features. In addition, we determined the effect of an enhanced privacy curtain versus standard privacy curtains using acoustic measures of speech privacy and speech intelligibility indexes. Privacy curtains provided auditory protection for the patients. In general, that protection was increased by the use of enhanced privacy curtains. On an average, the enhanced curtain improved sound absorption from 20% to 30%; however, there was considerable variability, depending on the configuration of the rooms tested. Enhanced privacy curtains provide measureable improvement to the acoustics of patient rooms but cannot overcome larger acoustic design issues. To shorten reverberation time, additional absorption, and compact and more fragmented nursing unit floor plate shapes should be considered. PMID:26780959
Intelligibility in speech maskers with a binaural cochlear implant sound coding strategy inspired by the contralateral medial olivocochlear reflex.

PubMed

Lopez-Poveda, Enrique A; Eustaquio-Martín, Almudena; Stohl, Joshua S; Wolford, Robert D; Schatzer, Reinhold; Gorospe, José M; Ruiz, Santiago Santa Cruz; Benito, Fernando; Wilson, Blake S

2017-05-01

We have recently proposed a binaural cochlear implant (CI) sound processing strategy inspired by the contralateral medial olivocochlear reflex (the MOC strategy) and shown that it improves intelligibility in steady-state noise (Lopez-Poveda et al., 2016, Ear Hear 37:e138-e148). The aim here was to evaluate possible speech-reception benefits of the MOC strategy for speech maskers, a more natural type of interferer. Speech reception thresholds (SRTs) were measured in six bilateral and two single-sided deaf CI users with the MOC strategy and with a standard (STD) strategy. SRTs were measured in unilateral and bilateral listening conditions, and for target and masker stimuli located at azimuthal angles of (0°, 0°), (-15°, +15°), and (-90°, +90°). Mean SRTs were 2-5 dB better with the MOC than with the STD strategy for spatially separated target and masker sources. For bilateral CI users, the MOC strategy (1) facilitated the intelligibility of speech in competition with spatially separated speech maskers in both unilateral and bilateral listening conditions; and (2) led to an overall improvement in spatial release from masking in the two listening conditions. Insofar as speech is a more natural type of interferer than steady-state noise, the present results suggest that the MOC strategy holds potential for promising outcomes for CI users. Copyright © 2017. Published by Elsevier B.V.
Acoustic assessment of speech privacy curtains in two nursing units.

PubMed

Pope, Diana S; Miller-Klein, Erik T

2016-01-01

Hospitals have complex soundscapes that create challenges to patient care. Extraneous noise and high reverberation rates impair speech intelligibility, which leads to raised voices. In an unintended spiral, the increasing noise may result in diminished speech privacy, as people speak loudly to be heard over the din. The products available to improve hospital soundscapes include construction materials that absorb sound (acoustic ceiling tiles, carpet, wall insulation) and reduce reverberation rates. Enhanced privacy curtains are now available and offer potential for a relatively simple way to improve speech privacy and speech intelligibility by absorbing sound at the hospital patient's bedside. Acoustic assessments were performed over 2 days on two nursing units with a similar design in the same hospital. One unit was built with the 1970s' standard hospital construction and the other was newly refurbished (2013) with sound-absorbing features. In addition, we determined the effect of an enhanced privacy curtain versus standard privacy curtains using acoustic measures of speech privacy and speech intelligibility indexes. Privacy curtains provided auditory protection for the patients. In general, that protection was increased by the use of enhanced privacy curtains. On an average, the enhanced curtain improved sound absorption from 20% to 30%; however, there was considerable variability, depending on the configuration of the rooms tested. Enhanced privacy curtains provide measureable improvement to the acoustics of patient rooms but cannot overcome larger acoustic design issues. To shorten reverberation time, additional absorption, and compact and more fragmented nursing unit floor plate shapes should be considered.
The Comprehension of Rapid Speech by the Blind: Part III. Final Report.

ERIC Educational Resources Information Center

Foulke, Emerson

Accounts of completed and ongoing research conducted from 1964 to 1968 are presented on the subject of accelerated speech as a substitute for the written word. Included are a review of the research on intelligibility and comprehension of accelerated speech, some methods for controlling the word rate of recorded speech, and a comparison of…
The Hypothesis of Apraxia of Speech in Children with Autism Spectrum Disorder

ERIC Educational Resources Information Center

Shriberg, Lawrence D.; Paul, Rhea; Black, Lois M.; van Santen, Jan P.

2011-01-01

In a sample of 46 children aged 4-7 years with Autism Spectrum Disorder (ASD) and intelligible speech, there was no statistical support for the hypothesis of concomitant Childhood Apraxia of Speech (CAS). Perceptual and acoustic measures of participants' speech, prosody, and voice were compared with data from 40 typically-developing children, 13…
Cognitive processing load across a wide range of listening conditions: insights from pupillometry.

PubMed

Zekveld, Adriana A; Kramer, Sophia E

2014-03-01

The pupil response to speech masked by interfering speech was assessed across an intelligibility range from 0% to 99% correct. In total, 37 participants aged between 18 and 36 years and with normal hearing were included. Pupil dilation was largest at intermediate intelligibility levels, smaller at high intelligibility, and slightly smaller at very difficult levels. Participants who reported that they often gave up listening at low intelligibility levels had smaller pupil dilations in these conditions. Participants who were good at reading masked text had relatively large pupil dilation when intelligibility was low. We conclude that the pupil response is sensitive to processing load, and possibly reflects cognitive overload in difficult conditions. It seems affected by methodological aspects and individual abilities, but does not reflect subjective ratings. Copyright © 2014 Society for Psychophysiological Research.
APEX/SPIN: a free test platform to measure speech intelligibility.

PubMed

Francart, Tom; Hofmann, Michael; Vanthornhout, Jonas; Van Deun, Lieselot; van Wieringen, Astrid; Wouters, Jan

2017-02-01

Measuring speech intelligibility in quiet and noise is important in clinical practice and research. An easy-to-use free software platform for conducting speech tests is presented, called APEX/SPIN. The APEX/SPIN platform allows the use of any speech material in combination with any noise. A graphical user interface provides control over a large range of parameters, such as number of loudspeakers, signal-to-noise ratio and parameters of the procedure. An easy-to-use graphical interface is provided for calibration and storage of calibration values. To validate the platform, perception of words in quiet and sentences in noise were measured both with APEX/SPIN and with an audiometer and CD player, which is a conventional setup in current clinical practice. Five normal-hearing listeners participated in the experimental evaluation. Speech perception results were similar for the APEX/SPIN platform and conventional procedures. APEX/SPIN is a freely available and open source platform that allows the administration of all kinds of custom speech perception tests and procedures.
[The Freiburg speech intelligibility test : A pillar of speech audiometry in German-speaking countries].

PubMed

Hoth, S

2016-08-01

The Freiburg speech intelligibility test according to DIN 45621 was introduced around 60 years ago. For decades, and still today, the Freiburg test has been a standard whose relevance extends far beyond pure audiometry. It is used primarily to determine the speech perception threshold (based on two-digit numbers) and the ability to discriminate speech at suprathreshold presentation levels (based on monosyllabic nouns). Moreover, it is a measure of the degree of disability, the requirement for and success of technical hearing aids (auxiliaries directives), and the compensation for disability and handicap (Königstein recommendation). In differential audiological diagnostics, the Freiburg test contributes to the distinction between low- and high-frequency hearing loss, as well as to identification of conductive, sensory, neural, and central disorders. Currently, the phonemic and perceptual balance of the monosyllabic test lists is subject to critical discussions. Obvious deficiencies exist for testing speech recognition in noise. In this respect, alternatives such as sentence or rhyme tests in closed-answer inventories are discussed.
Developing the Alphabetic Principle to Aid Text-Based Augmentative and Alternative Communication Use by Adults With Low Speech Intelligibility and Intellectual Disabilities

PubMed Central

Schmidt-Naylor, Anna C.; Brady, Nancy C.

2017-01-01

Purpose We explored alphabet supplementation as an augmentative and alternative communication strategy for adults with minimal literacy. Study 1's goal was to teach onset-letter selection with spoken words and assess generalization to untaught words, demonstrating the alphabetic principle. Study 2 incorporated alphabet supplementation within a naming task and then assessed effects on speech intelligibility. Method Three men with intellectual disabilities (ID) and low speech intelligibility participated. Study 1 used a multiple-probe design, across three 20-word sets, to show that our computer-based training improved onset-letter selection. We also probed generalization to untrained words. Study 2 taught onset-letter selection for 30 new words chosen for functionality. Five listeners transcribed speech samples of the 30 words in 2 conditions: speech only and speech with alphabet supplementation. Results Across studies 1 and 2, participants demonstrated onset-letter selection for at least 90 words. Study 1 showed evidence of the alphabetic principle for some but not all word sets. In study 2, participants readily used alphabet supplementation, enabling listeners to understand twice as many words. Conclusions This is the first demonstration of alphabet supplementation in individuals with ID and minimal literacy. The large number of words learned holds promise both for improving communication and providing a foundation for improved literacy. PMID:28474087
[Computer assisted application of mandarin speech test materials].

PubMed

Zhang, Hua; Wang, Shuo; Chen, Jing; Deng, Jun-Min; Yang, Xiao-Lin; Guo, Lian-Sheng; Zhao, Xiao-Yan; Shao, Guang-Yu; Han, De-Min

2008-06-01

To design an intelligent speech test system with reliability and convenience using the computer software and to evaluate this system. First, the intelligent system was designed by the Delphi program language. Second, the seven monosyllabic word lists recorded on CD were separated by Cool Edit Pro v2.1 software and put into the system as test materials. Finally, the intelligent system was used to evaluate the equivalence of difficulty between seven lists. Fifty-five college students with normal hearing participated in the study. The seven monosyllabic word lists had equivalent difficulty (F = 1.582, P > 0.05) to the subjects between each other and the system was proved as reliability and convenience. The intelligent system has the feasibility in the clinical practice.
The association between intelligence and lifespan is mostly genetic.

PubMed

Arden, Rosalind; Luciano, Michelle; Deary, Ian J; Reynolds, Chandra A; Pedersen, Nancy L; Plassman, Brenda L; McGue, Matt; Christensen, Kaare; Visscher, Peter M

2016-02-01

Several studies in the new field of cognitive epidemiology have shown that higher intelligence predicts longer lifespan. This positive correlation might arise from socioeconomic status influencing both intelligence and health; intelligence leading to better health behaviours; and/or some shared genetic factors influencing both intelligence and health. Distinguishing among these hypotheses is crucial for medicine and public health, but can only be accomplished by studying a genetically informative sample. We analysed data from three genetically informative samples containing information on intelligence and mortality: Sample 1, 377 pairs of male veterans from the NAS-NRC US World War II Twin Registry; Sample 2, 246 pairs of twins from the Swedish Twin Registry; and Sample 3, 784 pairs of twins from the Danish Twin Registry. The age at which intelligence was measured differed between the samples. We used three methods of genetic analysis to examine the relationship between intelligence and lifespan: we calculated the proportion of the more intelligent twins who outlived their co-twin; we regressed within-twin-pair lifespan differences on within-twin-pair intelligence differences; and we used the resulting regression coefficients to model the additive genetic covariance. We conducted a meta-analysis of the regression coefficients across the three samples. The combined (and all three individual samples) showed a small positive phenotypic correlation between intelligence and lifespan. In the combined sample observed r = .12 (95% confidence interval .06 to .18). The additive genetic covariance model supported a genetic relationship between intelligence and lifespan. In the combined sample the genetic contribution to the covariance was 95%; in the US study, 84%; in the Swedish study, 86%, and in the Danish study, 85%. The finding of common genetic effects between lifespan and intelligence has important implications for public health, and for those interested in the genetics of intelligence, lifespan or inequalities in health outcomes including lifespan. © The Author 2015; Published by Oxford University Press on behalf of the International Epidemiological Association.
The association between intelligence and lifespan is mostly genetic

PubMed Central

Arden, Rosalind; Deary, Ian J; Reynolds, Chandra A; Pedersen, Nancy L; Plassman, Brenda L; McGue, Matt; Christensen, Kaare; Visscher, Peter M

2016-01-01

Abstract Background: Several studies in the new field of cognitive epidemiology have shown that higher intelligence predicts longer lifespan. This positive correlation might arise from socioeconomic status influencing both intelligence and health; intelligence leading to better health behaviours; and/or some shared genetic factors influencing both intelligence and health. Distinguishing among these hypotheses is crucial for medicine and public health, but can only be accomplished by studying a genetically informative sample. Methods: We analysed data from three genetically informative samples containing information on intelligence and mortality: Sample 1, 377 pairs of male veterans from the NAS-NRC US World War II Twin Registry; Sample 2, 246 pairs of twins from the Swedish Twin Registry; and Sample 3, 784 pairs of twins from the Danish Twin Registry. The age at which intelligence was measured differed between the samples. We used three methods of genetic analysis to examine the relationship between intelligence and lifespan: we calculated the proportion of the more intelligent twins who outlived their co-twin; we regressed within-twin-pair lifespan differences on within-twin-pair intelligence differences; and we used the resulting regression coefficients to model the additive genetic covariance. We conducted a meta-analysis of the regression coefficients across the three samples. Results: The combined (and all three individual samples) showed a small positive phenotypic correlation between intelligence and lifespan. In the combined sample observed r = .12 (95% confidence interval .06 to .18). The additive genetic covariance model supported a genetic relationship between intelligence and lifespan. In the combined sample the genetic contribution to the covariance was 95%; in the US study, 84%; in the Swedish study, 86%, and in the Danish study, 85%. Conclusions: The finding of common genetic effects between lifespan and intelligence has important implications for public health, and for those interested in the genetics of intelligence, lifespan or inequalities in health outcomes including lifespan. PMID:26213105
No association between prenatal exposure to psychotropics and intelligence at age five.

PubMed

Eriksen, Hanne-Lise Falgreen; Kesmodel, Ulrik Schiøler; Pedersen, Lars Henning; Mortensen, Erik Lykke

2015-05-01

To examine associations between prenatal exposure to selective serotonin reuptake inhibitors (SSRIs)/anxiolytics and intelligence assessed with a standard clinical intelligence test at age 5 years. Longitudinal follow-up study. Denmark, 2003-2008. A total of 1780 women and their children sampled from the Danish National Birth Cohort. Self-reported information on use of SSRI and anxiolytics was obtained from the Danish National Birth Cohort at the time of consent and from two prenatal interviews. Intelligence was assessed at age 5 years, and parental education, maternal intelligence quotient (IQ), maternal smoking and alcohol consumption in pregnancy, the child's age at testing, sex, and tester were included in the full model. The IQ of 13 medication-exposed children was compared with the IQ of 19 children whose mothers had untreated depression and 1748 control children. Wechsler Preschool and Primary Scale of Intelligence - Revised. In unadjusted analyses, children of mothers who used antidepressants or anxiolytics during pregnancy had higher verbal IQ; this association, however, was insignificant after adjustment for potentially confounding maternal and child factors. No consistent associations between IQ and fetal exposure to antidepressants and anxiolytics were observed, but the study had low statistical power, and there is an obvious need to conduct long-term follow-up studies with comprehensive cognitive assessment and sufficiently large samples of adolescent or adult offspring. © 2015 Nordic Federation of Societies of Obstetrics and Gynecology.

Perceptual learning for speech in noise after application of binary time-frequency masks

PubMed Central

Ahmadi, Mahnaz; Gross, Vauna L.; Sinex, Donal G.

2013-01-01

Ideal time-frequency (TF) masks can reject noise and improve the recognition of speech-noise mixtures. An ideal TF mask is constructed with prior knowledge of the target speech signal. The intelligibility of a processed speech-noise mixture depends upon the threshold criterion used to define the TF mask. The study reported here assessed the effect of training on the recognition of speech in noise after processing by ideal TF masks that did not restore perfect speech intelligibility. Two groups of listeners with normal hearing listened to speech-noise mixtures processed by TF masks calculated with different threshold criteria. For each group, a threshold criterion that initially produced word recognition scores between 0.56–0.69 was chosen for training. Listeners practiced with one set of TF-masked sentences until their word recognition performance approached asymptote. Perceptual learning was quantified by comparing word-recognition scores in the first and last training sessions. Word recognition scores improved with practice for all listeners with the greatest improvement observed for the same materials used in training. PMID:23464038
Functional connectivity between face-movement and speech-intelligibility areas during auditory-only speech perception.

PubMed

Schall, Sonja; von Kriegstein, Katharina

2014-01-01

It has been proposed that internal simulation of the talking face of visually-known speakers facilitates auditory speech recognition. One prediction of this view is that brain areas involved in auditory-only speech comprehension interact with visual face-movement sensitive areas, even under auditory-only listening conditions. Here, we test this hypothesis using connectivity analyses of functional magnetic resonance imaging (fMRI) data. Participants (17 normal participants, 17 developmental prosopagnosics) first learned six speakers via brief voice-face or voice-occupation training (<2 min/speaker). This was followed by an auditory-only speech recognition task and a control task (voice recognition) involving the learned speakers' voices in the MRI scanner. As hypothesized, we found that, during speech recognition, familiarity with the speaker's face increased the functional connectivity between the face-movement sensitive posterior superior temporal sulcus (STS) and an anterior STS region that supports auditory speech intelligibility. There was no difference between normal participants and prosopagnosics. This was expected because previous findings have shown that both groups use the face-movement sensitive STS to optimize auditory-only speech comprehension. Overall, the present findings indicate that learned visual information is integrated into the analysis of auditory-only speech and that this integration results from the interaction of task-relevant face-movement and auditory speech-sensitive areas.
Analysis of masking effects on speech intelligibility with respect to moving sound stimulus

NASA Astrophysics Data System (ADS)

Chen, Chiung Yao

2004-05-01

The purpose of this study is to compare the disturbed degree of speech by an immovable noise source and an apparent moving one (AMN). In the study of the sound localization, we found that source-directional sensitivity (SDS) well associates with the magnitude of interaural cross correlation (IACC). Ando et al. [Y. Ando, S. H. Kang, and H. Nagamatsu, J. Acoust. Soc. Jpn. (E) 8, 183-190 (1987)] reported that potential correlation between left and right inferior colliculus at auditory path in the brain is in harmony with the correlation function of amplitude input into two ear-canal entrances. We assume that the degree of disturbance under the apparent moving noisy source is probably different from that being installed in front of us within a constant distance in a free field (no reflection). Then, we found there is a different influence on speech intelligibility between a moving and a fixed source generated by 1/3-octave narrow-band noise with the center frequency 2 kHz. However, the reasons for the moving speed and the masking effects on speech intelligibility were uncertain.
Speech production in children with Down's syndrome: The effects of reading, naming and imitation.

PubMed

Knight, Rachael-Anne; Kurtz, Scilla; Georgiadou, Ioanna

2015-01-01

People with DS are known to have difficulties with expressive language, and often have difficulties with intelligibility. They often have stronger visual than verbal short-term memory skills and, therefore, reading has often been suggested as an intervention for speech and language in this population. However, there is as yet no firm evidence that reading can improve speech outcomes. This study aimed to compare reading, picture naming and repetition for the same 10 words, to identify if the speech of eight children with DS (aged 11-14 years) was more accurate, consistent and intelligible when reading. Results show that children were slightly, yet significantly, more accurate and intelligible when they read words compared with when they produced those words in naming or imitation conditions although the reduction in inconsistency was non-significant. The results of this small-scale study provide tentative support for previous claims about the benefits of reading for children with DS. The mechanisms behind a facilitatory effect of reading are considered, and directions are identified for future research.
Speech intelligibility and subjective benefit in single-sided deaf adults after cochlear implantation.

PubMed

Finke, Mareike; Strauß-Schier, Angelika; Kludt, Eugen; Büchner, Andreas; Illg, Angelika

2017-05-01

Treatment with cochlear implants (CIs) in single-sided deaf individuals started less than a decade ago. CIs can successfully reduce incapacitating tinnitus on the deaf ear and allow, so some extent, the restoration of binaural hearing. Until now, systematic evaluations of subjective CI benefit in post-lingually single-sided deaf individuals and analyses of speech intelligibility outcome for the CI in isolation have been lacking. For the prospective part of this study, the Bern Benefit in Single-Sided Deafness Questionnaire (BBSS) was administered to 48 single-sided deaf CI users to evaluate the subjectively perceived CI benefit across different listening situations. In the retrospective part, speech intelligibility outcome with the CI up to 12 month post-activation was compared between 100 single-sided deaf CI users and 125 bilaterally implanted CI users (2nd implant). The positive median ratings in the BBSS differed significantly from zero for all items suggesting that most individuals with single-sided deafness rate their CI as beneficial across listening situations. The speech perception scores in quiet and noise improved significantly over time in both groups of CI users. Speech intelligibility with the CI in isolation was significantly better in bilaterally implanted CI users (2nd implant) compared to the scores obtained from single-sided deaf CI users. Our results indicate that CI users with single-sided deafness can reach open set speech understanding with their CI in isolation, encouraging the extension of the CI indication to individuals with normal hearing on the contralateral ear. Compared to the performance reached with bilateral CI users' second implant, speech reception threshold are lower, indicating an aural preference and dominance of the normal hearing ear. The results from the BBSS propose good satisfaction with the CI across several listening situations. Copyright © 2017 Elsevier B.V. All rights reserved.
Synthesized Speech Output and Children: A Scoping Review

ERIC Educational Resources Information Center

Drager, Kathryn D. R.; Reichle, Joe; Pinkoski, Carrie

2010-01-01

Purpose: Many computer-based augmentative and alternative communication systems in use by children have speech output. This article (a) provides a scoping review of the literature addressing the intelligibility and listener comprehension of synthesized speech output with children and (b) discusses future research directions. Method: Studies…
Integrating cognitive and peripheral factors in predicting hearing-aid processing effectiveness

PubMed Central

Kates, James M.; Arehart, Kathryn H.; Souza, Pamela E.

2013-01-01

Individual factors beyond the audiogram, such as age and cognitive abilities, can influence speech intelligibility and speech quality judgments. This paper develops a neural network framework for combining multiple subject factors into a single model that predicts speech intelligibility and quality for a nonlinear hearing-aid processing strategy. The nonlinear processing approach used in the paper is frequency compression, which is intended to improve the audibility of high-frequency speech sounds by shifting them to lower frequency regions where listeners with high-frequency loss have better hearing thresholds. An ensemble averaging approach is used for the neural network to avoid the problems associated with overfitting. Models are developed for two subject groups, one having nearly normal hearing and the other mild-to-moderate sloping losses. PMID:25669257
Using Speech Recall in Hearing Aid Fitting and Outcome Evaluation Under Ecological Test Conditions.

PubMed

Lunner, Thomas; Rudner, Mary; Rosenbom, Tove; Ågren, Jessica; Ng, Elaine Hoi Ning

2016-01-01

In adaptive Speech Reception Threshold (SRT) tests used in the audiological clinic, speech is presented at signal to noise ratios (SNRs) that are lower than those generally encountered in real-life communication situations. At higher, ecologically valid SNRs, however, SRTs are insensitive to changes in hearing aid signal processing that may be of benefit to listeners who are hard of hearing. Previous studies conducted in Swedish using the Sentence-final Word Identification and Recall test (SWIR) have indicated that at such SNRs, the ability to recall spoken words may be a more informative measure. In the present study, a Danish version of SWIR, known as the Sentence-final Word Identification and Recall Test in a New Language (SWIRL) was introduced and evaluated in two experiments. The objective of experiment 1 was to determine if the Swedish results demonstrating benefit from noise reduction signal processing for hearing aid wearers could be replicated in 25 Danish participants with mild to moderate symmetrical sensorineural hearing loss. The objective of experiment 2 was to compare direct-drive and skin-drive transmission in 16 Danish users of bone-anchored hearing aids with conductive hearing loss or mixed sensorineural and conductive hearing loss. In experiment 1, performance on SWIRL improved when hearing aid noise reduction was used, replicating the Swedish results and generalizing them across languages. In experiment 2, performance on SWIRL was better for direct-drive compared with skin-drive transmission conditions. These findings indicate that spoken word recall can be used to identify benefits from hearing aid signal processing at ecologically valid, positive SNRs where SRTs are insensitive.
Automatic Speech Recognition Predicts Speech Intelligibility and Comprehension for Listeners with Simulated Age-Related Hearing Loss

ERIC Educational Resources Information Center

Fontan, Lionel; Ferrané, Isabelle; Farinas, Jérôme; Pinquier, Julien; Tardieu, Julien; Magnen, Cynthia; Gaillard, Pascal; Aumont, Xavier; Füllgrabe, Christian

2017-01-01

Purpose: The purpose of this article is to assess speech processing for listeners with simulated age-related hearing loss (ARHL) and to investigate whether the observed performance can be replicated using an automatic speech recognition (ASR) system. The long-term goal of this research is to develop a system that will assist…
A generalized time-frequency subtraction method for robust speech enhancement based on wavelet filter banks modeling of human auditory system.

PubMed

Shao, Yu; Chang, Chip-Hong

2007-08-01

We present a new speech enhancement scheme for a single-microphone system to meet the demand for quality noise reduction algorithms capable of operating at a very low signal-to-noise ratio. A psychoacoustic model is incorporated into the generalized perceptual wavelet denoising method to reduce the residual noise and improve the intelligibility of speech. The proposed method is a generalized time-frequency subtraction algorithm, which advantageously exploits the wavelet multirate signal representation to preserve the critical transient information. Simultaneous masking and temporal masking of the human auditory system are modeled by the perceptual wavelet packet transform via the frequency and temporal localization of speech components. The wavelet coefficients are used to calculate the Bark spreading energy and temporal spreading energy, from which a time-frequency masking threshold is deduced to adaptively adjust the subtraction parameters of the proposed method. An unvoiced speech enhancement algorithm is also integrated into the system to improve the intelligibility of speech. Through rigorous objective and subjective evaluations, it is shown that the proposed speech enhancement system is capable of reducing noise with little speech degradation in adverse noise environments and the overall performance is superior to several competitive methods.
Objective and subjective assessment of tracheoesophageal prosthesis voice outcome.

PubMed

D'Alatri, Lucia; Bussu, Francesco; Scarano, Emanuele; Paludetti, Gaetano; Marchese, Maria Raffaella

2012-09-01

To investigate the relationships between objective measures and the results of subjective assessment of voice quality and speech intelligibility in patients submitted to total laryngectomy and tracheoesophageal (TE) puncture. Retrospective. Twenty patients implanted with voice prosthesis were studied. After surgery, the entire sample performed speech rehabilitation. The assessment protocol included maximum phonation time (MPT), number of syllables per deep breath, acoustic analysis of the sustained vowel /a/ and of a bisyllabic word, perceptual evaluation (pleasantness and intelligibility%), and self-assessment. The correlation between pleasantness and intelligibility% was statistically significant. Both the latter were significantly correlated with the acoustic signal type, the number of formant peaks, and the F2-F1 difference. The intelligibility% and number of formant peaks were significantly correlated with the MPT and number of syllables per deep breath. Moreover, significant correlations were found between the number of formant peaks and both intelligibility% and pleasantness. The higher the number of syllables per deep breath and the longer the MPT, significantly higher was the number of formant peaks and the intelligibility%. The study failed to show significant correlation between patient's self-assessment of voice quality and both pleasantness and communication effectiveness. The multidimensional assessment seems to be a reliable tool to evaluate the TE functional outcome. Particularly, the results showed that both pleasantness and intelligibility of TE speech are correlated to the availability of expired air and the function of the vocal tract. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Datalingvistik, 2000.

ERIC Educational Resources Information Center

Kjaersgaard, Poul Soren, Ed.

2002-01-01

Papers from the conference in this volume include the following: "Towards Corpus Annotation Standards--The MATE Workbench" (Laila Dybkjaer and Niels Ole Bernsen); "Danish Text-to-Speech Synthesis Based on Stored Acoustic Segments" (Charles Hoequist); "Toward a Method for the Automated Design of Semantic…
Comparisons of Auditory Performance and Speech Intelligibility after Cochlear Implant Reimplantation in Mandarin-Speaking Users

PubMed Central

Hwang, Chung-Feng; Ko, Hui-Chen; Tsou, Yung-Ting; Chan, Kai-Chieh; Fang, Hsuan-Yeh; Wu, Che-Ming

2016-01-01

Objectives. We evaluated the causes, hearing, and speech performance before and after cochlear implant reimplantation in Mandarin-speaking users. Methods. In total, 589 patients who underwent cochlear implantation in our medical center between 1999 and 2014 were reviewed retrospectively. Data related to demographics, etiologies, implant-related information, complications, and hearing and speech performance were collected. Results. In total, 22 (3.74%) cases were found to have major complications. Infection (n = 12) and hard failure of the device (n = 8) were the most common major complications. Among them, 13 were reimplanted in our hospital. The mean scores of the Categorical Auditory Performance (CAP) and the Speech Intelligibility Rating (SIR) obtained before and after reimplantation were 5.5 versus 5.8 and 3.7 versus 4.3, respectively. The SIR score after reimplantation was significantly better than preoperation. Conclusions. Cochlear implantation is a safe procedure with low rates of postsurgical revisions and device failures. The Mandarin-speaking patients in this study who received reimplantation had restored auditory performance and speech intelligibility after surgery. Device soft failure was rare in our series, calling attention to Mandarin-speaking CI users requiring revision of their implants due to undesirable symptoms or decreasing performance of uncertain cause. PMID:27413753
Examining explanations for fundamental frequency's contribution to speech intelligibility in noise

NASA Astrophysics Data System (ADS)

Schlauch, Robert S.; Miller, Sharon E.; Watson, Peter J.

2005-09-01

Laures and Weismer [JSLHR, 42, 1148 (1999)] reported that speech with natural variation in fundamental frequency (F0) is more intelligible in noise than speech with a flattened F0 contour. Cognitive-linguistic based explanations have been offered to account for this drop in intelligibility for the flattened condition, but a lower-level mechanism related to auditory streaming may be responsible. Numerous psychoacoustic studies have demonstrated that modulating a tone enables a listener to segregate it from background sounds. To test these rival hypotheses, speech recognition in noise was measured for sentences with six different F0 contours: unmodified, flattened at the mean, natural but exaggerated, reversed, and frequency modulated (rates of 2.5 and 5.0 Hz). The 180 stimulus sentences were produced by five talkers (30 sentences per condition). Speech recognition for fifteen listeners replicate earlier findings showing that flattening the F0 contour results in a roughly 10% reduction in recognition of key words compared with the natural condition. Although the exaggerated condition produced results comparable to those of the flattened condition, the other conditions with unnatural F0 contours all yielded significantly poorer performance than the flattened condition. These results support the cognitive, linguistic-based explanations for the reduction in performance.
The effect of varying talker identity and listening conditions on gaze behavior during audiovisual speech perception.

PubMed

Buchan, Julie N; Paré, Martin; Munhall, Kevin G

2008-11-25

During face-to-face conversation the face provides auditory and visual linguistic information, and also conveys information about the identity of the speaker. This study investigated behavioral strategies involved in gathering visual information while watching talking faces. The effects of varying talker identity and varying the intelligibility of speech (by adding acoustic noise) on gaze behavior were measured with an eyetracker. Varying the intelligibility of the speech by adding noise had a noticeable effect on the location and duration of fixations. When noise was present subjects adopted a vantage point that was more centralized on the face by reducing the frequency of the fixations on the eyes and mouth and lengthening the duration of their gaze fixations on the nose and mouth. Varying talker identity resulted in a more modest change in gaze behavior that was modulated by the intelligibility of the speech. Although subjects generally used similar strategies to extract visual information in both talker variability conditions, when noise was absent there were more fixations on the mouth when viewing a different talker every trial as opposed to the same talker every trial. These findings provide a useful baseline for studies examining gaze behavior during audiovisual speech perception and perception of dynamic faces.
Using listening difficulty ratings of conditions for speech communication in rooms

NASA Astrophysics Data System (ADS)

Sato, Hiroshi; Bradley, John S.; Morimoto, Masayuki

2005-03-01

The use of listening difficulty ratings of speech communication in rooms is explored because, in common situations, word recognition scores do not discriminate well among conditions that are near to acceptable. In particular, the benefits of early reflections of speech sounds on listening difficulty were investigated and compared to the known benefits to word intelligibility scores. Listening tests were used to assess word intelligibility and perceived listening difficulty of speech in simulated sound fields. The experiments were conducted in three types of sound fields with constant levels of ambient noise: only direct sound, direct sound with early reflections, and direct sound with early reflections and reverberation. The results demonstrate that (1) listening difficulty can better discriminate among these conditions than can word recognition scores; (2) added early reflections increase the effective signal-to-noise ratio equivalent to the added energy in the conditions without reverberation; (3) the benefit of early reflections on difficulty scores is greater than expected from the simple increase in early arriving speech energy with reverberation; (4) word intelligibility tests are most appropriate for conditions with signal-to-noise (S/N) ratios less than 0 dBA, and where S/N is between 0 and 15-dBA S/N, listening difficulty is a more appropriate evaluation tool. .
Factors Affecting Acoustics and Speech Intelligibility in the Operating Room: Size Matters.

PubMed

McNeer, Richard R; Bennett, Christopher L; Horn, Danielle Bodzin; Dudaryk, Roman

2017-06-01

Noise in health care settings has increased since 1960 and represents a significant source of dissatisfaction among staff and patients and risk to patient safety. Operating rooms (ORs) in which effective communication is crucial are particularly noisy. Speech intelligibility is impacted by noise, room architecture, and acoustics. For example, sound reverberation time (RT60) increases with room size, which can negatively impact intelligibility, while room objects are hypothesized to have the opposite effect. We explored these relationships by investigating room construction and acoustics of the surgical suites at our institution. We studied our ORs during times of nonuse. Room dimensions were measured to calculate room volumes (VR). Room content was assessed by estimating size and assigning items into 5 volume categories to arrive at an adjusted room content volume (VC) metric. Psychoacoustic analyses were performed by playing sweep tones from a speaker and recording the impulse responses (ie, resulting sound fields) from 3 locations in each room. The recordings were used to calculate 6 psychoacoustic indices of intelligibility. Multiple linear regression was performed using VR and VC as predictor variables and each intelligibility index as an outcome variable. A total of 40 ORs were studied. The surgical suites were characterized by a large degree of construction and surface finish heterogeneity and varied in size from 71.2 to 196.4 m (average VR = 131.1 [34.2] m). An insignificant correlation was observed between VR and VC (Pearson correlation = 0.223, P = .166). Multiple linear regression model fits and β coefficients for VR were highly significant for each of the intelligibility indices and were best for RT60 (R = 0.666, F(2, 37) = 39.9, P < .0001). For Dmax (maximum distance where there is <15% loss of consonant articulation), both VR and VC β coefficients were significant. For RT60 and Dmax, after controlling for VC, partial correlations were 0.825 (P < .0001) and 0.718 (P < .0001), respectively, while after controlling for VR, partial correlations were -0.322 (P = .169) and 0.381 (P < .05), respectively. Our results suggest that the size and contents of an OR can predict a range of psychoacoustic indices of speech intelligibility. Specifically, increasing OR size correlated with worse speech intelligibility, while increasing amounts of OR contents correlated with improved speech intelligibility. This study provides valuable descriptive data and a predictive method for identifying existing ORs that may benefit from acoustic modifiers (eg, sound absorption panels). Additionally, it suggests that room dimensions and projected clinical use should be considered during the design phase of OR suites to optimize acoustic performance.
Linkage of Speech Sound Disorder to Reading Disability Loci

ERIC Educational Resources Information Center

Smith, Shelley D.; Pennington, Bruce F.; Boada, Richard; Shriberg, Lawrence D.

2005-01-01

Background: Speech sound disorder (SSD) is a common childhood disorder characterized by developmentally inappropriate errors in speech production that greatly reduce intelligibility. SSD has been found to be associated with later reading disability (RD), and there is also evidence for both a cognitive and etiological overlap between the two…
Cross-Channel Amplitude Sweeps Are Crucial to Speech Intelligibility

ERIC Educational Resources Information Center

Prendergast, Garreth; Green, Gary G. R.

2012-01-01

Classical views of speech perception argue that the static and dynamic characteristics of spectral energy peaks (formants) are the acoustic features that underpin phoneme recognition. Here we use representations where the amplitude modulations of sub-band filtered speech are described, precisely, in terms of co-sinusoidal pulses. These pulses are…
Effectiveness of Speech Therapy in Adults with Intellectual Disabilities

ERIC Educational Resources Information Center

Terband, Hayo; Coppens-Hofman, Marjolein C.; Reffeltrath, Maaike; Maassen, Ben A. M.

2018-01-01

Background: This study investigated the effect of speech therapy in a heterogeneous group of adults with intellectual disability. Method: Thirty-six adults with mild and moderate intellectual disabilities (IQs 40-70; age 18-40 years) with reported poor speech intelligibility received tailored training in articulation and listening skills delivered…

Noise and communication: a three-year update.

PubMed

Brammer, Anthony J; Laroche, Chantal

2012-01-01

Noise is omnipresent and impacts us all in many aspects of daily living. Noise can interfere with communication not only in industrial workplaces, but also in other work settings (e.g. open-plan offices, construction, and mining) and within buildings (e.g. residences, arenas, and schools). The interference of noise with communication can have significant social consequences, especially for persons with hearing loss, and may compromise safety (e.g. failure to perceive auditory warning signals), influence worker productivity and learning in children, affect health (e.g. vocal pathology, noise-induced hearing loss), compromise speech privacy, and impact social participation by the elderly. For workers, attempts have been made to: 1) Better define the auditory performance needed to function effectively and to directly measure these abilities when assessing Auditory Fitness for Duty, 2) design hearing protection devices that can improve speech understanding while offering adequate protection against loud noises, and 3) improve speech privacy in open-plan offices. As the elderly are particularly vulnerable to the effects of noise, an understanding of the interplay between auditory, cognitive, and social factors and its effect on speech communication and social participation is also critical. Classroom acoustics and speech intelligibility in children have also gained renewed interest because of the importance of effective speech comprehension in noise on learning. Finally, substantial work has been made in developing models aimed at better predicting speech intelligibility. Despite progress in various fields, the design of alarm signals continues to lag behind advancements in knowledge. This summary of the last three years' research highlights some of the most recent issues for the workplace, for older adults, and for children, as well as the effectiveness of warning sounds and models for predicting speech intelligibility. Suggestions for future work are also discussed.
Expanding the phenotypic profile of Kleefstra syndrome: A female with low-average intelligence and childhood apraxia of speech.

PubMed

Samango-Sprouse, Carole; Lawson, Patrick; Sprouse, Courtney; Stapleton, Emily; Sadeghin, Teresa; Gropman, Andrea

2016-05-01

Kleefstra syndrome (KS) is a rare neurogenetic disorder most commonly caused by deletion in the 9q34.3 chromosomal region and is associated with intellectual disabilities, severe speech delay, and motor planning deficits. To our knowledge, this is the first patient (PQ, a 6-year-old female) with a 9q34.3 deletion who has near normal intelligence, and developmental dyspraxia with childhood apraxia of speech (CAS). At 6, the Wechsler Preschool and Primary Intelligence testing (WPPSI-III) revealed a Verbal IQ of 81 and Performance IQ of 79. The Beery Buktenica Test of Visual Motor Integration, 5th Edition (VMI) indicated severe visual motor deficits: VMI = 51; Visual Perception = 48; Motor Coordination < 45. On the Receptive One Word Picture Vocabulary Test-R (ROWPVT-R), she had standard scores of 96 and 99 in contrast to an Expressive One Word Picture Vocabulary-R (EOWPVT-R) standard scores of 73 and 82, revealing a discrepancy in vocabulary domains on both evaluations. Preschool Language Scale-4 (PLS-4) on PQ's first evaluation reveals a significant difference between auditory comprehension and expressive communication with standard scores of 78 and 57, respectively, further supporting the presence of CAS. This patient's near normal intelligence expands the phenotypic profile as well as the prognosis associated with KS. The identification of CAS in this patient provides a novel explanation for the previously reported speech delay and expressive language disorder. Further research is warranted on the impact of CAS on intelligence and behavioral outcome in KS. Therapeutic and prognostic implications are discussed. © 2016 Wiley Periodicals, Inc.
Acoustic landmarks drive delta-theta oscillations to enable speech comprehension by facilitating perceptual parsing

PubMed Central

Doelling, Keith; Arnal, Luc; Ghitza, Oded; Poeppel, David

2013-01-01

A growing body of research suggests that intrinsic neuronal slow (< 10 Hz) oscillations in auditory cortex appear to track incoming speech and other spectro-temporally complex auditory signals. Within this framework, several recent studies have identified critical-band temporal envelopes as the specific acoustic feature being reflected by the phase of these oscillations. However, how this alignment between speech acoustics and neural oscillations might underpin intelligibility is unclear. Here we test the hypothesis that the ‘sharpness’ of temporal fluctuations in the critical band envelope acts as a temporal cue to speech syllabic rate, driving delta-theta rhythms to track the stimulus and facilitate intelligibility. We interpret our findings as evidence that sharp events in the stimulus cause cortical rhythms to re-align and parse the stimulus into syllable-sized chunks for further decoding. Using magnetoencephalographic recordings, we show that by removing temporal fluctuations that occur at the syllabic rate, envelope-tracking activity is reduced. By artificially reinstating these temporal fluctuations, envelope-tracking activity is regained. These changes in tracking correlate with intelligibility of the stimulus. Together, the results suggest that the sharpness of fluctuations in the stimulus, as reflected in the cochlear output, drive oscillatory activity to track and entrain to the stimulus, at its syllabic rate. This process likely facilitates parsing of the stimulus into meaningful chunks appropriate for subsequent decoding, enhancing perception and intelligibility. PMID:23791839
Effects of cross-language voice training on speech perception: Whose familiar voices are more intelligible?

PubMed Central

Levi, Susannah V.; Winters, Stephen J.; Pisoni, David B.

2011-01-01

Previous research has shown that familiarity with a talker’s voice can improve linguistic processing (herein, “Familiar Talker Advantage”), but this benefit is constrained by the context in which the talker’s voice is familiar. The current study examined how familiarity affects intelligibility by manipulating the type of talker information available to listeners. One group of listeners learned to identify bilingual talkers’ voices from English words, where they learned language-specific talker information. A second group of listeners learned the same talkers from German words, and thus only learned language-independent talker information. After voice training, both groups of listeners completed a word recognition task with English words produced by both familiar and unfamiliar talkers. Results revealed that English-trained listeners perceived more phonemes correct for familiar than unfamiliar talkers, while German-trained listeners did not show improved intelligibility for familiar talkers. The absence of a processing advantage in speech intelligibility for the German-trained listeners demonstrates limitations on the Familiar Talker Advantage, which crucially depends on the language context in which the talkers’ voices were learned; knowledge of how a talker produces linguistically relevant contrasts in a particular language is necessary to increase speech intelligibility for words produced by familiar talkers. PMID:22225059
Correlational Analysis of Speech Intelligibility Tests and Metrics for Speech Transmission

DTIC Science & Technology

2017-12-04

frequency scale (male voice; normal voice effort) ............................... 4 Fig. 2 Diagram of a speech communication system (Letowski...languages. Consonants contain mostly high frequency (above 1500 Hz) speech energy, but this energy is relatively small in comparison to that of the whole...voices (Letowski et al. 1993). Since the mid- frequency spectral region contains mostly vowel energy while consonants are high frequency sounds, an
Factors affecting articulation skills in children with velocardiofacial syndrome and children with cleft palate or velopharyngeal dysfunction: A preliminary report

PubMed Central

Baylis, Adriane L.; Munson, Benjamin; Moller, Karlind T.

2010-01-01

Objective To examine the influence of speech perception, cognition, and implicit phonological learning on articulation skills of children with Velocardiofacial syndrome (VCFS) and children with cleft palate or velopharyngeal dysfunction (VPD). Design Cross-sectional group experimental design. Participants 8 children with VCFS and 5 children with non-syndromic cleft palate or VPD. Methods and Measures All children participated in a phonetic inventory task, speech perception task, implicit priming nonword repetition task, conversational sample, nonverbal intelligence test, and hearing screening. Speech tasks were scored for percentage of phonemes correctly produced. Group differences and relations among measures were examined using nonparametric statistics. Results Children in the VCFS group demonstrated significantly poorer articulation skills and lower standard scores of nonverbal intelligence compared to the children with cleft palate or VPD. There were no significant group differences in speech perception skills. For the implicit priming task, both groups of children were more accurate in producing primed nonwords than unprimed nonwords. Nonverbal intelligence and severity of velopharyngeal inadequacy for speech were correlated with articulation skills. Conclusions In this study, children with VCFS had poorer articulation skills compared to children with cleft palate or VPD. Articulation difficulties seen in the children with VCFS did not appear to be associated with speech perception skills or the ability to learn new phonological representations. Future research should continue to examine relationships between articulation, cognition, and velopharyngeal dysfunction in a larger sample of children with cleft palate and VCFS. PMID:18333642
Auditory cortex activation to natural speech and simulated cochlear implant speech measured with functional near-infrared spectroscopy.

PubMed

Pollonini, Luca; Olds, Cristen; Abaya, Homer; Bortfeld, Heather; Beauchamp, Michael S; Oghalai, John S

2014-03-01

The primary goal of most cochlear implant procedures is to improve a patient's ability to discriminate speech. To accomplish this, cochlear implants are programmed so as to maximize speech understanding. However, programming a cochlear implant can be an iterative, labor-intensive process that takes place over months. In this study, we sought to determine whether functional near-infrared spectroscopy (fNIRS), a non-invasive neuroimaging method which is safe to use repeatedly and for extended periods of time, can provide an objective measure of whether a subject is hearing normal speech or distorted speech. We used a 140 channel fNIRS system to measure activation within the auditory cortex in 19 normal hearing subjects while they listed to speech with different levels of intelligibility. Custom software was developed to analyze the data and compute topographic maps from the measured changes in oxyhemoglobin and deoxyhemoglobin concentration. Normal speech reliably evoked the strongest responses within the auditory cortex. Distorted speech produced less region-specific cortical activation. Environmental sounds were used as a control, and they produced the least cortical activation. These data collected using fNIRS are consistent with the fMRI literature and thus demonstrate the feasibility of using this technique to objectively detect differences in cortical responses to speech of different intelligibility. Copyright © 2013 Elsevier B.V. All rights reserved.
Functional Connectivity between Face-Movement and Speech-Intelligibility Areas during Auditory-Only Speech Perception

PubMed Central

Schall, Sonja; von Kriegstein, Katharina

2014-01-01

It has been proposed that internal simulation of the talking face of visually-known speakers facilitates auditory speech recognition. One prediction of this view is that brain areas involved in auditory-only speech comprehension interact with visual face-movement sensitive areas, even under auditory-only listening conditions. Here, we test this hypothesis using connectivity analyses of functional magnetic resonance imaging (fMRI) data. Participants (17 normal participants, 17 developmental prosopagnosics) first learned six speakers via brief voice-face or voice-occupation training (<2 min/speaker). This was followed by an auditory-only speech recognition task and a control task (voice recognition) involving the learned speakers’ voices in the MRI scanner. As hypothesized, we found that, during speech recognition, familiarity with the speaker’s face increased the functional connectivity between the face-movement sensitive posterior superior temporal sulcus (STS) and an anterior STS region that supports auditory speech intelligibility. There was no difference between normal participants and prosopagnosics. This was expected because previous findings have shown that both groups use the face-movement sensitive STS to optimize auditory-only speech comprehension. Overall, the present findings indicate that learned visual information is integrated into the analysis of auditory-only speech and that this integration results from the interaction of task-relevant face-movement and auditory speech-sensitive areas. PMID:24466026
An Acoustic Study of the Relationships among Neurologic Disease, Dysarthria Type, and Severity of Dysarthria

ERIC Educational Resources Information Center

Kim, Yunjung; Kent, Raymond D.; Weismer, Gary

2011-01-01

Purpose: This study examined acoustic predictors of speech intelligibility in speakers with several types of dysarthria secondary to different diseases and conducted classification analysis solely by acoustic measures according to 3 variables (disease, speech severity, and dysarthria type). Method: Speech recordings from 107 speakers with…
Effects of Neurosurgical Management of Parkinson's Disease on Speech Characteristics and Oromotor Function.

ERIC Educational Resources Information Center

Farrell, Anna; Theodoros, Deborah; Ward, Elizabeth; Hall, Bruce; Silburn, Peter

2005-01-01

The present study examined the effects of neurosurgical management of Parkinson's disease (PD), including the procedures of pallidotomy, thalamotomy, and deep-brain stimulation (DBS) on perceptual speech characteristics, speech intelligibility, and oromotor function in a group of 22 participants with PD. The surgical participant group was compared…
Sentence-Level Movements in Parkinson's Disease: Loud, Clear, and Slow Speech

ERIC Educational Resources Information Center

Kearney,Elaine; Giles, Renuka; Haworth, Brandon; Faloutsos, Petros; Baljko, Melanie; Yunusova, Yana

2017-01-01

Purpose: To further understand the effect of Parkinson's disease (PD) on articulatory movements in speech and to expand our knowledge of therapeutic treatment strategies, this study examined movements of the jaw, tongue blade, and tongue dorsum during sentence production with respect to speech intelligibility and compared the effect of varying…
Intensive Dysarthria Therapy for Older Children with Cerebral Palsy: Findings from Six Cases

ERIC Educational Resources Information Center

Pennington, Lindsay; Smallman, Claire; Farrier, Faith

2006-01-01

Children with cerebral palsy often have speech, language and communication difficulties that affect their access to social and educational activities. Speech and language therapy to improve the intelligibility of the speech of children with cerebral palsy has long been advocated, but there is a dearth of research investigating therapy…
Relationship between Speech, Oromotor, Language and Cognitive Abilities in Children with Down's Syndrome

ERIC Educational Resources Information Center

Cleland, Joanne; Wood, Sara; Hardcastle, William; Wishart, Jennifer; Timmins, Claire

2010-01-01

Background: Children and young people with Down's syndrome present with deficits in expressive speech and language, accompanied by strengths in vocabulary comprehension compared with non-verbal mental age. Intelligibility is particularly low, but whether speech is delayed or disordered is a controversial topic. Most studies suggest a delay, but no…
Visemic Processing in Audiovisual Discrimination of Natural Speech: A Simultaneous fMRI-EEG Study

ERIC Educational Resources Information Center

Dubois, Cyril; Otzenberger, Helene; Gounot, Daniel; Sock, Rudolph; Metz-Lutz, Marie-Noelle

2012-01-01

In a noisy environment, visual perception of articulatory movements improves natural speech intelligibility. Parallel to phonemic processing based on auditory signal, visemic processing constitutes a counterpart based on "visemes", the distinctive visual units of speech. Aiming at investigating the neural substrates of visemic processing in a…
Multichannel spatial auditory display for speech communications

NASA Technical Reports Server (NTRS)

Begault, D. R.; Erbe, T.; Wenzel, E. M. (Principal Investigator)

1994-01-01

A spatial auditory display for multiple speech communications was developed at NASA/Ames Research Center. Input is spatialized by the use of simplified head-related transfer functions, adapted for FIR filtering on Motorola 56001 digital signal processors. Hardware and firmware design implementations are overviewed for the initial prototype developed for NASA-Kennedy Space Center. An adaptive staircase method was used to determine intelligibility levels of four-letter call signs used by launch personnel at NASA against diotic speech babble. Spatial positions at 30 degrees azimuth increments were evaluated. The results from eight subjects showed a maximum intelligibility improvement of about 6-7 dB when the signal was spatialized to 60 or 90 degrees azimuth positions.
Multi-channel spatial auditory display for speech communications

NASA Astrophysics Data System (ADS)

Begault, Durand; Erbe, Tom

1993-10-01

A spatial auditory display for multiple speech communications was developed at NASA-Ames Research Center. Input is spatialized by use of simplified head-related transfer functions, adapted for FIR filtering on Motorola 56001 digital signal processors. Hardware and firmware design implementations are overviewed for the initial prototype developed for NASA-Kennedy Space Center. An adaptive staircase method was used to determine intelligibility levels of four letter call signs used by launch personnel at NASA, against diotic speech babble. Spatial positions at 30 deg azimuth increments were evaluated. The results from eight subjects showed a maximal intelligibility improvement of about 6 to 7 dB when the signal was spatialized to 60 deg or 90 deg azimuth positions.
Multichannel spatial auditory display for speech communications.

PubMed

Begault, D R; Erbe, T

1994-10-01

A spatial auditory display for multiple speech communications was developed at NASA/Ames Research Center. Input is spatialized by the use of simplified head-related transfer functions, adapted for FIR filtering on Motorola 56001 digital signal processors. Hardware and firmware design implementations are overviewed for the initial prototype developed for NASA-Kennedy Space Center. An adaptive staircase method was used to determine intelligibility levels of four-letter call signs used by launch personnel at NASA against diotic speech babble. Spatial positions at 30 degrees azimuth increments were evaluated. The results from eight subjects showed a maximum intelligibility improvement of about 6-7 dB when the signal was spatialized to 60 or 90 degrees azimuth positions.
Multi-channel spatial auditory display for speech communications

NASA Technical Reports Server (NTRS)

Begault, Durand; Erbe, Tom

1993-01-01

A spatial auditory display for multiple speech communications was developed at NASA-Ames Research Center. Input is spatialized by use of simplified head-related transfer functions, adapted for FIR filtering on Motorola 56001 digital signal processors. Hardware and firmware design implementations are overviewed for the initial prototype developed for NASA-Kennedy Space Center. An adaptive staircase method was used to determine intelligibility levels of four letter call signs used by launch personnel at NASA, against diotic speech babble. Spatial positions at 30 deg azimuth increments were evaluated. The results from eight subjects showed a maximal intelligibility improvement of about 6 to 7 dB when the signal was spatialized to 60 deg or 90 deg azimuth positions.
An Analysis of Individual Differences in Recognizing Monosyllabic Words Under the Speech Intelligibility Index Framework

PubMed Central

Shen, Yi; Kern, Allison B.

2018-01-01

Individual differences in the recognition of monosyllabic words, either in isolation (NU6 test) or in sentence context (SPIN test), were investigated under the theoretical framework of the speech intelligibility index (SII). An adaptive psychophysical procedure, namely the quick-band-importance-function procedure, was developed to enable the fitting of the SII model to individual listeners. Using this procedure, the band importance function (i.e., the relative weights of speech information across the spectrum) and the link function relating the SII to recognition scores can be simultaneously estimated while requiring only 200 to 300 trials of testing. Octave-frequency band importance functions and link functions were estimated separately for NU6 and SPIN materials from 30 normal-hearing listeners who were naïve to speech recognition experiments. For each type of speech material, considerable individual differences in the spectral weights were observed in some but not all frequency regions. At frequencies where the greatest intersubject variability was found, the spectral weights were correlated between the two speech materials, suggesting that the variability in spectral weights reflected listener-originated factors. PMID:29532711
Acoustic and perceptual effects of magnifying interaural difference cues in a simulated "binaural" hearing aid.

PubMed

de Taillez, Tobias; Grimm, Giso; Kollmeier, Birger; Neher, Tobias

2018-06-01

To investigate the influence of an algorithm designed to enhance or magnify interaural difference cues on speech signals in noisy, spatially complex conditions using both technical and perceptual measurements. To also investigate the combination of interaural magnification (IM), monaural microphone directionality (DIR), and binaural coherence-based noise reduction (BC). Speech-in-noise stimuli were generated using virtual acoustics. A computational model of binaural hearing was used to analyse the spatial effects of IM. Predicted speech quality changes and signal-to-noise-ratio (SNR) improvements were also considered. Additionally, a listening test was carried out to assess speech intelligibility and quality. Listeners aged 65-79 years with and without sensorineural hearing loss (N = 10 each). IM increased the horizontal separation of concurrent directional sound sources without introducing any major artefacts. In situations with diffuse noise, however, the interaural difference cues were distorted. Preprocessing the binaural input signals with DIR reduced distortion. IM influenced neither speech intelligibility nor speech quality. The IM algorithm tested here failed to improve speech perception in noise, probably because of the dispersion and inconsistent magnification of interaural difference cues in complex environments.

Processing load induced by informational masking is related to linguistic abilities.

PubMed

Koelewijn, Thomas; Zekveld, Adriana A; Festen, Joost M; Rönnberg, Jerker; Kramer, Sophia E

2012-01-01

It is often assumed that the benefit of hearing aids is not primarily reflected in better speech performance, but that it is reflected in less effortful listening in the aided than in the unaided condition. Before being able to assess such a hearing aid benefit the present study examined how processing load while listening to masked speech relates to inter-individual differences in cognitive abilities relevant for language processing. Pupil dilation was measured in thirty-two normal hearing participants while listening to sentences masked by fluctuating noise or interfering speech at either 50% and 84% intelligibility. Additionally, working memory capacity, inhibition of irrelevant information, and written text reception was tested. Pupil responses were larger during interfering speech as compared to fluctuating noise. This effect was independent of intelligibility level. Regression analysis revealed that high working memory capacity, better inhibition, and better text reception were related to better speech reception thresholds. Apart from a positive relation to speech recognition, better inhibition and better text reception are also positively related to larger pupil dilation in the single-talker masker conditions. We conclude that better cognitive abilities not only relate to better speech perception, but also partly explain higher processing load in complex listening conditions.
Synthesized speech rate and pitch effects on intelligibility of warning messages for pilots

NASA Technical Reports Server (NTRS)

Simpson, C. A.; Marchionda-Frost, K.

1984-01-01

In civilian and military operations, a future threat-warning system with a voice display could warn pilots of other traffic, obstacles in the flight path, and/or terrain during low-altitude helicopter flights. The present study was conducted to learn whether speech rate and voice pitch of phoneme-synthesized speech affects pilot accuracy and response time to typical threat-warning messages. Helicopter pilots engaged in an attention-demanding flying task and listened for voice threat warnings presented in a background of simulated helicopter cockpit noise. Performance was measured by flying-task performance, threat-warning intelligibility, and response time. Pilot ratings were elicited for the different voice pitches and speech rates. Significant effects were obtained only for response time and for pilot ratings, both as a function of speech rate. For the few cases when pilots forgot to respond to a voice message, they remembered 90 percent of the messages accurately when queried for their response 8 to 10 sec later.
Artificial intelligence, expert systems, computer vision, and natural language processing

NASA Technical Reports Server (NTRS)

Gevarter, W. B.

1984-01-01

An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.
Processing Techniques for Intelligibility Improvement to Speech with Co-Channel Interference.

DTIC Science & Technology

1983-09-01

processing was found to be always less than in the original unprocessed co-channel sig- nali also as the length of the comb filter increased, the...7 D- i35 702 PROCESSING TECHNIQUES FOR INTELLIGIBILITY IMPRO EMENT 1𔃼.TO SPEECH WITH CO-C..(U) SIGNAL TECHNOLOGY INC GOLETACA B A HANSON ET AL SEP...11111111122 11111.25 1111 .4 111.6 MICROCOPY RESOLUTION TEST CHART NATIONAL BUREAU Of STANDARDS- 1963-A RA R.83-225 Set ,’ember 1983 PROCESSING
An Experimental Determination of the Intelligibility of Two Different Speech Synthesizers in Noise.

DTIC Science & Technology

1987-12-01

to control the signal to noise ratios and to maintain an overall SPL of 80 dBC. The calibration of the speech signal was not easy due to its non ...ft" 432 AN EXPERIMENTAL DETERMINATION OF THE INTELLIGIBILITY OF 1.11 TiHO DIFFERENT SPE (U) PENNSYLVANIA STATE UNIV UNIVERSITY PARK APPLIED RESEARCH ...0 0 * Applied Research Laboratory The Pennsylvania State University CD4 A’ - 4,. ,__ TOO AN EcPERi.taTrAL DETEMINATION OF TlE IIjZINTELLIGIBILITY OF
Reception of distorted speech.

DOT National Transportation Integrated Search

1973-12-01

Noise, either in the form of masking or in the form of distortion products, interferes with speech intelligibility. When the signal-to-noise ratio is bad enough, articulation can drop to unacceptably--even dangerously--low levels. However, listeners ...
Lexical effects on speech production and intelligibility in Parkinson's disease

NASA Astrophysics Data System (ADS)

Chiu, Yi-Fang

Individuals with Parkinson's disease (PD) often have speech deficits that lead to reduced speech intelligibility. Previous research provides a rich database regarding the articulatory deficits associated with PD including restricted vowel space (Skodda, Visser, & Schlegel, 2011) and flatter formant transitions (Tjaden & Wilding, 2004; Walsh & Smith, 2012). However, few studies consider the effect of higher level structural variables of word usage frequency and the number of similar sounding words (i.e. neighborhood density) on lower level articulation or on listeners' perception of dysarthric speech. The purpose of the study is to examine the interaction of lexical properties and speech articulation as measured acoustically in speakers with PD and healthy controls (HC) and the effect of lexical properties on the perception of their speech. Individuals diagnosed with PD and age-matched healthy controls read sentences with words that varied in word frequency and neighborhood density. Acoustic analysis was performed to compare second formant transitions in diphthongs, an indicator of the dynamics of tongue movement during speech production, across different lexical characteristics. Young listeners transcribed the spoken sentences and the transcription accuracy was compared across lexical conditions. The acoustic results indicate that both PD and HC speakers adjusted their articulation based on lexical properties but the PD group had significant reductions in second formant transitions compared to HC. Both groups of speakers increased second formant transitions for words with low frequency and low density, but the lexical effect is diphthong dependent. The change in second formant slope was limited in the PD group when the required formant movement for the diphthong is small. The data from listeners' perception of the speech by PD and HC show that listeners identified high frequency words with greater accuracy suggesting the use of lexical knowledge during the recognition process. The relationship between acoustic results and perceptual accuracy is limited in this study suggesting that listeners incorporate acoustic and non-acoustic information to maximize speech intelligibility.
Intelligence and Schooling. Fueling the Education Explosion: Proceedings of Conference 2 (Cleveland, Ohio, November 17-18, 1983).

ERIC Educational Resources Information Center

Gardner, Mary, Ed.; Reed-Mundell, Charlene, Ed.

These proceedings contain presentations from a conference whose major topics were real-world intelligence, artificial intelligence, and linkage between the education and corporate sectors. "People, Perspectives...Potential and Possibilities" (Elyse S. Fleming), which was the conference's closing speech, briefly summarizes the information…
Soundscape elaboration from anthrophonic adaptation of community noise

NASA Astrophysics Data System (ADS)

Teddy Badai Samodra, FX

2018-03-01

Under the situation of an urban environment, noise has been a critical issue in affecting the indoor environment. A reliable approach is required for evaluation of the community noise as one factor of anthrophonic in the urban environment. This research investigates the level of noise exposure from different community noise sources and elaborates the advantage of the noise disadvantages for soundscape innovation. Integrated building element design as a protector for noise control and speech intelligibility compliance using field experiment and MATLAB programming and modeling are also carried out. Meanwhile, for simulation analysis and building acoustic optimization, Sound Reduction-Speech Intelligibility and Reverberation Time are the main parameters for identifying tropical building model as case study object. The results show that the noise control should consider its integration with the other critical issue, thermal control, in an urban environment. The 1.1 second of reverberation time for speech activities and noise reduction more than 28.66 dBA for critical frequency (20 Hz), the speech intelligibility index could be reached more than fair assessment, 0.45. Furthermore, the environmental psychology adaptation result “Close The Opening” as the best method in high noise condition and personal adjustment as the easiest and the most adaptable way.
Is the Speech Transmission Index (STI) a robust measure of sound system speech intelligibility performance?

NASA Astrophysics Data System (ADS)

Mapp, Peter

2002-11-01

Although RaSTI is a good indicator of the speech intelligibility capability of auditoria and similar spaces, during the past 2-3 years it has been shown that RaSTI is not a robust predictor of sound system intelligibility performance. Instead, it is now recommended, within both national and international codes and standards, that full STI measurement and analysis be employed. However, new research is reported, that indicates that STI is not as flawless, nor robust as many believe. The paper highlights a number of potential error mechanisms. It is shown that the measurement technique and signal excitation stimulus can have a significant effect on the overall result and accuracy, particularly where DSP-based equipment is employed. It is also shown that in its current state of development, STI is not capable of appropriately accounting for a number of fundamental speech and system attributes, including typical sound system frequency response variations and anomalies. This is particularly shown to be the case when a system is operating under reverberant conditions. Comparisons between actual system measurements and corresponding word score data are reported where errors of up to 50 implications for VA and PA system performance verification will be discussed.
The effect of compression and attention allocation on speech intelligibility. II

NASA Astrophysics Data System (ADS)

Choi, Sangsook; Carrell, Thomas

2004-05-01

Previous investigations of the effects of amplitude compression on measures of speech intelligibility have shown inconsistent results. Recently, a novel paradigm was used to investigate the possibility of more consistent findings with a measure of speech perception that is not based entirely on intelligibility (Choi and Carrell, 2003). That study exploited a dual-task paradigm using a pursuit rotor online visual-motor tracking task (Dlhopolsky, 2000) along with a word repetition task. Intensity-compressed words caused reduced performance on the tracking task as compared to uncompressed words when subjects engaged in a simultaneous word repetition task. This suggested an increased cognitive load when listeners processed compressed words. A stronger result might be obtained if a single resource (linguistic) is required rather than two (linguistic and visual-motor) resources. In the present experiment a visual lexical decision task and an auditory word repetition task were used. The visual stimuli for the lexical decision task were blurred and presented in a noise background. The compressed and uncompressed words for repetition were placed in speech-shaped noise. Participants with normal hearing and vision conducted word repetition and lexical decision tasks both independently and simultaneously. The pattern of results is discussed and compared to the previous study.
Speech intelligibility index predictions for young and old listeners in automobile noise: Can the index be improved by incorporating factors other than absolute threshold?

NASA Astrophysics Data System (ADS)

Saweikis, Meghan; Surprenant, Aimée M.; Davies, Patricia; Gallant, Don

2003-10-01

While young and old subjects with comparable audiograms tend to perform comparably on speech recognition tasks in quiet environments, the older subjects have more difficulty than the younger subjects with recognition tasks in degraded listening conditions. This suggests that factors other than an absolute threshold may account for some of the difficulty older listeners have on recognition tasks in noisy environments. Many metrics, including the Speech Intelligibility Index (SII), used to measure speech intelligibility, only consider an absolute threshold when accounting for age related hearing loss. Therefore these metrics tend to overestimate the performance for elderly listeners in noisy environments [Tobias et al., J. Acoust. Soc. Am. 83, 859-895 (1988)]. The present studies examine the predictive capabilities of the SII in an environment with automobile noise present. This is of interest because people's evaluation of the automobile interior sound is closely linked to their ability to carry on conversations with their fellow passengers. The four studies examine whether, for subjects with age related hearing loss, the accuracy of the SII can be improved by incorporating factors other than an absolute threshold into the model. [Work supported by Ford Motor Company.
An Intelligibility Assessment of Toddlers with Cleft Lip and Palate Who Received and Did Not Receive Presurgical Infant Orthopedic Treatment.

ERIC Educational Resources Information Center

Konst, Emmy M.; Weersink-Braks, Hanny; Rietveld, Toni; Peters, Herman

2000-01-01

The influence of presurgical infant orthopedic treatment (PIO) on speech intelligibility was evaluated with 10 toddlers who used PIO during the first year of life and 10 who did not. Treated children were rated as exhibiting greater intelligibility, however, transcription data indicated there were not group differences in actual intelligibility.…
Evaluation of selected speech parameters after prosthesis supply in patients with maxillary or mandibular defects.

PubMed

Müller, Rainer; Höhlein, Andreas; Wolf, Annette; Markwardt, Jutta; Schulz, Matthias C; Range, Ursula; Reitemeier, Bernd

2013-01-01

Ablative surgery of oropharyngeal tumors frequently leads to defects in the speech organs, resulting in impairment of speech up to the point of unintelligibility. The aim of the present study was the assessment of selected parameters of speech with and without resection prostheses. The speech sounds of 22 patients suffering from maxillary and mandibular defects were recorded using a digital audio tape (DAT) recorder with and without resection prostheses. Evaluation of the resonance and the production of the sounds /s/, /sch/, and /ch/ was performed by 2 experienced speech therapists. Additionally, the patients completed a non-standardized questionnaire containing a linguistic self-assessment. After prosthesis supply, the number of patients with rhinophonia aperta decreased from 7 to 2 while the number of patients with intelligible speech increased from 2 to 20. Correct production of the sounds /s/, /sch/, and /ch/ increased from 2 to 13 patients. A significant improvement of the evaluated parameters could be observed only in patients with maxillary defects. The linguistic self-assessment showed a higher satisfaction in patients with maxillary defects. In patients with maxillary defects due to ablative tumor surgery, an increase in speech performance and intelligibility is possible by supplying resection prostheses. © 2013 S. Karger GmbH, Freiburg.
The effect of noise-induced hearing loss on the intelligibility of speech in noise

NASA Astrophysics Data System (ADS)

Smoorenburg, G. F.; Delaat, J. A. P. M.; Plomp, R.

1981-06-01

Speech reception thresholds, both in quiet and in noise, and tone audiograms were measured for 14 normal ears (7 subjects) and 44 ears (22 subjects) with noise-induced hearing loss. Maximum hearing loss in the 4-6 kHz region equalled 40 to 90 dB (losses exceeded by 90% and 10%, respectively). Hearing loss for speech in quiet measured with respect to the median speech reception threshold for normal ears ranged from 1.8 dB to 13.4 dB. For speech in noise the numbers are 1.2 dB to 7.0 dB which means that the subjects with noise-induced hearing loss need a 1.2 to 7.0 dB higher signal-to-noise ratio than normal to understand sentences equally well. A hearing loss for speech of 1 dB corresponds to a decrease in sentence intelligibility of 15 to 20%. The relation between hearing handicap conceived as a reduced ability to understand speech and tone audiogram is discussed. The higher signal-to-noise ratio needed by people with noise-induced hearing loss to understand speech in noisy environments is shown to be due partly to the decreased bandwidth of their hearing caused by the noise dip.
Computer-Mediated Assessment of Intelligibility in Aphasia and Apraxia of Speech

PubMed Central

Haley, Katarina L.; Roth, Heidi; Grindstaff, Enetta; Jacks, Adam

2011-01-01

Background Previous work indicates that single word intelligibility tests developed for dysarthria are sensitive to segmental production errors in aphasic individuals with and without apraxia of speech. However, potential listener learning effects and difficulties adapting elicitation procedures to coexisting language impairments limit their applicability to left hemisphere stroke survivors. Aims The main purpose of this study was to examine basic psychometric properties for a new monosyllabic intelligibility test developed for individuals with aphasia and/or AOS. A related purpose was to examine clinical feasibility and potential to standardize a computer-mediated administration approach. Methods & Procedures A 600-item monosyllabic single word intelligibility test was constructed by assembling sets of phonetically similar words. Custom software was used to select 50 target words from this test in a pseudo-random fashion and to elicit and record production of these words by 23 speakers with aphasia and 20 neurologically healthy participants. To evaluate test-retest reliability, two identical sets of 50-word lists were elicited by requesting repetition after a live speaker model. To examine the effect of a different word set and auditory model, an additional set of 50 different words was elicited with a pre-recorded model. The recorded words were presented to normal-hearing listeners for identification via orthographic and multiple-choice response formats. To examine construct validity, production accuracy for each speaker was estimated via phonetic transcription and rating of overall articulation. Outcomes & Results Recording and listening tasks were completed in less than six minutes for all speakers and listeners. Aphasic speakers were significantly less intelligible than neurologically healthy speakers and displayed a wide range of intelligibility scores. Test-retest and inter-listener reliability estimates were strong. No significant difference was found in scores based on recordings from a live model versus a pre-recorded model, but some individual speakers favored the live model. Intelligibility test scores correlated highly with segmental accuracy derived from broad phonetic transcription of the same speech sample and a motor speech evaluation. Scores correlated moderately with rated articulation difficulty. Conclusions We describe a computerized, single-word intelligibility test that yields clinically feasible, reliable, and valid measures of segmental speech production in adults with aphasia. This tool can be used in clinical research to facilitate appropriate participant selection and to establish matching across comparison groups. For a majority of speakers, elicitation procedures can be standardized by using a pre-recorded auditory model for repetition. This assessment tool has potential utility for both clinical assessment and outcomes research. PMID:22215933
GRIN2A: an aptly named gene for speech dysfunction.

PubMed

Turner, Samantha J; Mayes, Angela K; Verhoeven, Andrea; Mandelstam, Simone A; Morgan, Angela T; Scheffer, Ingrid E

2015-02-10

To delineate the specific speech deficits in individuals with epilepsy-aphasia syndromes associated with mutations in the glutamate receptor subunit gene GRIN2A. We analyzed the speech phenotype associated with GRIN2A mutations in 11 individuals, aged 16 to 64 years, from 3 families. Standardized clinical speech assessments and perceptual analyses of conversational samples were conducted. Individuals showed a characteristic phenotype of dysarthria and dyspraxia with lifelong impact on speech intelligibility in some. Speech was typified by imprecise articulation (11/11, 100%), impaired pitch (monopitch 10/11, 91%) and prosody (stress errors 7/11, 64%), and hypernasality (7/11, 64%). Oral motor impairments and poor performance on maximum vowel duration (8/11, 73%) and repetition of monosyllables (10/11, 91%) and trisyllables (7/11, 64%) supported conversational speech findings. The speech phenotype was present in one individual who did not have seizures. Distinctive features of dysarthria and dyspraxia are found in individuals with GRIN2A mutations, often in the setting of epilepsy-aphasia syndromes; dysarthria has not been previously recognized in these disorders. Of note, the speech phenotype may occur in the absence of a seizure disorder, reinforcing an important role for GRIN2A in motor speech function. Our findings highlight the need for precise clinical speech assessment and intervention in this group. By understanding the mechanisms involved in GRIN2A disorders, targeted therapy may be designed to improve chronic lifelong deficits in intelligibility. © 2015 American Academy of Neurology.
Objective and Subjective Measures of Simultaneous vs Sequential Bilateral Cochlear Implants in Adults: A Randomized Clinical Trial.

PubMed

Kraaijenga, Véronique J C; Ramakers, Geerte G J; Smulders, Yvette E; van Zon, Alice; Stegeman, Inge; Smit, Adriana L; Stokroos, Robert J; Hendrice, Nadia; Free, Rolien H; Maat, Bert; Frijns, Johan H M; Briaire, Jeroen J; Mylanus, E A M; Huinck, Wendy J; Van Zanten, Gijsbert A; Grolman, Wilko

2017-09-01

To date, no randomized clinical trial on the comparison between simultaneous and sequential bilateral cochlear implants (BiCIs) has been performed. To investigate the hearing capabilities and the self-reported benefits of simultaneous BiCIs compared with those of sequential BiCIs. A multicenter randomized clinical trial was conducted between January 12, 2010, and September 2, 2012, at 5 tertiary referral centers among 40 participants eligible for BiCIs. Main inclusion criteria were postlingual severe to profound hearing loss, age 18 to 70 years, and a maximum duration of 10 years without hearing aid use in both ears. Data analysis was conducted from May 24 to June 12, 2016. The simultaneous BiCI group received 2 cochlear implants during 1 surgical procedure. The sequential BiCI group received 2 cochlear implants with an interval of 2 years between implants. First, the results 1 year after receiving simultaneous BiCIs were compared with the results 1 year after receiving sequential BiCIs. Second, the results of 3 years of follow-up for both groups were compared separately. The primary outcome measure was speech intelligibility in noise from straight ahead. Secondary outcome measures were speech intelligibility in noise from spatially separated sources, speech intelligibility in silence, localization capabilities, and self-reported benefits assessed with various hearing and quality of life questionnaires. Nineteen participants were randomized to receive simultaneous BiCIs (11 women and 8 men; median age, 52 years [interquartile range, 36-63 years]), and another 19 participants were randomized to undergo sequential BiCIs (8 women and 11 men; median age, 54 years [interquartile range, 43-64 years]). Three patients did not receive a second cochlear implant and were unavailable for follow-up. Comparable results were found 1 year after simultaneous or sequential BiCIs for speech intelligibility in noise from straight ahead (difference, 0.9 dB [95% CI, -3.1 to 4.4 dB]) and all secondary outcome measures except for localization with a 30° angle between loudspeakers (difference, -10% [95% CI, -20.1% to 0.0%]). In the sequential BiCI group, all participants performed significantly better after the BiCIs on speech intelligibility in noise from spatially separated sources and on all localization tests, which was consistent with most of the participants' self-reported hearing capabilities. Speech intelligibility-in-noise results improved in the simultaneous BiCI group up to 3 years following the BiCIs. This study shows comparable objective and subjective hearing results 1 year after receiving simultaneous BiCIs and sequential BiCIs with an interval of 2 years between implants. It also shows a significant benefit of sequential BiCIs over a unilateral cochlear implant. Until 3 years after receiving simultaneous BiCIs, speech intelligibility in noise significantly improved compared with previous years. trialregister.nl Identifier: NTR1722.
Computer-assisted CI fitting: Is the learning capacity of the intelligent agent FOX beneficial for speech understanding?

PubMed

Meeuws, Matthias; Pascoal, David; Bermejo, Iñigo; Artaso, Miguel; De Ceulaer, Geert; Govaerts, Paul J

2017-07-01

The software application FOX ('Fitting to Outcome eXpert') is an intelligent agent to assist in the programing of cochlear implant (CI) processors. The current version utilizes a mixture of deterministic and probabilistic logic which is able to improve over time through a learning effect. This study aimed at assessing whether this learning capacity yields measurable improvements in speech understanding. A retrospective study was performed on 25 consecutive CI recipients with a median CI use experience of 10 years who came for their annual CI follow-up fitting session. All subjects were assessed by means of speech audiometry with open set monosyllables at 40, 55, 70, and 85 dB SPL in quiet with their home MAP. Other psychoacoustic tests were executed depending on the audiologist's clinical judgment. The home MAP and the corresponding test results were entered into FOX. If FOX suggested to make MAP changes, they were implemented and another speech audiometry was performed with the new MAP. FOX suggested MAP changes in 21 subjects (84%). The within-subject comparison showed a significant median improvement of 10, 3, 1, and 7% at 40, 55, 70, and 85 dB SPL, respectively. All but two subjects showed an instantaneous improvement in their mean speech audiometric score. Persons with long-term CI use, who received a FOX-assisted CI fitting at least 6 months ago, display improved speech understanding after MAP modifications, as recommended by the current version of FOX. This can be explained only by intrinsic improvements in FOX's algorithms, as they have resulted from learning. This learning is an inherent feature of artificial intelligence and it may yield measurable benefit in speech understanding even in long-term CI recipients.
Functional outcome after total and subtotal glossectomy with free flap reconstruction.

PubMed

Yanai, Chie; Kikutani, Takesi; Adachi, Masatosi; Thoren, Hanna; Suzuki, Munekazu; Iizuka, Tateyuki

2008-07-01

The aim of this study was to evaluate postoperative oral functions of patients who had undergone total or subtotal (75%) glossectomy with preservation of the larynx for oral squamous cell carcinomas. Speech intelligibility and swallowing capacity of 17 patients who had been treated between 1992 and 2002 were scored and classified using standard protocols 6 to 36 months postoperatively. The outcomes were finally rated as good, acceptable, or poor. The 4-year disease-specific survival rate was 64%. Speech intelligibility and swallowing capacity were satisfactory (acceptable or good) in 82.3%. Only 3 patients were still dependent on tube feeding. Good speech perceptibility did not always go together with normal diet tolerance, however. Our satisfactory results are attributable to the use of large, voluminous soft tissue flaps for reconstruction, and to the instigation of postoperative swallowing and speech therapy on a routine basis and at an early juncture.

Exploring the Relationship Between Working Memory, Compressor Speed, and Background Noise Characteristics.

PubMed

Ohlenforst, Barbara; Souza, Pamela E; MacDonald, Ewen N

2016-01-01

Previous work has shown that individuals with lower working memory demonstrate reduced intelligibility for speech processed with fast-acting compression amplification. This relationship has been noted in fluctuating noise, but the extent of noise modulation that must be present to elicit such an effect is unknown. This study expanded on previous study by exploring the effect of background noise modulations in relation to compression speed and working memory ability, using a range of signal to noise ratios. Twenty-six older participants between ages 61 and 90 years were grouped by high or low working memory according to their performance on a reading span test. Speech intelligibility was measured for low-context sentences presented in background noise, where the noise varied in the extent of amplitude modulation. Simulated fast- or slow-acting compression amplification combined with individual frequency-gain shaping was applied to compensate for the individual's hearing loss. Better speech intelligibility scores were observed for participants with high working memory when fast compression was applied than when slow compression was applied. The low working memory group behaved in the opposite way and performed better under slow compression compared with fast compression. There was also a significant effect of the extent of amplitude modulation in the background noise, such that the magnitude of the score difference (fast versus slow compression) depended on the number of talkers in the background noise. The presented signal to noise ratios were not a significant factor on the measured intelligibility performance. In agreement with earlier research, high working memory allowed better speech intelligibility when fast compression was applied in modulated background noise. In the present experiment, that effect was present regardless of the extent of background noise modulation.
Speech and Language Deficits in Early-Treated Children with Galactosemia.

ERIC Educational Resources Information Center

Waisbren, Susan E.; And Others

1983-01-01

Intelligence and speech-language development of eight children (3.6 to 11.6 years old) with classic galactosemia were assessed by standardized tests. Each of the children had delays of early speech difficulties, and all but one had language disorders in at least one area. Available from: Journal of Pediatrics, C.V. Mosby Co., 11830 Westline…
Effects of Alphabet-Supplemented Speech on Brain Activity of Listeners: An fMRI Study

ERIC Educational Resources Information Center

Fercho, Kelene; Baugh, Lee A.; Hanson, Elizabeth K.

2015-01-01

Purpose: The purpose of this article was to examine the neural mechanisms associated with increases in speech intelligibility brought about through alphabet supplementation. Method: Neurotypical participants listened to dysarthric speech while watching an accompanying video of a hand pointing to the 1st letter spoken of each word on an alphabet…
Factors Affecting Acoustics and Speech Intelligibility in the Operating Room: Size Matters

PubMed Central

Bennett, Christopher L.; Horn, Danielle Bodzin; Dudaryk, Roman

2017-01-01

INTRODUCTION: Noise in health care settings has increased since 1960 and represents a significant source of dissatisfaction among staff and patients and risk to patient safety. Operating rooms (ORs) in which effective communication is crucial are particularly noisy. Speech intelligibility is impacted by noise, room architecture, and acoustics. For example, sound reverberation time (RT60) increases with room size, which can negatively impact intelligibility, while room objects are hypothesized to have the opposite effect. We explored these relationships by investigating room construction and acoustics of the surgical suites at our institution. METHODS: We studied our ORs during times of nonuse. Room dimensions were measured to calculate room volumes (VR). Room content was assessed by estimating size and assigning items into 5 volume categories to arrive at an adjusted room content volume (VC) metric. Psychoacoustic analyses were performed by playing sweep tones from a speaker and recording the impulse responses (ie, resulting sound fields) from 3 locations in each room. The recordings were used to calculate 6 psychoacoustic indices of intelligibility. Multiple linear regression was performed using VR and VC as predictor variables and each intelligibility index as an outcome variable. RESULTS: A total of 40 ORs were studied. The surgical suites were characterized by a large degree of construction and surface finish heterogeneity and varied in size from 71.2 to 196.4 m3 (average VR = 131.1 [34.2] m3). An insignificant correlation was observed between VR and VC (Pearson correlation = 0.223, P = .166). Multiple linear regression model fits and β coefficients for VR were highly significant for each of the intelligibility indices and were best for RT60 (R2 = 0.666, F(2, 37) = 39.9, P < .0001). For Dmax (maximum distance where there is <15% loss of consonant articulation), both VR and VC β coefficients were significant. For RT60 and Dmax, after controlling for VC, partial correlations were 0.825 (P < .0001) and 0.718 (P < .0001), respectively, while after controlling for VR, partial correlations were −0.322 (P = .169) and 0.381 (P < .05), respectively. CONCLUSIONS: Our results suggest that the size and contents of an OR can predict a range of psychoacoustic indices of speech intelligibility. Specifically, increasing OR size correlated with worse speech intelligibility, while increasing amounts of OR contents correlated with improved speech intelligibility. This study provides valuable descriptive data and a predictive method for identifying existing ORs that may benefit from acoustic modifiers (eg, sound absorption panels). Additionally, it suggests that room dimensions and projected clinical use should be considered during the design phase of OR suites to optimize acoustic performance. PMID:28525511
Ten-year follow-up of a consecutive series of children with multichannel cochlear implants.

PubMed

Uziel, Alain S; Sillon, Martine; Vieu, Adrienne; Artieres, Françoise; Piron, Jean-Pierre; Daures, Jean-Pierre; Mondain, Michel

2007-08-01

To assess a group of children who consecutively received implants more than 10 years after implantation with regard to speech perception, speech intelligibility, receptive language level, and academic/occupational status. A prospective longitudinal study. Pediatric referral center for cochlear implantation. Eighty-two prelingually deafened children received the Nucleus multichannel cochlear implant. Cochlear implantation with Cochlear Nucleus CI22 implant. The main outcome measures were open-set Phonetically Balanced Kindergarten word test, discrimination of sentences in noise, connective discourse tracking (CDT) using voice and telephone, speech intelligibility rating (SIR), vocabulary knowledge measured using the Peabody Picture Vocabulary Test (Revised), academic performance on French language, foreign language, and mathematics, and academic/occupational status. After 10 years of implant experience, 79 children (96%) reported that they always wear the device; 79% (65 of 82 children) could use the telephone. The mean scores were 72% for the Phonetically Balanced Kindergarten word test, 44% for word recognition in noise, 55.3 words per minute for the CDT, and 33 words per minute for the CDT via telephone. Thirty-three children (40%) developed speech intelligible to the average listener (SIR 5), and 22 (27%) developed speech intelligible to a listener with little experience of deaf person's speech (SIR 4). The measures of vocabulary showed that most (76%) of children who received implants scored below the median value of their normally hearing peers. The age at implantation was the most important factor that may influence the postimplant outcomes. Regarding educational/vocational status, 6 subjects attend universities, 3 already have a professional activity, 14 are currently at high school level, 32 are at junior high school level, 6 additional children are enrolled in a special unit for children with disability, and 3 children are still attending elementary schools. Seventeen are in further noncompulsory education studying a range of subjects at vocational level. This long-term report shows that many profoundly hearing-impaired children using cochlear implants can develop functional levels of speech perception and production, attain age-appropriate oral language, develop competency level in a language other than their primary language, and achieve satisfactory academic performance.
Perceptual evaluation and acoustic analysis of pneumatic artificial larynx.

PubMed

Xu, Jie Jie; Chen, Xi; Lu, Mei Ping; Qiao, Ming Zhe

2009-12-01

To investigate the perceptual and acoustic characteristics of the pneumatic artificial larynx (PAL) and evaluate its speech ability and clinical value. Prospective study. The study was conducted in the Voice Lab, Department of Otorhinolaryngology, The First Affiliated Hospital of Nanjing Medical University. Forty-six laryngectomy patients using the PAL were rated for intelligibility and fluency of speech. The voice signals of sustained vowel /a/ for 40 healthy controls and 42 successful patients using the PAL were measured by a computer system. The acoustic parameters and sound spectrographs were analyzed and compared between the two groups. Forty-two of 46 patients using the PAL (91.3%) acquired successful speech capability. The intelligibility scores of 42 successful PAL speakers ranged from 71 to 95 percent, and the intelligibility range of four unsuccessful speakers was 30 to 50 percent. The fluency was judged as good or excellent in 42 successful patients, and poor or fair in four unsuccessful patients. There was no significant difference in average fundamental frequency, maximum intensity, jitter, shimmer, and normalized noise energy (NNE) between 42 successful PAL speakers and 40 healthy controls, while the maximum phonation time (MPT) of PAL speakers was slightly lower than that of the controls. The sound spectrographs of the patients using the PAL approximated those of the healthy controls. The PAL has the advantage of a high percentage of successful vocal rehabilitation. PAL speech is fluent and intelligible. The acoustic characteristics of the PAL are similar to those of a normal voice.
Effects of Instantaneous Multiband Dynamic Compression on Speech Intelligibility

NASA Astrophysics Data System (ADS)

Herzke, Tobias; Hohmann, Volker

2005-12-01

The recruitment phenomenon, that is, the reduced dynamic range between threshold and uncomfortable level, is attributed to the loss of instantaneous dynamic compression on the basilar membrane. Despite this, hearing aids commonly use slow-acting dynamic compression for its compensation, because this was found to be the most successful strategy in terms of speech quality and intelligibility rehabilitation. Former attempts to use fast-acting compression gave ambiguous results, raising the question as to whether auditory-based recruitment compensation by instantaneous compression is in principle applicable in hearing aids. This study thus investigates instantaneous multiband dynamic compression based on an auditory filterbank. Instantaneous envelope compression is performed in each frequency band of a gammatone filterbank, which provides a combination of time and frequency resolution comparable to the normal healthy cochlea. The gain characteristics used for dynamic compression are deduced from categorical loudness scaling. In speech intelligibility tests, the instantaneous dynamic compression scheme was compared against a linear amplification scheme, which used the same filterbank for frequency analysis, but employed constant gain factors that restored the sound level for medium perceived loudness in each frequency band. In subjective comparisons, five of nine subjects preferred the linear amplification scheme and would not accept the instantaneous dynamic compression in hearing aids. Four of nine subjects did not perceive any quality differences. A sentence intelligibility test in noise (Oldenburg sentence test) showed little to no negative effects of the instantaneous dynamic compression, compared to linear amplification. A word intelligibility test in quiet (one-syllable rhyme test) showed that the subjects benefit from the larger amplification at low levels provided by instantaneous dynamic compression. Further analysis showed that the increase in intelligibility resulting from a gain provided by instantaneous compression is as high as from a gain provided by linear amplification. No negative effects of the distortions introduced by the instantaneous compression scheme in terms of speech recognition are observed.
Auditory and Non-Auditory Contributions for Unaided Speech Recognition in Noise as a Function of Hearing Aid Use

PubMed Central

Gieseler, Anja; Tahden, Maike A. S.; Thiel, Christiane M.; Wagener, Kirsten C.; Meis, Markus; Colonius, Hans

2017-01-01

Differences in understanding speech in noise among hearing-impaired individuals cannot be explained entirely by hearing thresholds alone, suggesting the contribution of other factors beyond standard auditory ones as derived from the audiogram. This paper reports two analyses addressing individual differences in the explanation of unaided speech-in-noise performance among n = 438 elderly hearing-impaired listeners (mean = 71.1 ± 5.8 years). The main analysis was designed to identify clinically relevant auditory and non-auditory measures for speech-in-noise prediction using auditory (audiogram, categorical loudness scaling) and cognitive tests (verbal-intelligence test, screening test of dementia), as well as questionnaires assessing various self-reported measures (health status, socio-economic status, and subjective hearing problems). Using stepwise linear regression analysis, 62% of the variance in unaided speech-in-noise performance was explained, with measures Pure-tone average (PTA), Age, and Verbal intelligence emerging as the three most important predictors. In the complementary analysis, those individuals with the same hearing loss profile were separated into hearing aid users (HAU) and non-users (NU), and were then compared regarding potential differences in the test measures and in explaining unaided speech-in-noise recognition. The groupwise comparisons revealed significant differences in auditory measures and self-reported subjective hearing problems, while no differences in the cognitive domain were found. Furthermore, groupwise regression analyses revealed that Verbal intelligence had a predictive value in both groups, whereas Age and PTA only emerged significant in the group of hearing aid NU. PMID:28270784
Use of a Deep Recurrent Neural Network to Reduce Wind Noise: Effects on Judged Speech Intelligibility and Sound Quality.

PubMed

Keshavarzi, Mahmoud; Goehring, Tobias; Zakis, Justin; Turner, Richard E; Moore, Brian C J

2018-01-01

Despite great advances in hearing-aid technology, users still experience problems with noise in windy environments. The potential benefits of using a deep recurrent neural network (RNN) for reducing wind noise were assessed. The RNN was trained using recordings of the output of the two microphones of a behind-the-ear hearing aid in response to male and female speech at various azimuths in the presence of noise produced by wind from various azimuths with a velocity of 3 m/s, using the "clean" speech as a reference. A paired-comparison procedure was used to compare all possible combinations of three conditions for subjective intelligibility and for sound quality or comfort. The conditions were unprocessed noisy speech, noisy speech processed using the RNN, and noisy speech that was high-pass filtered (which also reduced wind noise). Eighteen native English-speaking participants were tested, nine with normal hearing and nine with mild-to-moderate hearing impairment. Frequency-dependent linear amplification was provided for the latter. Processing using the RNN was significantly preferred over no processing by both subject groups for both subjective intelligibility and sound quality, although the magnitude of the preferences was small. High-pass filtering (HPF) was not significantly preferred over no processing. Although RNN was significantly preferred over HPF only for sound quality for the hearing-impaired participants, for the results as a whole, there was a preference for RNN over HPF. Overall, the results suggest that reduction of wind noise using an RNN is possible and might have beneficial effects when used in hearing aids.
Auditory and Non-Auditory Contributions for Unaided Speech Recognition in Noise as a Function of Hearing Aid Use.

PubMed

Gieseler, Anja; Tahden, Maike A S; Thiel, Christiane M; Wagener, Kirsten C; Meis, Markus; Colonius, Hans

2017-01-01

Differences in understanding speech in noise among hearing-impaired individuals cannot be explained entirely by hearing thresholds alone, suggesting the contribution of other factors beyond standard auditory ones as derived from the audiogram. This paper reports two analyses addressing individual differences in the explanation of unaided speech-in-noise performance among n = 438 elderly hearing-impaired listeners ( mean = 71.1 ± 5.8 years). The main analysis was designed to identify clinically relevant auditory and non-auditory measures for speech-in-noise prediction using auditory (audiogram, categorical loudness scaling) and cognitive tests (verbal-intelligence test, screening test of dementia), as well as questionnaires assessing various self-reported measures (health status, socio-economic status, and subjective hearing problems). Using stepwise linear regression analysis, 62% of the variance in unaided speech-in-noise performance was explained, with measures Pure-tone average (PTA), Age , and Verbal intelligence emerging as the three most important predictors. In the complementary analysis, those individuals with the same hearing loss profile were separated into hearing aid users (HAU) and non-users (NU), and were then compared regarding potential differences in the test measures and in explaining unaided speech-in-noise recognition. The groupwise comparisons revealed significant differences in auditory measures and self-reported subjective hearing problems, while no differences in the cognitive domain were found. Furthermore, groupwise regression analyses revealed that Verbal intelligence had a predictive value in both groups, whereas Age and PTA only emerged significant in the group of hearing aid NU.
Audiomotor Perceptual Training Enhances Speech Intelligibility in Background Noise.

PubMed

Whitton, Jonathon P; Hancock, Kenneth E; Shannon, Jeffrey M; Polley, Daniel B

2017-11-06

Sensory and motor skills can be improved with training, but learning is often restricted to practice stimuli. As an exception, training on closed-loop (CL) sensorimotor interfaces, such as action video games and musical instruments, can impart a broad spectrum of perceptual benefits. Here we ask whether computerized CL auditory training can enhance speech understanding in levels of background noise that approximate a crowded restaurant. Elderly hearing-impaired subjects trained for 8 weeks on a CL game that, like a musical instrument, challenged them to monitor subtle deviations between predicted and actual auditory feedback as they moved their fingertip through a virtual soundscape. We performed our study as a randomized, double-blind, placebo-controlled trial by training other subjects in an auditory working-memory (WM) task. Subjects in both groups improved at their respective auditory tasks and reported comparable expectations for improved speech processing, thereby controlling for placebo effects. Whereas speech intelligibility was unchanged after WM training, subjects in the CL training group could correctly identify 25% more words in spoken sentences or digit sequences presented in high levels of background noise. Numerically, CL audiomotor training provided more than three times the benefit of our subjects' hearing aids for speech processing in noisy listening conditions. Gains in speech intelligibility could be predicted from gameplay accuracy and baseline inhibitory control. However, benefits did not persist in the absence of continuing practice. These studies employ stringent clinical standards to demonstrate that perceptual learning on a computerized audio game can transfer to "real-world" communication challenges. Copyright © 2017 Elsevier Ltd. All rights reserved.
A Visual Cortical Network for Deriving Phonological Information from Intelligible Lip Movements.

PubMed

Hauswald, Anne; Lithari, Chrysa; Collignon, Olivier; Leonardelli, Elisa; Weisz, Nathan

2018-05-07

Successful lip-reading requires a mapping from visual to phonological information [1]. Recently, visual and motor cortices have been implicated in tracking lip movements (e.g., [2]). It remains unclear, however, whether visuo-phonological mapping occurs already at the level of the visual cortex-that is, whether this structure tracks the acoustic signal in a functionally relevant manner. To elucidate this, we investigated how the cortex tracks (i.e., entrains to) absent acoustic speech signals carried by silent lip movements. Crucially, we contrasted the entrainment to unheard forward (intelligible) and backward (unintelligible) acoustic speech. We observed that the visual cortex exhibited stronger entrainment to the unheard forward acoustic speech envelope compared to the unheard backward acoustic speech envelope. Supporting the notion of a visuo-phonological mapping process, this forward-backward difference of occipital entrainment was not present for actually observed lip movements. Importantly, the respective occipital region received more top-down input, especially from left premotor, primary motor, and somatosensory regions and, to a lesser extent, also from posterior temporal cortex. Strikingly, across participants, the extent of top-down modulation of the visual cortex stemming from these regions partially correlated with the strength of entrainment to absent acoustic forward speech envelope, but not to present forward lip movements. Our findings demonstrate that a distributed cortical network, including key dorsal stream auditory regions [3-5], influences how the visual cortex shows sensitivity to the intelligibility of speech while tracking silent lip movements. Copyright © 2018 The Authors. Published by Elsevier Ltd.. All rights reserved.
Speech and Swallowing in Parkinson’s Disease

PubMed Central

Tjaden, Kris

2009-01-01

Dysarthria and dysphagia occur frequently in Parkinson’s disease (PD). Reduced speech intelligibility is a significant functional limitation of dysarthria, and in the case of PD is likely related articulatory and phonatory impairment. Prosodically-based treatments show the most promise for addressing these deficits as well as for maximizing speech intelligibility. Communication-oriented strategies also may help to enhance mutual understanding between a speaker and listener. Dysphagia in PD can result in serious health issues, including aspiration pneumonia, malnutrition, and dehydration. Early identification of swallowing abnormalities is critical so as to minimize the impact of dysphagia on health status and quality of life. Feeding modifications, compensatory strategies, and therapeutic swallowing techniques all have a role in the management of dysphagia in PD. PMID:19946386
The level and nature of autistic intelligence III: Inspection time.

PubMed

Barbeau, Elise B; Soulières, Isabelle; Dawson, Michelle; Zeffiro, Thomas A; Mottron, Laurent

2013-02-01

Across the autism spectrum, level of intelligence is highly dependent on the psychometric instrument used for assessment, and there are conflicting views concerning which measures best estimate autistic cognitive abilities. Inspection time is a processing speed measure associated with general intelligence in typical individuals. We therefore investigated autism spectrum performance on inspection time in relation to two different general intelligence tests. Autism spectrum individuals were divided into autistic and Asperger subgroups according to speech development history. Compared to a typical control group, mean inspection time for the autistic subgroup but not the Asperger subgroup was significantly shorter (by 31%). However, the shorter mean autistic inspection time was evident only when groups were matched on Wechsler IQ and disappeared when they were matched using Raven's Progressive Matrices. When autism spectrum abilities are compared to typical abilities, results may be influenced by speech development history as well as by the instrument used for intelligence matching. 2013 APA, all rights reserved
GRIN2A

PubMed Central

Turner, Samantha J.; Mayes, Angela K.; Verhoeven, Andrea; Mandelstam, Simone A.; Morgan, Angela T.

2015-01-01

Objective: To delineate the specific speech deficits in individuals with epilepsy-aphasia syndromes associated with mutations in the glutamate receptor subunit gene GRIN2A. Methods: We analyzed the speech phenotype associated with GRIN2A mutations in 11 individuals, aged 16 to 64 years, from 3 families. Standardized clinical speech assessments and perceptual analyses of conversational samples were conducted. Results: Individuals showed a characteristic phenotype of dysarthria and dyspraxia with lifelong impact on speech intelligibility in some. Speech was typified by imprecise articulation (11/11, 100%), impaired pitch (monopitch 10/11, 91%) and prosody (stress errors 7/11, 64%), and hypernasality (7/11, 64%). Oral motor impairments and poor performance on maximum vowel duration (8/11, 73%) and repetition of monosyllables (10/11, 91%) and trisyllables (7/11, 64%) supported conversational speech findings. The speech phenotype was present in one individual who did not have seizures. Conclusions: Distinctive features of dysarthria and dyspraxia are found in individuals with GRIN2A mutations, often in the setting of epilepsy-aphasia syndromes; dysarthria has not been previously recognized in these disorders. Of note, the speech phenotype may occur in the absence of a seizure disorder, reinforcing an important role for GRIN2A in motor speech function. Our findings highlight the need for precise clinical speech assessment and intervention in this group. By understanding the mechanisms involved in GRIN2A disorders, targeted therapy may be designed to improve chronic lifelong deficits in intelligibility. PMID:25596506
Accent, Intelligibility, and the Role of the Listener: Perceptions of English-Accented German by Native German Speakers

ERIC Educational Resources Information Center

Hayes-Harb, Rachel; Watzinger-Tharp, Johanna

2012-01-01

We explore the relationship between accentedness and intelligibility, and investigate how listeners' beliefs about nonnative speech interact with their accentedness and intelligibility judgments. Native German speakers and native English learners of German produced German sentences, which were presented to 12 native German speakers in accentedness…
Agility - The Danish Way (Briefing Charts)

DTIC Science & Technology

2014-06-01

unclassified b. ABSTRACT unclassified c. THIS PAGE unclassified Standard Form 298 (Rev. 8-98) Prescribed by ANSI Std Z39-18 ‘A’ PLAN… • Theory ...Social Military Economic Infrastructure Assessment Planning Execution Knowledge Base PMESII SAS 050 C2 Model – Sensemaking (INTELLIGENCE) C2...Warfighting COMMANDER Intelligence manages the sensemaking (ex SA/SU) – it is the make or break for battlespace agility. How we ‘observe’ and
Neural Oscillations Carry Speech Rhythm through to Comprehension

PubMed Central

Peelle, Jonathan E.; Davis, Matthew H.

2012-01-01

A key feature of speech is the quasi-regular rhythmic information contained in its slow amplitude modulations. In this article we review the information conveyed by speech rhythm, and the role of ongoing brain oscillations in listeners’ processing of this content. Our starting point is the fact that speech is inherently temporal, and that rhythmic information conveyed by the amplitude envelope contains important markers for place and manner of articulation, segmental information, and speech rate. Behavioral studies demonstrate that amplitude envelope information is relied upon by listeners and plays a key role in speech intelligibility. Extending behavioral findings, data from neuroimaging – particularly electroencephalography (EEG) and magnetoencephalography (MEG) – point to phase locking by ongoing cortical oscillations to low-frequency information (~4–8 Hz) in the speech envelope. This phase modulation effectively encodes a prediction of when important events (such as stressed syllables) are likely to occur, and acts to increase sensitivity to these relevant acoustic cues. We suggest a framework through which such neural entrainment to speech rhythm can explain effects of speech rate on word and segment perception (i.e., that the perception of phonemes and words in connected speech is influenced by preceding speech rate). Neuroanatomically, acoustic amplitude modulations are processed largely bilaterally in auditory cortex, with intelligible speech resulting in differential recruitment of left-hemisphere regions. Notable among these is lateral anterior temporal cortex, which we propose functions in a domain-general fashion to support ongoing memory and integration of meaningful input. Together, the reviewed evidence suggests that low-frequency oscillations in the acoustic speech signal form the foundation of a rhythmic hierarchy supporting spoken language, mirrored by phase-locked oscillations in the human brain. PMID:22973251
ACSB: A minimum performance assessment

NASA Technical Reports Server (NTRS)

Jones, Lloyd Thomas; Kissick, William A.

1988-01-01

Amplitude companded sideband (ACSB) is a new modulation technique which uses a much smaller channel width than does conventional frequency modulation (FM). Among the requirements of a mobile communications system is adequate speech intelligibility. This paper explores this aspect of minimum required performance. First, the basic principles of ACSB are described, with emphasis on those features that affect speech quality. Second, the appropriate performance measures for ACSB are reviewed. Third, a subjective voice quality scoring method is used to determine the values of the performance measures that equate to the minimum level of intelligibility. It is assumed that the intelligibility of an FM system operating at 12 dB SINAD represents that minimum. It was determined that ACSB operating at 12 dB SINAD with an audio-to-pilot ratio of 10 dB provides approximately the same intelligibility as FM operating at 12 dB SINAD.
Stimulation of the pedunculopontine nucleus area in Parkinson's disease: effects on speech and intelligibility.

PubMed

Pinto, Serge; Ferraye, Murielle; Espesser, Robert; Fraix, Valérie; Maillet, Audrey; Guirchoum, Jennifer; Layani-Zemour, Deborah; Ghio, Alain; Chabardès, Stéphan; Pollak, Pierre; Debû, Bettina

2014-10-01

Improvement of gait disorders following pedunculopontine nucleus area stimulation in patients with Parkinson's disease has previously been reported and led us to propose this surgical treatment to patients who progressively developed severe gait disorders and freezing despite optimal dopaminergic drug treatment and subthalamic nucleus stimulation. The outcome of our prospective study on the first six patients was somewhat mitigated, as freezing of gait and falls related to freezing were improved by low frequency electrical stimulation of the pedunculopontine nucleus area in some, but not all, patients. Here, we report the speech data prospectively collected in these patients with Parkinson's disease. Indeed, because subthalamic nucleus surgery may lead to speech impairment and a worsening of dysarthria in some patients with Parkinson's disease, we felt it was important to precisely examine any possible modulations of speech for a novel target for deep brain stimulation. Our results suggested a trend towards speech degradation related to the pedunculopontine nucleus area surgery (off stimulation) for aero-phonatory control (maximum phonation time), phono-articulatory coordination (oral diadochokinesis) and speech intelligibility. Possibly, the observed speech degradation may also be linked to the clinical characteristics of the group of patients. The influence of pedunculopontine nucleus area stimulation per se was more complex, depending on the nature of the task: it had a deleterious effect on maximum phonation time and oral diadochokinesis, and mixed effects on speech intelligibility. Whereas levodopa intake and subthalamic nucleus stimulation alone had no and positive effects on speech dimensions, respectively, a negative interaction between the two treatments was observed both before and after pedunculopontine nucleus area surgery. This combination effect did not seem to be modulated by pedunculopontine nucleus area stimulation. Although limited in our group of patients, speech impairment following pedunculopontine nucleus area stimulation is a possible outcome that should be considered before undertaking such surgery. Deleterious effects could be dependent on electrode insertion in this brainstem structure, more than on current spread to nearby structures involved in speech control. The effect of deep brain stimulation on speech in patients with Parkinson's disease remains a challenging and exploratory research area. © The Author (2014). Published by Oxford University Press on behalf of the Guarantors of Brain. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

Acceptable range of speech level in noisy sound fields for young adults and elderly persons.

PubMed

Sato, Hayato; Morimoto, Masayuki; Ota, Ryo

2011-09-01

The acceptable range of speech level as a function of background noise level was investigated on the basis of word intelligibility scores and listening difficulty ratings. In the present study, the acceptable range is defined as the range that maximizes word intelligibility scores and simultaneously does not cause a significant increase in listening difficulty ratings from the minimum ratings. Listening tests with young adult and elderly listeners demonstrated the following. (1) The acceptable range of speech level for elderly listeners overlapped that for young listeners. (2) The lower limit of the acceptable speech level for both young and elderly listeners was 65 dB (A-weighted) for noise levels of 40 and 45 dB (A-weighted), a level with a speech-to-noise ratio of +15 dB for noise levels of 50 and 55 dB, and a level with a speech-to-noise ratio of +10 dB for noise levels from 60 to 70 dB. (3) The upper limit of the acceptable speech level for both young and elderly listeners was 80 dB for noise levels from 40 to 55 dB and 85 dB or above for noise levels from 55 to 70 dB. © 2011 Acoustical Society of America
The Performance of Preschoolers with Speech/Language Disorders on the McCarthy Scales of Children's Abilities.

ERIC Educational Resources Information Center

Morgan, Robert L.; And Others

1992-01-01

Administered McCarthy Scales of Children's Abilities to preschool children of normal intelligence with (n=25) and without (n=25) speech/language disorders. Speech/language disorders group had significantly lower scores on all scales except Motor; showed difficulty in short-term auditory memory skills but not in visual memory skills; and had…
Receptive Vocabulary, Cognitive Flexibility, and Inhibitory Control Differentially Predict Older and Younger Adults' Success Perceiving Speech by Talkers with Dysarthria

ERIC Educational Resources Information Center

Ingvalson, Erin M.; Lansford, Kaitlin L.; Fedorova, Valeriya; Fernandez, Gabriel

2017-01-01

Purpose: Previous research has demonstrated equivocal findings related to the effect of listener age on intelligibility ratings of dysarthric speech. The aim of the present study was to investigate the mechanisms that support younger and older adults' perception of speech by talkers with dysarthria. Method: Younger and older adults identified…
Impact of Clear, Loud, and Slow Speech on Scaled Intelligibility and Speech Severity in Parkinson's Disease and Multiple Sclerosis

ERIC Educational Resources Information Center

Tjaden, Kris; Sussman, Joan E.; Wilding, Gregory E.

2014-01-01

Purpose: The perceptual consequences of rate reduction, increased vocal intensity, and clear speech were studied in speakers with multiple sclerosis (MS), Parkinson's disease (PD), and healthy controls. Method: Seventy-eight speakers read sentences in habitual, clear, loud, and slow conditions. Sentences were equated for peak amplitude and…
The Contribution of Matched Envelope Dynamic Range to the Binaural Benefits in Simulated Bilateral Electric Hearing

ERIC Educational Resources Information Center

Chen, Fei; Wong, Lena L. N.; Qiu, Jianxin; Liu, Yehai; Azimi, Behnam; Hu, Yi

2013-01-01

Purpose: This study examined the effects of envelope dynamic-range mismatch on the intelligibility of Mandarin speech in noise by simulated bilateral electric hearing. Method: Noise-vocoded Mandarin speech, corrupted by speech-shaped noise at 5 and 0 dB signal-to-noise ratios, was presented unilaterally or bilaterally to 10 normal-hearing…
Implementation of the Intelligent Voice System for Kazakh

NASA Astrophysics Data System (ADS)

Yessenbayev, Zh; Saparkhojayev, N.; Tibeyev, T.

2014-04-01

Modern speech technologies are highly advanced and widely used in day-to-day applications. However, this is mostly concerned with the languages of well-developed countries such as English, German, Japan, Russian, etc. As for Kazakh, the situation is less prominent and research in this field is only starting to evolve. In this research and application-oriented project, we introduce an intelligent voice system for the fast deployment of call-centers and information desks supporting Kazakh speech. The demand on such a system is obvious if the country's large size and small population is considered. The landline and cell phones become the only means of communication for the distant villages and suburbs. The system features Kazakh speech recognition and synthesis modules as well as a web-GUI for efficient dialog management. For speech recognition we use CMU Sphinx engine and for speech synthesis- MaryTTS. The web-GUI is implemented in Java enabling operators to quickly create and manage the dialogs in user-friendly graphical environment. The call routines are handled by Asterisk PBX and JBoss Application Server. The system supports such technologies and protocols as VoIP, VoiceXML, FastAGI, Java SpeechAPI and J2EE. For the speech recognition experiments we compiled and used the first Kazakh speech corpus with the utterances from 169 native speakers. The performance of the speech recognizer is 4.1% WER on isolated word recognition and 6.9% WER on clean continuous speech recognition tasks. The speech synthesis experiments include the training of male and female voices.
Australian children with cleft palate achieve age-appropriate speech by 5 years of age.

PubMed

Chacon, Antonia; Parkin, Melissa; Broome, Kate; Purcell, Alison

2017-12-01

Children with cleft palate demonstrate atypical speech sound development, which can influence their intelligibility, literacy and learning. There is limited documentation regarding how speech sound errors change over time in cleft palate speech and the effect that these errors have upon mono-versus polysyllabic word production. The objective of this study was to examine the phonetic and phonological speech skills of children with cleft palate at ages 3 and 5. A cross-sectional observational design was used. Eligible participants were aged 3 or 5 years with a repaired cleft palate. The Diagnostic Evaluation of Articulation and Phonology (DEAP) Articulation subtest and a non-standardised list of mono- and polysyllabic words were administered once for each child. The Profile of Phonology (PROPH) was used to analyse each child's speech. N = 51 children with cleft palate participated in the study. Three-year-old children with cleft palate produced significantly more speech errors than their typically-developing peers, but no difference was apparent at 5 years. The 5-year-olds demonstrated greater phonetic and phonological accuracy than the 3-year-old children. Polysyllabic words were more affected by errors than monosyllables in the 3-year-old group only. Children with cleft palate are prone to phonetic and phonological speech errors in their preschool years. Most of these speech errors approximate typically-developing children by 5 years. At 3 years, word shape has an influence upon phonological speech accuracy. Speech pathology intervention is indicated to support the intelligibility of these children from their earliest stages of development. Copyright © 2017 Elsevier B.V. All rights reserved.
Speech deterioration in amyotrophic lateral sclerosis (ALS) after manifestation of bulbar symptoms.

PubMed

Makkonen, Tanja; Ruottinen, Hanna; Puhto, Riitta; Helminen, Mika; Palmio, Johanna

2018-03-01

The symptoms and their progression in amyotrophic lateral sclerosis (ALS) are typically studied after the diagnosis has been confirmed. However, many people with ALS already have severe dysarthria and loss of adequate speech at the time of diagnosis. Speech-and-language therapy interventions should be targeted timely based on communicative need in ALS. To investigate how long natural speech will remain functional and to identify the changes in the speech of persons with ALS. Altogether 30 consecutive participants were studied and divided into two groups based on the initial type of ALS, bulbar or spinal. Their speech disorder was evaluated on severity, articulation rate and intelligibility during the 2-year follow-up. The ability to speak deteriorated to poor and necessitated augmentative and alternative communication (AAC) methods with 60% of the participants. Their speech remained adequate on average for 18 months from the first bulbar symptom. Severity, articulation rate and intelligibility declined with nearly all participants during the study. To begin with speech deteriorated more in the bulbar group than in the spinal group and the difference remained during the whole follow-up with some exceptions. The onset of bulbar symptoms indicated the time to loss of speech better than when assessed from ALS diagnosis or the first speech therapy evaluation. In clinical work, it is important to take the initial type of ALS into consideration when determining the urgency of AAC measures as people with bulbar-onset ALS are more susceptible to delayed evaluation and AAC intervention. © 2017 Royal College of Speech and Language Therapists.
[Intermodal timing cues for audio-visual speech recognition].

PubMed

Hashimoto, Masahiro; Kumashiro, Masaharu

2004-06-01

The purpose of this study was to investigate the limitations of lip-reading advantages for Japanese young adults by desynchronizing visual and auditory information in speech. In the experiment, audio-visual speech stimuli were presented under the six test conditions: audio-alone, and audio-visually with either 0, 60, 120, 240 or 480 ms of audio delay. The stimuli were the video recordings of a face of a female Japanese speaking long and short Japanese sentences. The intelligibility of the audio-visual stimuli was measured as a function of audio delays in sixteen untrained young subjects. Speech intelligibility under the audio-delay condition of less than 120 ms was significantly better than that under the audio-alone condition. On the other hand, the delay of 120 ms corresponded to the mean mora duration measured for the audio stimuli. The results implied that audio delays of up to 120 ms would not disrupt lip-reading advantage, because visual and auditory information in speech seemed to be integrated on a syllabic time scale. Potential applications of this research include noisy workplace in which a worker must extract relevant speech from all the other competing noises.
Long-term impact of tongue reduction on speech intelligibility, articulation and oromyofunctional behaviour in a child with Beckwith-Wiedemann syndrome.

PubMed

Van Lierde, K M; Mortier, G; Huysman, E; Vermeersch, H

2010-03-01

The purpose of the present case study was to determine the long-term impact of partial glossectomy (using the keyhole technique) on overall speech intelligibility and articulation in a Dutch-speaking child with Beckwith-Wiedemann syndrome (BWS). Furthermore the present study is meant as a contribution to the further delineation of the phonation, resonance, articulation and language characteristics and oral behaviour in a child with BWS. Detailed information on the speech and language characteristics of children with BWS may lead to better guidance of pediatric management programs. The child's speech was assessed 9 years after partial glossectomy with regard to ENT characteristics, overall intelligibility (perceptual consensus evaluation), articulation (phonetic and phonological errors), voice (videostroboscopy, vocal quality), resonance (perceptual, nasometric assessment), language (expressive and receptive) and oral behaviour. A class III malocclusion, an anterior open bite, diastema, overangulation of lower incisors and an enlarged but normal symmetric shaped tongue were present. The overall speech intelligibility improved from severely impaired (presurgical) to slightly impaired (5 months post-glossectomy) to normal (9 years postoperative). Comparative phonetic inventory showed a remarkable improvement of articulation. Nine years post-glossectomy three types of distortions seemed to predominate: a rhotacism and sigmatism and the substitution of the alveolar /z/. Oral behaviour, vocal characteristics and resonance were normal, but problems with expressive syntactic abilities were present. The long-term impact of partial glossectomy, using the keyhole technique (preserving the vascularity and the nervous input of the remaining intrinsic tongue muscles), on speech intelligibility, articulation, and oral behaviour in this Dutch-speaking child with congenital macroglossia can be regarded as successful. It is not clear how these expressive syntactical problems demonstrated in this child can be explained. Certainly they are not part of a more general developmental delay, hearing problems or cognitive malfunctioning. To what extent the presence of expressive syntactical problems is a possible aspect of the phenotypic spectrum of children with BWS is subject for further research. Multiple variables, both known and unknown can affect the long-term outcome after partial glossectomy in a child with BWS. The timing and type of the surgical technique, hearing and cognitive functioning are known variables in this study. But variables such as children's motivation, the contribution of the motor-oriented speech therapy, the parental articulation input and stimulation and other family, school and community factors are unknown and are all factors which can influence speech outcome after partial glossectomy. Detailed analyses in a greater number of subjects with BWS may help further illustrate the long-term impact of partial glossectomy. Copyright 2009 Elsevier Ireland Ltd. All rights reserved.
Naval Computer-Based Instruction: Cost, Implementation and Effectiveness Issues.

DTIC Science & Technology

1988-03-01

logical follow on to MITIPAC and are an attempt to use some artificial intelligence (AI) techniques with computer-based training. A good intelligent ...principles of steam plant operation and maintenance. Steamer was written in LISP on a LISP machine in an attempt to use artificial intelligence . "What... Artificial Intelligence and Speech Technology", Electronic Learning, September 1987. Montague, William. E., code 5, Navy Personnel Research and
Cochlea-scaled spectral entropy predicts rate-invariant intelligibility of temporally distorted sentences.

PubMed

Stilp, Christian E; Kiefte, Michael; Alexander, Joshua M; Kluender, Keith R

2010-10-01

Some evidence, mostly drawn from experiments using only a single moderate rate of speech, suggests that low-frequency amplitude modulations may be particularly important for intelligibility. Here, two experiments investigated intelligibility of temporally distorted sentences across a wide range of simulated speaking rates, and two metrics were used to predict results. Sentence intelligibility was assessed when successive segments of fixed duration were temporally reversed (exp. 1), and when sentences were processed through four third-octave-band filters, the outputs of which were desynchronized (exp. 2). For both experiments, intelligibility decreased with increasing distortion. However, in exp. 2, intelligibility recovered modestly with longer desynchronization. Across conditions, performances measured as a function of proportion of utterance distorted converged to a common function. Estimates of intelligibility derived from modulation transfer functions predict a substantial proportion of the variance in listeners' responses in exp. 1, but fail to predict performance in exp. 2. By contrast, a metric of potential information, quantified as relative dissimilarity (change) between successive cochlear-scaled spectra, is introduced. This metric reliably predicts listeners' intelligibility across the full range of speaking rates in both experiments. Results support an information-theoretic approach to speech perception and the significance of spectral change rather than physical units of time.
Development of a German reading span test with dual task design for application in cognitive hearing research.

PubMed

Carroll, Rebecca; Meis, Markus; Schulte, Michael; Vormann, Matthias; Kießling, Jürgen; Meister, Hartmut

2015-02-01

To report the development of a standardized German version of a reading span test (RST) with a dual task design. Special attention was paid to psycholinguistic control of the test items and time-sensitive scoring. We aim to establish our RST version to use for determining an individual's working memory in the framework of hearing research in German contexts. RST stimuli were controlled and pretested for psycholinguistic factors. The RST task was to read sentences, quickly determine their plausibility, and later recall certain words to determine a listener's individual reading span. RST results were correlated with outcomes of additional sentence-in-noise tests measured in an aided and an unaided listening condition, each at two reception thresholds. Item plausibility was pre-determined by 28 native German participants. An additional 62 listeners (45-86 years, M = 69.8) with mild-to-moderate hearing loss were tested for speech intelligibility and reading span in a multicenter study. The reading span test significantly correlated with speech intelligibility at both speech reception thresholds in the aided listening condition. Our German RST is standardized with respect to psycholinguistic construction principles of the stimuli, and is a cognitive correlate of intelligibility in a German matrix speech-in-noise test.
The Hypothesis of Apraxia of Speech in Children with Autism Spectrum Disorder

PubMed Central

Shriberg, Lawrence D.; Paul, Rhea; Black, Lois M.; van Santen, Jan P.

2010-01-01

In a sample of 46 children aged 4 to 7 years with Autism Spectrum Disorder (ASD) and intelligible speech, there was no statistical support for the hypothesis of concomitant Childhood Apraxia of Speech (CAS). Perceptual and acoustic measures of participants’ speech, prosody, and voice were compared with data from 40 typically-developing children, 13 preschool children with Speech Delay, and 15 participants aged 5 to 49 years with CAS in neurogenetic disorders. Speech Delay and Speech Errors, respectively, were modestly and substantially more prevalent in participants with ASD than reported population estimates. Double dissociations in speech, prosody, and voice impairments in ASD were interpreted as consistent with a speech attunement framework, rather than with the motor speech impairments that define CAS. Key Words: apraxia, dyspraxia, motor speech disorder, speech sound disorder PMID:20972615
Three Trailblazing Technologies for Schools.

ERIC Educational Resources Information Center

McGinty, Tony

1987-01-01

Provides an overview of the capabilities and potential educational applications of CD-ROM (compact disk read-only memory), artificial intelligence, and speech technology. Highlights include reference materials on CD-ROM; current developments in CD-I (compact disk interactive); synthesized and digital speech for microcomputers, including specific…
Environment-specific noise suppression for improved speech intelligibility by cochlear implant users.

PubMed

Hu, Yi; Loizou, Philipos C

2010-06-01

Attempts to develop noise-suppression algorithms that can significantly improve speech intelligibility in noise by cochlear implant (CI) users have met with limited success. This is partly because algorithms were sought that would work equally well in all listening situations. Accomplishing this has been quite challenging given the variability in the temporal/spectral characteristics of real-world maskers. A different approach is taken in the present study focused on the development of environment-specific noise suppression algorithms. The proposed algorithm selects a subset of the envelope amplitudes for stimulation based on the signal-to-noise ratio (SNR) of each channel. Binary classifiers, trained using data collected from a particular noisy environment, are first used to classify the mixture envelopes of each channel as either target-dominated (SNR>or=0 dB) or masker-dominated (SNR<0 dB). Only target-dominated channels are subsequently selected for stimulation. Results with CI listeners indicated substantial improvements (by nearly 44 percentage points at 5 dB SNR) in intelligibility with the proposed algorithm when tested with sentences embedded in three real-world maskers. The present study demonstrated that the environment-specific approach to noise reduction has the potential to restore speech intelligibility in noise to a level near to that attained in quiet.
Inferior frontal sensitivity to common speech sounds is amplified by increasing word intelligibility.

PubMed

Vaden, Kenneth I; Kuchinsky, Stefanie E; Keren, Noam I; Harris, Kelly C; Ahlstrom, Jayne B; Dubno, Judy R; Eckert, Mark A

2011-11-01

The left inferior frontal gyrus (LIFG) exhibits increased responsiveness when people listen to words composed of speech sounds that frequently co-occur in the English language (Vaden, Piquado, & Hickok, 2011), termed high phonotactic frequency (Vitevitch & Luce, 1998). The current experiment aimed to further characterize the relation of phonotactic frequency to LIFG activity by manipulating word intelligibility in participants of varying age. Thirty six native English speakers, 19-79 years old (mean=50.5, sd=21.0) indicated with a button press whether they recognized 120 binaurally presented consonant-vowel-consonant words during a sparse sampling fMRI experiment (TR=8 s). Word intelligibility was manipulated by low-pass filtering (cutoff frequencies of 400 Hz, 1000 Hz, 1600 Hz, and 3150 Hz). Group analyses revealed a significant positive correlation between phonotactic frequency and LIFG activity, which was unaffected by age and hearing thresholds. A region of interest analysis revealed that the relation between phonotactic frequency and LIFG activity was significantly strengthened for the most intelligible words (low-pass cutoff at 3150 Hz). These results suggest that the responsiveness of the left inferior frontal cortex to phonotactic frequency reflects the downstream impact of word recognition rather than support of word recognition, at least when there are no speech production demands. Published by Elsevier Ltd.
Use of listening strategies for the speech of individuals with dysarthria and cerebral palsy.

PubMed

Hustad, Katherine C; Dardis, Caitlin M; Kramper, Amy J

2011-03-01

This study examined listeners' endorsement of cognitive, linguistic, segmental, and suprasegmental strategies employed when listening to speakers with dysarthria. The study also examined whether strategy endorsement differed between listeners who earned the highest and lowest intelligibility scores. Speakers were eight individuals with dysarthria and cerebral palsy. Listeners were 80 individuals who transcribed speech stimuli and rated their use of each of 24 listening strategies on a 4-point scale. Results showed that cognitive and linguistic strategies were most highly endorsed. Use of listening strategies did not differ between listeners with the highest and lowest intelligibility scores. Results suggest that there may be a core of strategies common to listeners of speakers with dysarthria that may be supplemented by additional strategies, based on characteristics of the speaker and speech signal.
Acoustic characteristics of voice after severe traumatic brain injury.

PubMed

McHenry, M

2000-07-01

To describe the acoustic characteristics of voice in individuals with motor speech disorders after traumatic brain injury (TBI). Prospective study of 100 individuals with TBI based on consecutive referrals for motor speech evaluations. Subjects were audio tape-recorded while producing sustained vowels and single word and sentence intelligibility tests. Laryngeal airway resistance was estimated, and voice quality was rated perceptually. None of the subjects evidenced vocal parameters within normal limits. The most frequently occurring abnormal parameter across subjects was amplitude perturbation, followed by voice turbulence index. Twenty-three percent of subjects evidenced deviation in all five parameters measured. The perceptual ratings of breathiness were significantly correlated with both the amplitude perturbation quotient and the noise-to-harmonics ratio. Vocal quality deviation is common in motor speech disorders after TBI and may impact intelligibility.
Sleep Disrupts High-Level Speech Parsing Despite Significant Basic Auditory Processing.

PubMed

Makov, Shiri; Sharon, Omer; Ding, Nai; Ben-Shachar, Michal; Nir, Yuval; Zion Golumbic, Elana

2017-08-09

The extent to which the sleeping brain processes sensory information remains unclear. This is particularly true for continuous and complex stimuli such as speech, in which information is organized into hierarchically embedded structures. Recently, novel metrics for assessing the neural representation of continuous speech have been developed using noninvasive brain recordings that have thus far only been tested during wakefulness. Here we investigated, for the first time, the sleeping brain's capacity to process continuous speech at different hierarchical levels using a newly developed Concurrent Hierarchical Tracking (CHT) approach that allows monitoring the neural representation and processing-depth of continuous speech online. Speech sequences were compiled with syllables, words, phrases, and sentences occurring at fixed time intervals such that different linguistic levels correspond to distinct frequencies. This enabled us to distinguish their neural signatures in brain activity. We compared the neural tracking of intelligible versus unintelligible (scrambled and foreign) speech across states of wakefulness and sleep using high-density EEG in humans. We found that neural tracking of stimulus acoustics was comparable across wakefulness and sleep and similar across all conditions regardless of speech intelligibility. In contrast, neural tracking of higher-order linguistic constructs (words, phrases, and sentences) was only observed for intelligible speech during wakefulness and could not be detected at all during nonrapid eye movement or rapid eye movement sleep. These results suggest that, whereas low-level auditory processing is relatively preserved during sleep, higher-level hierarchical linguistic parsing is severely disrupted, thereby revealing the capacity and limits of language processing during sleep. SIGNIFICANCE STATEMENT Despite the persistence of some sensory processing during sleep, it is unclear whether high-level cognitive processes such as speech parsing are also preserved. We used a novel approach for studying the depth of speech processing across wakefulness and sleep while tracking neuronal activity with EEG. We found that responses to the auditory sound stream remained intact; however, the sleeping brain did not show signs of hierarchical parsing of the continuous stream of syllables into words, phrases, and sentences. The results suggest that sleep imposes a functional barrier between basic sensory processing and high-level cognitive processing. This paradigm also holds promise for studying residual cognitive abilities in a wide array of unresponsive states. Copyright © 2017 the authors 0270-6474/17/377772-10$15.00/0.

Validation of the Intelligibility in Context Scale for Jamaican Creole-Speaking Preschoolers.

PubMed

Washington, Karla N; McDonald, Megan M; McLeod, Sharynne; Crowe, Kathryn; Devonish, Hubert

2017-08-15

To describe validation of the Intelligibility in Context Scale (ICS; McLeod, Harrison, & McCormack, 2012a) and ICS-Jamaican Creole (ICS-JC; McLeod, Harrison, & McCormack, 2012b) in a sample of typically developing 3- to 6-year-old Jamaicans. One-hundred and forty-five preschooler-parent dyads participated in the study. Parents completed the 7-item ICS (n = 145) and ICS-JC (n = 98) to rate children's speech intelligibility (5-point scale) across communication partners (parents, immediate family, extended family, friends, acquaintances, strangers). Preschoolers completed the Diagnostic Evaluation of Articulation and Phonology (DEAP; Dodd, Hua, Crosbie, Holm, & Ozanne, 2006) in English and Jamaican Creole to establish speech-sound competency. For this sample, we examined validity and reliability (interrater, test-rest, internal consistency) evidence using measures of speech-sound production: (a) percentage of consonants correct, (b) percentage of vowels correct, and (c) percentage of phonemes correct. ICS and ICS-JC ratings showed preschoolers were always (5) to usually (4) understood across communication partners (ICS, M = 4.43; ICS-JC, M = 4.50). Both tools demonstrated excellent internal consistency (α = .91), high interrater, and test-retest reliability. Significant correlations between the two tools and between each measure and language-specific percentage of consonants correct, percentage of vowels correct, and percentage of phonemes correct provided criterion-validity evidence. A positive correlation between the ICS and age further strengthened validity evidence for that measure. Both tools show promising evidence of reliability and validity in describing functional speech intelligibility for this group of typically developing Jamaican preschoolers.
Listeners Experience Linguistic Masking Release in Noise-Vocoded Speech-in-Speech Recognition.

PubMed

Viswanathan, Navin; Kokkinakis, Kostas; Williams, Brittany T

2018-02-15

The purpose of this study was to evaluate whether listeners with normal hearing perceiving noise-vocoded speech-in-speech demonstrate better intelligibility of target speech when the background speech was mismatched in language (linguistic release from masking [LRM]) and/or location (spatial release from masking [SRM]) relative to the target. We also assessed whether the spectral resolution of the noise-vocoded stimuli affected the presence of LRM and SRM under these conditions. In Experiment 1, a mixed factorial design was used to simultaneously manipulate the masker language (within-subject, English vs. Dutch), the simulated masker location (within-subject, right, center, left), and the spectral resolution (between-subjects, 6 vs. 12 channels) of noise-vocoded target-masker combinations presented at +25 dB signal-to-noise ratio (SNR). In Experiment 2, the study was repeated using a spectral resolution of 12 channels at +15 dB SNR. In both experiments, listeners' intelligibility of noise-vocoded targets was better when the background masker was Dutch, demonstrating reliable LRM in all conditions. The pattern of results in Experiment 1 was not reliably different across the 6- and 12-channel noise-vocoded speech. Finally, a reliable spatial benefit (SRM) was detected only in the more challenging SNR condition (Experiment 2). The current study is the first to report a clear LRM benefit in noise-vocoded speech-in-speech recognition. Our results indicate that this benefit is available even under spectrally degraded conditions and that it may augment the benefit due to spatial separation of target speech and competing backgrounds.
A new time-adaptive discrete bionic wavelet transform for enhancing speech from adverse noise environment

NASA Astrophysics Data System (ADS)

Palaniswamy, Sumithra; Duraisamy, Prakash; Alam, Mohammad Showkat; Yuan, Xiaohui

2012-04-01

Automatic speech processing systems are widely used in everyday life such as mobile communication, speech and speaker recognition, and for assisting the hearing impaired. In speech communication systems, the quality and intelligibility of speech is of utmost importance for ease and accuracy of information exchange. To obtain an intelligible speech signal and one that is more pleasant to listen, noise reduction is essential. In this paper a new Time Adaptive Discrete Bionic Wavelet Thresholding (TADBWT) scheme is proposed. The proposed technique uses Daubechies mother wavelet to achieve better enhancement of speech from additive non- stationary noises which occur in real life such as street noise and factory noise. Due to the integration of human auditory system model into the wavelet transform, bionic wavelet transform (BWT) has great potential for speech enhancement which may lead to a new path in speech processing. In the proposed technique, at first, discrete BWT is applied to noisy speech to derive TADBWT coefficients. Then the adaptive nature of the BWT is captured by introducing a time varying linear factor which updates the coefficients at each scale over time. This approach has shown better performance than the existing algorithms at lower input SNR due to modified soft level dependent thresholding on time adaptive coefficients. The objective and subjective test results confirmed the competency of the TADBWT technique. The effectiveness of the proposed technique is also evaluated for speaker recognition task under noisy environment. The recognition results show that the TADWT technique yields better performance when compared to alternate methods specifically at lower input SNR.
Integrated Speech and Language Technology for Intelligence, Surveillance, and Reconnaissance (ISR)

DTIC Science & Technology

2017-07-01

applying submodularity techniques to address computing challenges posed by large datasets in speech and language processing. MT and speech tools were...aforementioned research-oriented activities, the IT system administration team provided necessary support to laboratory computing and network operations...operations of SCREAM Lab computer systems and networks. Other miscellaneous activities in relation to Task Order 29 are presented in an additional fourth
Intelligibility and Acceptability Testing for Speech Technology

DTIC Science & Technology

1992-05-22

information in memory (Luce, Feustel, and Pisoni, 1983). In high workload or multiple task situations, the added effort of listening to degraded speech can lead...the DRT provides diagnostic feature scores on six phonemic features: voicing, nasality, sustention , sibilation, graveness, and compactness, and on a...of other speech materials (e.g., polysyllabic words, paragraphs) and methods ( memory , comprehension, reaction time) have been used to evaluate the
Use of a Deep Recurrent Neural Network to Reduce Wind Noise: Effects on Judged Speech Intelligibility and Sound Quality

PubMed Central

Keshavarzi, Mahmoud; Goehring, Tobias; Zakis, Justin; Turner, Richard E.; Moore, Brian C. J.

2018-01-01

Despite great advances in hearing-aid technology, users still experience problems with noise in windy environments. The potential benefits of using a deep recurrent neural network (RNN) for reducing wind noise were assessed. The RNN was trained using recordings of the output of the two microphones of a behind-the-ear hearing aid in response to male and female speech at various azimuths in the presence of noise produced by wind from various azimuths with a velocity of 3 m/s, using the “clean” speech as a reference. A paired-comparison procedure was used to compare all possible combinations of three conditions for subjective intelligibility and for sound quality or comfort. The conditions were unprocessed noisy speech, noisy speech processed using the RNN, and noisy speech that was high-pass filtered (which also reduced wind noise). Eighteen native English-speaking participants were tested, nine with normal hearing and nine with mild-to-moderate hearing impairment. Frequency-dependent linear amplification was provided for the latter. Processing using the RNN was significantly preferred over no processing by both subject groups for both subjective intelligibility and sound quality, although the magnitude of the preferences was small. High-pass filtering (HPF) was not significantly preferred over no processing. Although RNN was significantly preferred over HPF only for sound quality for the hearing-impaired participants, for the results as a whole, there was a preference for RNN over HPF. Overall, the results suggest that reduction of wind noise using an RNN is possible and might have beneficial effects when used in hearing aids. PMID:29708061
Speech therapy for children with dysarthria acquired before three years of age.

PubMed

Pennington, Lindsay; Parker, Naomi K; Kelly, Helen; Miller, Nick

2016-07-18

Children with motor impairments often have the motor speech disorder dysarthria, a condition which effects the tone, strength and co-ordination of any or all of the muscles used for speech. Resulting speech difficulties can range from mild, with slightly slurred articulation and breathy voice, to profound, with an inability to produce any recognisable words. Children with dysarthria are often prescribed communication aids to supplement their natural forms of communication. However, there is variation in practice regarding the provision of therapy focusing on voice and speech production. Descriptive studies have suggested that therapy may improve speech, but its effectiveness has not been evaluated. To assess whether any speech and language therapy intervention aimed at improving the speech of children with dysarthria is more effective in increasing children's speech intelligibility or communicative participation than no intervention at all , and to compare the efficacy of individual types of speech language therapy in improving the speech intelligibility or communicative participation of children with dysarthria. We searched the Cochrane Central Register of Controlled Trials (CENTRAL; 2015 , Issue 7 ), MEDLINE, EMBASE, CINAHL , LLBA, ERIC, PsychInfo, Web of Science, Scopus, UK National Research Register and Dissertation Abstracts up to July 2015, handsearched relevant journals published between 1980 and July 2015, and searched proceedings of relevant conferences between 1996 to 2015. We placed no restrictions on the language or setting of the studies. A previous version of this review considered studies published up to April 2009. In this update we searched for studies published from April 2009 to July 2015. We considered randomised controlled trials and studies using quasi-experimental designs in which children were allocated to groups using non-random methods. One author (LP) conducted searches of all databases, journals and conference reports. All searches included a reliability check in which a second review author independently checked a random sample comprising 15% of all identified reports. We planned that two review authors would independently assess the quality and extract data from eligible studies. No randomised controlled trials or group studies were identified. This review found no evidence from randomised trials of the effectiveness of speech and language therapy interventions to improve the speech of children with early acquired dysarthria. Rigorous, fully powered randomised controlled trials are needed to investigate if the positive changes in children's speech observed in phase I and phase II studies are generalisable to the population of children with early acquired dysarthria served by speech and language therapy services. Research should examine change in children's speech production and intelligibility. It must also investigate children's participation in social and educational activities, and their quality of life, as well as the cost and acceptability of interventions.
Asynchronous sampling of speech with some vocoder experimental results

NASA Technical Reports Server (NTRS)

Babcock, M. L.

1972-01-01

The method of asynchronously sampling speech is based upon the derivatives of the acoustical speech signal. The following results are apparent from experiments to date: (1) It is possible to represent speech by a string of pulses of uniform amplitude, where the only information contained in the string is the spacing of the pulses in time; (2) the string of pulses may be produced in a simple analog manner; (3) the first derivative of the original speech waveform is the most important for the encoding process; (4) the resulting pulse train can be utilized to control an acoustical signal production system to regenerate the intelligence of the original speech.
Language and Communication Development in Down Syndrome

ERIC Educational Resources Information Center

Roberts, Joanne E.; Price, Johanna; Malkin, Cheryl

2007-01-01

Although there is considerable variability, most individuals with Down syndrome have mental retardation and speech and language deficits, particularly in language production and syntax and poor speech intelligibility. This article describes research findings in the language and communication development of individuals with Down syndrome, first…
Speech Recognition for A Digital Video Library.

ERIC Educational Resources Information Center

Witbrock, Michael J.; Hauptmann, Alexander G.

1998-01-01

Production of the meta-data supporting the Informedia Digital Video Library interface is automated using techniques derived from artificial intelligence research. Speech recognition and natural-language processing, information retrieval, and image analysis are applied to produce an interface that helps users locate information and navigate more…
Bibliography of Research in Natural Language Generation

DTIC Science & Technology

1993-11-01

on 1397] Barbara J. Gross Focuing and description in Artifcial Intelligence (GWAI-88), Geseke, West natural language dialogues, In Joshi et al. (557...Proceedings of the Fifth Canadian Conference from information in a frame structure. Data and on Artificial Intelligence , pages Ŕ-24, London, Knowledge...generation workshops (IWNLGS, ENLGWS), natural language processing conferences (ANLP, TINLAP, SPEECH), artificial intelligence conferences (AAAI, SCA
A novel speech prosthesis for mandibular guidance therapy in hemimandibulectomy patient: A clinical report

PubMed Central

Adaki, Raghavendra; Shigli, Kamal; Hormuzdi, Dinshaw M.; Gali, Sivaranjani

2016-01-01

Treating diverse maxillofacial patients poses a challenge to the maxillofacial prosthodontist. Rehabilitation of hemimandibulectomy patients must aim at restoring mastication and other functions such as intelligible speech, swallowing, and esthetics. Prosthetic methods such as palatal ramp and mandibular guiding flange reposition the deviated mandible. Such prosthesis can also be used to restore speech in case of patients with debilitating speech following surgical resection. This clinical report gives detail of a hemimandibulectomy patient provided with an interim removable dental speech prosthesis with composite resin flange for mandibular guidance therapy. PMID:27041917
Differential effects of speech prostheses in glossectomized patients.

PubMed

Leonard, R J; Gillis, R

1990-12-01

Five patients representing different categories of glossal resection were fitted with prostheses specifically designed to improve speech. Speech recordings made for subjects with and without their prostheses were subjected to a variety of analyses. Prosthetic influence on listeners' judgments of severity level/intelligibility, number of consonants in error, and on the acoustic measure F2 range of vowels was evaluated. Findings indicated that all subjects demonstrated improvement on the speech measures. However, the extent of improvement on each measure varied across speakers and resection categories. Implications of the findings for prosthetic speech rehabilitation in this population are discussed.
Artificial Intelligence in Speech Understanding: Two Applications at C.R.I.N.

ERIC Educational Resources Information Center

Carbonell, N.; And Others

1986-01-01

This article explains how techniques of artificial intelligence are applied to expert systems for acoustic-phonetic decoding, phonological interpretation, and multi-knowledge sources for man-machine dialogue implementation. The basic ideas are illustrated with short examples. (Author/JDH)
Three Billy Goats and Gardner.

ERIC Educational Resources Information Center

Merrefield, Gayle Emery

1997-01-01

Describes a Jewish Community Center's efforts to adapt Gardner's multiple-intelligences theory to a preschool special-education program. Since most students had moderate speech disorders, teachers decided to deemphasize linguistic expression in favor of the other seven intelligences. They created successful units exploring patterns and size…
Overall intelligibility, articulation, resonance, voice and language in a child with Nager syndrome.

PubMed

Van Lierde, Kristiane M; Luyten, Anke; Mortier, Geert; Tijskens, Anouk; Bettens, Kim; Vermeersch, Hubert

2011-02-01

The purpose of this study was to provide a description of the language and speech (intelligibility, voice, resonance, articulation) in a 7-year-old Dutch speaking boy with Nager syndrome. To reveal these features comparison was made with an age and gender related child with a similar palatal or hearing problem. Language was tested with an age appropriate language test namely the Dutch version of the Clinical Evaluation of Language Fundamentals. Regarding articulation a phonetic inventory, phonetic analysis and phonological process analysis was performed. A nominal scale with four categories was used to judge the overall speech intelligibility. A voice and resonance assessment included a videolaryngostroboscopy, a perceptual evaluation, acoustic analysis and nasometry. The most striking communication problems in this child were expressive and receptive language delay, moderately impaired speech intelligibility, the presence of phonetic and phonological disorders, resonance disorders and a high-pitched voice. The explanation for this pattern of communication is not completely straightforward. The language and the phonological impairment, only present in the child with the Nager syndrome, are not part of a more general developmental delay. The resonance disorders can be related to the cleft palate, but were not present in the child with the isolated cleft palate. One might assume that the cul-de-sac resonance and the much decreased mandibular movement and the restricted tongue lifting are caused by the restricted jaw mobility and micrognathia. To what extent the suggested mandibular distraction osteogenesis in early childhood allows increased mandibular movement and better speech outcome with increased oral resonance is subject for further research. According to the results of this study the speech and language management must be focused on receptive and expressive language skills and linguistic conceptualization, correct phonetic placement and the modification of hypernasality and nasal emission. Copyright Â© 2010 Elsevier Ireland Ltd. All rights reserved.
Speech comprehension aided by multiple modalities: behavioural and neural interactions

PubMed Central

McGettigan, Carolyn; Faulkner, Andrew; Altarelli, Irene; Obleser, Jonas; Baverstock, Harriet; Scott, Sophie K.

2014-01-01

Speech comprehension is a complex human skill, the performance of which requires the perceiver to combine information from several sources – e.g. voice, face, gesture, linguistic context – to achieve an intelligible and interpretable percept. We describe a functional imaging investigation of how auditory, visual and linguistic information interact to facilitate comprehension. Our specific aims were to investigate the neural responses to these different information sources, alone and in interaction, and further to use behavioural speech comprehension scores to address sites of intelligibility-related activation in multifactorial speech comprehension. In fMRI, participants passively watched videos of spoken sentences, in which we varied Auditory Clarity (with noise-vocoding), Visual Clarity (with Gaussian blurring) and Linguistic Predictability. Main effects of enhanced signal with increased auditory and visual clarity were observed in overlapping regions of posterior STS. Two-way interactions of the factors (auditory × visual, auditory × predictability) in the neural data were observed outside temporal cortex, where positive signal change in response to clearer facial information and greater semantic predictability was greatest at intermediate levels of auditory clarity. Overall changes in stimulus intelligibility by condition (as determined using an independent behavioural experiment) were reflected in the neural data by increased activation predominantly in bilateral dorsolateral temporal cortex, as well as inferior frontal cortex and left fusiform gyrus. Specific investigation of intelligibility changes at intermediate auditory clarity revealed a set of regions, including posterior STS and fusiform gyrus, showing enhanced responses to both visual and linguistic information. Finally, an individual differences analysis showed that greater comprehension performance in the scanning participants (measured in a post-scan behavioural test) were associated with increased activation in left inferior frontal gyrus and left posterior STS. The current multimodal speech comprehension paradigm demonstrates recruitment of a wide comprehension network in the brain, in which posterior STS and fusiform gyrus form sites for convergence of auditory, visual and linguistic information, while left-dominant sites in temporal and frontal cortex support successful comprehension. PMID:22266262
Speech comprehension aided by multiple modalities: behavioural and neural interactions.

PubMed

McGettigan, Carolyn; Faulkner, Andrew; Altarelli, Irene; Obleser, Jonas; Baverstock, Harriet; Scott, Sophie K

2012-04-01

Speech comprehension is a complex human skill, the performance of which requires the perceiver to combine information from several sources - e.g. voice, face, gesture, linguistic context - to achieve an intelligible and interpretable percept. We describe a functional imaging investigation of how auditory, visual and linguistic information interact to facilitate comprehension. Our specific aims were to investigate the neural responses to these different information sources, alone and in interaction, and further to use behavioural speech comprehension scores to address sites of intelligibility-related activation in multifactorial speech comprehension. In fMRI, participants passively watched videos of spoken sentences, in which we varied Auditory Clarity (with noise-vocoding), Visual Clarity (with Gaussian blurring) and Linguistic Predictability. Main effects of enhanced signal with increased auditory and visual clarity were observed in overlapping regions of posterior STS. Two-way interactions of the factors (auditory × visual, auditory × predictability) in the neural data were observed outside temporal cortex, where positive signal change in response to clearer facial information and greater semantic predictability was greatest at intermediate levels of auditory clarity. Overall changes in stimulus intelligibility by condition (as determined using an independent behavioural experiment) were reflected in the neural data by increased activation predominantly in bilateral dorsolateral temporal cortex, as well as inferior frontal cortex and left fusiform gyrus. Specific investigation of intelligibility changes at intermediate auditory clarity revealed a set of regions, including posterior STS and fusiform gyrus, showing enhanced responses to both visual and linguistic information. Finally, an individual differences analysis showed that greater comprehension performance in the scanning participants (measured in a post-scan behavioural test) were associated with increased activation in left inferior frontal gyrus and left posterior STS. The current multimodal speech comprehension paradigm demonstrates recruitment of a wide comprehension network in the brain, in which posterior STS and fusiform gyrus form sites for convergence of auditory, visual and linguistic information, while left-dominant sites in temporal and frontal cortex support successful comprehension. Copyright © 2012 Elsevier Ltd. All rights reserved.
Phonological processes in the speech of school-age children with hearing loss: Comparisons with children with normal hearing.

PubMed

Asad, Areej Nimer; Purdy, Suzanne C; Ballard, Elaine; Fairgray, Liz; Bowen, Caroline

2018-04-27

In this descriptive study, phonological processes were examined in the speech of children aged 5;0-7;6 (years; months) with mild to profound hearing loss using hearing aids (HAs) and cochlear implants (CIs), in comparison to their peers. A second aim was to compare phonological processes of HA and CI users. Children with hearing loss (CWHL, N = 25) were compared to children with normal hearing (CWNH, N = 30) with similar age, gender, linguistic, and socioeconomic backgrounds. Speech samples obtained from a list of 88 words, derived from three standardized speech tests, were analyzed using the CASALA (Computer Aided Speech and Language Analysis) program to evaluate participants' phonological systems, based on lax (a process appeared at least twice in the speech of at least two children) and strict (a process appeared at least five times in the speech of at least two children) counting criteria. Developmental phonological processes were eliminated in the speech of younger and older CWNH while eleven developmental phonological processes persisted in the speech of both age groups of CWHL. CWHL showed a similar trend of age of elimination to CWNH, but at a slower rate. Children with HAs and CIs produced similar phonological processes. Final consonant deletion, weak syllable deletion, backing, and glottal replacement were present in the speech of HA users, affecting their overall speech intelligibility. Developmental and non-developmental phonological processes persist in the speech of children with mild to profound hearing loss compared to their peers with typical hearing. The findings indicate that it is important for clinicians to consider phonological assessment in pre-school CWHL and the use of evidence-based speech therapy in order to reduce non-developmental and non-age-appropriate developmental processes, thereby enhancing their speech intelligibility. Copyright © 2018 Elsevier Inc. All rights reserved.
The effect of instantaneous input dynamic range setting on the speech perception of children with the nucleus 24 implant.

PubMed

Davidson, Lisa S; Skinner, Margaret W; Holstad, Beth A; Fears, Beverly T; Richter, Marie K; Matusofsky, Margaret; Brenner, Christine; Holden, Timothy; Birath, Amy; Kettel, Jerrica L; Scollie, Susan

2009-06-01

The purpose of this study was to examine the effects of a wider instantaneous input dynamic range (IIDR) setting on speech perception and comfort in quiet and noise for children wearing the Nucleus 24 implant system and the Freedom speech processor. In addition, children's ability to understand soft and conversational level speech in relation to aided sound-field thresholds was examined. Thirty children (age, 7 to 17 years) with the Nucleus 24 cochlear implant system and the Freedom speech processor with two different IIDR settings (30 versus 40 dB) were tested on the Consonant Nucleus Consonant (CNC) word test at 50 and 60 dB SPL, the Bamford-Kowal-Bench Speech in Noise Test, and a loudness rating task for four-talker speech noise. Aided thresholds for frequency-modulated tones, narrowband noise, and recorded Ling sounds were obtained with the two IIDRs and examined in relation to CNC scores at 50 dB SPL. Speech Intelligibility Indices were calculated using the long-term average speech spectrum of the CNC words at 50 dB SPL measured at each test site and aided thresholds. Group mean CNC scores at 50 dB SPL with the 40 IIDR were significantly higher (p < 0.001) than with the 30 IIDR. Group mean CNC scores at 60 dB SPL, loudness ratings, and the signal to noise ratios-50 for Bamford-Kowal-Bench Speech in Noise Test were not significantly different for the two IIDRs. Significantly improved aided thresholds at 250 to 6000 Hz as well as higher Speech Intelligibility Indices afforded improved audibility for speech presented at soft levels (50 dB SPL). These results indicate that an increased IIDR provides improved word recognition for soft levels of speech without compromising comfort of higher levels of speech sounds or sentence recognition in noise.

Perceptions of University Instructors When Listening to International Student Speech

ERIC Educational Resources Information Center

Sheppard, Beth; Elliott, Nancy; Baese-Berk, Melissa

2017-01-01

Intensive English Program (IEP) Instructors and content faculty both listen to international students at the university. For these two groups of instructors, this study compared perceptions of international student speech by collecting comprehensibility ratings and transcription samples for intelligibility scores. No significant differences were…
Ultrasound visual feedback in articulation therapy following partial glossectomy.

PubMed

Blyth, Katrina M; Mccabe, Patricia; Madill, Catherine; Ballard, Kirrie J

2016-01-01

Disordered speech is common following treatment for tongue cancer, however there is insufficient high quality evidence to guide clinical decision making about treatment. This study investigated use of ultrasound tongue imaging as a visual feedback tool to guide tongue placement during articulation therapy with two participants following partial glossectomy. A Phase I multiple baseline design across behaviors was used to investigate therapeutic effect of ultrasound visual feedback during speech rehabilitation. Percent consonants correct and speech intelligibility at sentence level were used to measure acquisition, generalization and maintenance of speech skills for treated and untreated related phonemes, while unrelated phonemes were tested to demonstrate experimental control. Swallowing and oromotor measures were also taken to monitor change. Sentence intelligibility was not a sensitive measure of speech change, but both participants demonstrated significant change in percent consonants correct for treated phonemes. One participant also demonstrated generalization to non-treated phonemes. Control phonemes along with swallow and oromotor measures remained stable throughout the study. This study establishes therapeutic benefit of ultrasound visual feedback in speech rehabilitation following partial glossectomy. Readers will be able to explain why and how tongue cancer surgery impacts on articulation precision. Readers will also be able to explain the acquisition, generalization and maintenance effects in the study. Copyright © 2016. Published by Elsevier Inc.
Evaluating signal-to-noise ratios, loudness, and related measures as indicators of airborne sound insulation.

PubMed

Park, H K; Bradley, J S

2009-09-01

Subjective ratings of the audibility, annoyance, and loudness of music and speech sounds transmitted through 20 different simulated walls were used to identify better single number ratings of airborne sound insulation. The first part of this research considered standard measures such as the sound transmission class the weighted sound reduction index (R(w)) and variations of these measures [H. K. Park and J. S. Bradley, J. Acoust. Soc. Am. 126, 208-219 (2009)]. This paper considers a number of other measures including signal-to-noise ratios related to the intelligibility of speech and measures related to the loudness of sounds. An exploration of the importance of the included frequencies showed that the optimum ranges of included frequencies were different for speech and music sounds. Measures related to speech intelligibility were useful indicators of responses to speech sounds but were not as successful for music sounds. A-weighted level differences, signal-to-noise ratios and an A-weighted sound transmission loss measure were good predictors of responses when the included frequencies were optimized for each type of sound. The addition of new spectrum adaptation terms to R(w) values were found to be the most practical approach for achieving more accurate predictions of subjective ratings of transmitted speech and music sounds.
Suprasegmental Characteristics of Spontaneous Speech Produced in Good and Challenging Communicative Conditions by Talkers Aged 9-14 Years.

PubMed

Hazan, Valerie; Tuomainen, Outi; Pettinato, Michèle

2016-12-01

This study investigated the acoustic characteristics of spontaneous speech by talkers aged 9-14 years and their ability to adapt these characteristics to maintain effective communication when intelligibility was artificially degraded for their interlocutor. Recordings were made for 96 children (50 female participants, 46 male participants) engaged in a problem-solving task with a same-sex friend; recordings for 20 adults were used as reference. The task was carried out in good listening conditions (normal transmission) and in degraded transmission conditions. Articulation rate, median fundamental frequency (f0), f0 range, and relative energy in the 1- to 3-kHz range were analyzed. With increasing age, children significantly reduced their median f0 and f0 range, became faster talkers, and reduced their mid-frequency energy in spontaneous speech. Children produced similar clear speech adaptations (in degraded transmission conditions) as adults, but only children aged 11-14 years increased their f0 range, an unhelpful strategy not transmitted via the vocoder. Changes made by children were consistent with a general increase in vocal effort. Further developments in speech production take place during later childhood. Children use clear speech strategies to benefit an interlocutor facing intelligibility problems but may not be able to attune these strategies to the same degree as adults.
a New Architecture for Intelligent Systems with Logic Based Languages

NASA Astrophysics Data System (ADS)

Saini, K. K.; Saini, Sanju

2008-10-01

People communicate with each other in sentences that incorporate two kinds of information: propositions about some subject, and metalevel speech acts that specify how the propositional information is used—as an assertion, a command, a question, or a promise. By means of speech acts, a group of people who have different areas of expertise can cooperate and dynamically reconfigure their social interactions to perform tasks and solve problems that would be difficult or impossible for any single individual. This paper proposes a framework for intelligent systems that consist of a variety of specialized components together with logic-based languages that can express propositions and speech acts about those propositions. The result is a system with a dynamically changing architecture that can be reconfigured in various ways: by a human knowledge engineer who specifies a script of speech acts that determine how the components interact; by a planning component that generates the speech acts to redirect the other components; or by a committee of components, which might include human assistants, whose speech acts serve to redirect one another. The components communicate by sending messages to a Linda-like blackboard, in which components accept messages that are either directed to them or that they consider themselves competent to handle.
The effect of sensorineural hearing loss and tinnitus on speech recognition over air and bone conduction military communications headsets.

PubMed

Manning, Candice; Mermagen, Timothy; Scharine, Angelique

2017-06-01

Military personnel are at risk for hearing loss due to noise exposure during deployment (USACHPPM, 2008). Despite mandated use of hearing protection, hearing loss and tinnitus are prevalent due to reluctance to use hearing protection. Bone conduction headsets can offer good speech intelligibility for normal hearing (NH) listeners while allowing the ears to remain open in quiet environments and the use of hearing protection when needed. Those who suffer from tinnitus, the experience of perceiving a sound not produced by an external source, often show degraded speech recognition; however, it is unclear whether this is a result of decreased hearing sensitivity or increased distractibility (Moon et al., 2015). It has been suggested that the vibratory stimulation of a bone conduction headset might ameliorate the effects of tinnitus on speech perception; however, there is currently no research to support or refute this claim (Hoare et al., 2014). Speech recognition of words presented over air conduction and bone conduction headsets was measured for three groups of listeners: NH, sensorineural hearing impaired, and/or tinnitus sufferers. Three levels of speech-to-noise (SNR = 0, -6, -12 dB) were created by embedding speech items in pink noise. Better speech recognition performance was observed with the bone conduction headset regardless of hearing profile, and speech intelligibility was a function of SNR. Discussion will include study limitations and the implications of these findings for those serving in the military. Published by Elsevier B.V.
Role of Binaural Temporal Fine Structure and Envelope Cues in Cocktail-Party Listening.

PubMed

Swaminathan, Jayaganesh; Mason, Christine R; Streeter, Timothy M; Best, Virginia; Roverud, Elin; Kidd, Gerald

2016-08-03

While conversing in a crowded social setting, a listener is often required to follow a target speech signal amid multiple competing speech signals (the so-called "cocktail party" problem). In such situations, separation of the target speech signal in azimuth from the interfering masker signals can lead to an improvement in target intelligibility, an effect known as spatial release from masking (SRM). This study assessed the contributions of two stimulus properties that vary with separation of sound sources, binaural envelope (ENV) and temporal fine structure (TFS), to SRM in normal-hearing (NH) human listeners. Target speech was presented from the front and speech maskers were either colocated with or symmetrically separated from the target in azimuth. The target and maskers were presented either as natural speech or as "noise-vocoded" speech in which the intelligibility was conveyed only by the speech ENVs from several frequency bands; the speech TFS within each band was replaced with noise carriers. The experiments were designed to preserve the spatial cues in the speech ENVs while retaining/eliminating them from the TFS. This was achieved by using the same/different noise carriers in the two ears. A phenomenological auditory-nerve model was used to verify that the interaural correlations in TFS differed across conditions, whereas the ENVs retained a high degree of correlation, as intended. Overall, the results from this study revealed that binaural TFS cues, especially for frequency regions below 1500 Hz, are critical for achieving SRM in NH listeners. Potential implications for studying SRM in hearing-impaired listeners are discussed. Acoustic signals received by the auditory system pass first through an array of physiologically based band-pass filters. Conceptually, at the output of each filter, there are two principal forms of temporal information: slowly varying fluctuations in the envelope (ENV) and rapidly varying fluctuations in the temporal fine structure (TFS). The importance of these two types of information in everyday listening (e.g., conversing in a noisy social situation; the "cocktail-party" problem) has not been established. This study assessed the contributions of binaural ENV and TFS cues for understanding speech in multiple-talker situations. Results suggest that, whereas the ENV cues are important for speech intelligibility, binaural TFS cues are critical for perceptually segregating the different talkers and thus for solving the cocktail party problem. Copyright © 2016 the authors 0270-6474/16/368250-08$15.00/0.
Intelligibility assessment in developmental phonological disorders: accuracy of caregiver gloss.

PubMed

Kwiatkowski, J; Shriberg, L D

1992-10-01

Fifteen caregivers each glossed a simultaneously videotaped and audiotaped sample of their child with speech delay engaged in conversation with a clinician. One of the authors generated a reference gloss for each sample, aided by (a) prior knowledge of the child's speech-language status and error patterns, (b) glosses from the child's clinician and the child's caregiver, (c) unlimited replays of the taped sample, and (d) the information gained from completing a narrow phonetic transcription of the sample. Caregivers glossed an average of 78% of the utterances and 81% of the words. A comparison of their glosses to the reference glosses suggested that they accurately understood an average of 58% of the utterances and 73% of the words. Discussion considers the implications of such findings for methodological and theoretical issues underlying children's moment-to-moment intelligibility breakdowns during speech-language processing.
An Ecosystem of Intelligent ICT Tools for Speech-Language Therapy Based on a Formal Knowledge Model.

PubMed

Robles-Bykbaev, Vladimir; López-Nores, Martín; Pazos-Arias, José; Quisi-Peralta, Diego; García-Duque, Jorge

2015-01-01

The language and communication constitute the development mainstays of several intellectual and cognitive skills in humans. However, there are millions of people around the world who suffer from several disabilities and disorders related with language and communication, while most of the countries present a lack of corresponding services related with health care and rehabilitation. On these grounds, we are working to develop an ecosystem of intelligent ICT tools to support speech and language pathologists, doctors, students, patients and their relatives. This ecosystem has several layers and components, integrating Electronic Health Records management, standardized vocabularies, a knowledge database, an ontology of concepts from the speech-language domain, and an expert system. We discuss the advantages of such an approach through experiments carried out in several institutions assisting children with a wide spectrum of disabilities.
Talking Wheelchair

NASA Technical Reports Server (NTRS)

1981-01-01

Communication is made possible for disabled individuals by means of an electronic system, developed at Stanford University's School of Medicine, which produces highly intelligible synthesized speech. Familiarly known as the "talking wheelchair" and formally as the Versatile Portable Speech Prosthesis (VPSP). Wheelchair mounted system consists of a word processor, a video screen, a voice synthesizer and a computer program which instructs the synthesizer how to produce intelligible sounds in response to user commands. Computer's memory contains 925 words plus a number of common phrases and questions. Memory can also store several thousand other words of the user's choice. Message units are selected by operating a simple switch, joystick or keyboard. Completed message appears on the video screen, then user activates speech synthesizer, which generates a voice with a somewhat mechanical tone. With the keyboard, an experienced user can construct messages as rapidly as 30 words per minute.
Speech impairment in Down syndrome: a review.

PubMed

Kent, Ray D; Vorperian, Houri K

2013-02-01

This review summarizes research on disorders of speech production in Down syndrome (DS) for the purposes of informing clinical services and guiding future research. Review of the literature was based on searches using MEDLINE, Google Scholar, PsycINFO, and HighWire Press, as well as consideration of reference lists in retrieved documents (including online sources). Search terms emphasized functions related to voice, articulation, phonology, prosody, fluency, and intelligibility. The following conclusions pertain to four major areas of review: voice, speech sounds, fluency and prosody, and intelligibility. The first major area is voice. Although a number of studies have reported on vocal abnormalities in DS, major questions remain about the nature and frequency of the phonatory disorder. Results of perceptual and acoustic studies have been mixed, making it difficult to draw firm conclusions or even to identify sensitive measures for future study. The second major area is speech sounds. Articulatory and phonological studies show that speech patterns in DS are a combination of delayed development and errors not seen in typical development. Delayed (i.e., developmental) and disordered (i.e., nondevelopmental) patterns are evident by the age of about 3 years, although DS-related abnormalities possibly appear earlier, even in infant babbling. The third major area is fluency and prosody. Stuttering and/or cluttering occur in DS at rates of 10%-45%, compared with about 1% in the general population. Research also points to significant disturbances in prosody. The fourth major area is intelligibility. Studies consistently show marked limitations in this area, but only recently has the research gone beyond simple rating scales.
Talker- and language-specific effects on speech intelligibility in noise assessed with bilingual talkers: Which language is more robust against noise and reverberation?

PubMed

Hochmuth, Sabine; Jürgens, Tim; Brand, Thomas; Kollmeier, Birger

2015-01-01

Investigate talker- and language-specific aspects of speech intelligibility in noise and reverberation using highly comparable matrix sentence tests across languages. Matrix sentences spoken by German/Russian and German/Spanish bilingual talkers were recorded. These sentences were used to measure speech reception thresholds (SRTs) with native listeners in the respective languages in different listening conditions (stationary and fluctuating noise, multi-talker babble, reverberated speech-in-noise condition). Four German/Russian and four German/Spanish bilingual talkers; 20 native German-speaking, 10 native Russian-speaking, and 10 native Spanish-speaking listeners. Across-talker SRT differences of up to 6 dB were found for both groups of bilinguals. SRTs of German/Russian bilingual talkers were the same in both languages. SRTs of German/Spanish bilingual talkers were higher when they talked in Spanish than when they talked in German. The benefit from listening in the gaps was similar across all languages. The detrimental effect of reverberation was larger for Spanish than for German and Russian. Within the limitations set by the number and slight accentedness of talkers and other possible confounding factors, talker- and test-condition-dependent differences were isolated from the language effect: Russian and German exhibited similar intelligibility in noise and reverberation, whereas Spanish was more impaired in these situations.
Analysis of a simplified normalized covariance measure based on binary weighting functions for predicting the intelligibility of noise-suppressed speech.

PubMed

Chen, Fei; Loizou, Philipos C

2010-12-01

The normalized covariance measure (NCM) has been shown previously to predict reliably the intelligibility of noise-suppressed speech containing non-linear distortions. This study analyzes a simplified NCM measure that requires only a small number of bands (not necessarily contiguous) and uses simple binary (1 or 0) weighting functions. The rationale behind the use of a small number of bands is to account for the fact that the spectral information contained in contiguous or nearby bands is correlated and redundant. The modified NCM measure was evaluated with speech intelligibility scores obtained by normal-hearing listeners in 72 noisy conditions involving noise-suppressed speech corrupted by four different types of maskers (car, babble, train, and street interferences). High correlation (r = 0.8) was obtained with the modified NCM measure even when only one band was used. Further analysis revealed a masker-specific pattern of correlations when only one band was used, and bands with low correlation signified the corresponding envelopes that have been severely distorted by the noise-suppression algorithm and/or the masker. Correlation improved to r = 0.84 when only two disjoint bands (centered at 325 and 1874 Hz) were used. Even further improvements in correlation (r = 0.85) were obtained when three or four lower-frequency (<700 Hz) bands were selected.
Comparing Binaural Pre-processing Strategies II

PubMed Central

Hu, Hongmei; Krawczyk-Becker, Martin; Marquardt, Daniel; Herzke, Tobias; Coleman, Graham; Adiloğlu, Kamil; Bomke, Katrin; Plotz, Karsten; Gerkmann, Timo; Doclo, Simon; Kollmeier, Birger; Hohmann, Volker; Dietz, Mathias

2015-01-01

Several binaural audio signal enhancement algorithms were evaluated with respect to their potential to improve speech intelligibility in noise for users of bilateral cochlear implants (CIs). 50% speech reception thresholds (SRT50) were assessed using an adaptive procedure in three distinct, realistic noise scenarios. All scenarios were highly nonstationary, complex, and included a significant amount of reverberation. Other aspects, such as the perfectly frontal target position, were idealized laboratory settings, allowing the algorithms to perform better than in corresponding real-world conditions. Eight bilaterally implanted CI users, wearing devices from three manufacturers, participated in the study. In all noise conditions, a substantial improvement in SRT50 compared to the unprocessed signal was observed for most of the algorithms tested, with the largest improvements generally provided by binaural minimum variance distortionless response (MVDR) beamforming algorithms. The largest overall improvement in speech intelligibility was achieved by an adaptive binaural MVDR in a spatially separated, single competing talker noise scenario. A no-pre-processing condition and adaptive differential microphones without a binaural link served as the two baseline conditions. SRT50 improvements provided by the binaural MVDR beamformers surpassed the performance of the adaptive differential microphones in most cases. Speech intelligibility improvements predicted by instrumental measures were shown to account for some but not all aspects of the perceptually obtained SRT50 improvements measured in bilaterally implanted CI users. PMID:26721921
[Pilot study of the acoustic values of the vowels in Spanish as indicators of the severity of dysarthria].

PubMed

Delgado-Hernandez, J

2017-02-01

The acoustic analysis is a tool that provides objective data on changes of speech in dysarthria. To evaluate, in the ataxic dysarthria, the relationship between the vowel space area (VSA), the formant centralization ratio (FCR) and the mean of the primary distances with the speech intelligibility. A sample of fourteen Spanish speakers, ten with dysarthria and four controls, was used. The values of first and second formants in 140 vowels extracted of 140 words were analyzed. To calculate the level of intelligibility seven listeners were involved and a task of identification verbal stimuli was used. The dysarthric subjects have less contrast between middle and high vowels and between back vowels. Significant differences in the VSA, FCR and mean of the primary distances compared to control subjects (p = 0.007, 0.005 and 0.030, respectively) are observed. Regression analysis show the relationship between VSA and the mean of primary distances with the level of speech intelligibility (r = 0.60 and 0.74, respectively). Ataxic dysarthria subjects have lower contrast and vowel centralization in carrying out the vowels. The acoustic measures studied in this preliminary work have a high sensitivity in the detection of dysarthria but only the VSA and the mean of primary distances provide information on the severity of this type of speech disturbance.
Comparing Motor Skills in Autism Spectrum Individuals With and Without Speech Delay

PubMed Central

Barbeau, Elise B.; Meilleur, Andrée‐Anne S.; Zeffiro, Thomas A.

2015-01-01

Movement atypicalities in speed, coordination, posture, and gait have been observed across the autism spectrum (AS) and atypicalities in coordination are more commonly observed in AS individuals without delayed speech (DSM‐IV Asperger) than in those with atypical or delayed speech onset. However, few studies have provided quantitative data to support these mostly clinical observations. Here, we compared perceptual and motor performance between 30 typically developing and AS individuals (21 with speech delay and 18 without speech delay) to examine the associations between limb movement control and atypical speech development. Groups were matched for age, intelligence, and sex. The experimental design included: an inspection time task, which measures visual processing speed; the Purdue Pegboard, which measures finger dexterity, bimanual performance, and hand‐eye coordination; the Annett Peg Moving Task, which measures unimanual goal‐directed arm movement; and a simple reaction time task. We used analysis of covariance to investigate group differences in task performance and linear regression models to explore potential associations between intelligence, language skills, simple reaction time, and visually guided movement performance. AS participants without speech delay performed slower than typical participants in the Purdue Pegboard subtests. AS participants without speech delay showed poorer bimanual coordination than those with speech delay. Visual processing speed was slightly faster in both AS groups than in the typical group. Altogether, these results suggest that AS individuals with and without speech delay differ in visually guided and visually triggered behavior and show that early language skills are associated with slower movement in simple and complex motor tasks. Autism Res 2015, 8: 682–693. © 2015 The Authors Autism Research published by Wiley Periodicals, Inc. on behalf of International Society for Autism Research PMID:25820662
Objective Quality and Intelligibility Prediction for Users of Assistive Listening Devices

PubMed Central

Falk, Tiago H.; Parsa, Vijay; Santos, João F.; Arehart, Kathryn; Hazrati, Oldooz; Huber, Rainer; Kates, James M.; Scollie, Susan

2015-01-01

This article presents an overview of twelve existing objective speech quality and intelligibility prediction tools. Two classes of algorithms are presented, namely intrusive and non-intrusive, with the former requiring the use of a reference signal, while the latter does not. Investigated metrics include both those developed for normal hearing listeners, as well as those tailored particularly for hearing impaired (HI) listeners who are users of assistive listening devices (i.e., hearing aids, HAs, and cochlear implants, CIs). Representative examples of those optimized for HI listeners include the speech-to-reverberation modulation energy ratio, tailored to hearing aids (SRMR-HA) and to cochlear implants (SRMR-CI); the modulation spectrum area (ModA); the hearing aid speech quality (HASQI) and perception indices (HASPI); and the PErception MOdel - hearing impairment quality (PEMO-Q-HI). The objective metrics are tested on three subjectively-rated speech datasets covering reverberation-alone, noise-alone, and reverberation-plus-noise degradation conditions, as well as degradations resultant from nonlinear frequency compression and different speech enhancement strategies. The advantages and limitations of each measure are highlighted and recommendations are given for suggested uses of the different tools under specific environmental and processing conditions. PMID:26052190
When Are High-Tech Communicators Effective in Parkinson's Disease?

ERIC Educational Resources Information Center

Ferriero, Giorgio; Caligari, Marco; Ronconi, Gianpaolo; Franchignoni, Franco

2012-01-01

This report describes a 63-year-old woman with Parkinson's disease showing loss of intelligibility of speech and severely impaired handwriting, despite undergoing physical and speech therapies. As the patient had sufficient residual motor abilities and adequate cognitive function and motivation, a computer-based communication aid with a software…
Native Reactions to Non-Native Speech: A Review of Empirical Research.

ERIC Educational Resources Information Center

Eisenstein, Miriam

1983-01-01

Recent research on native speakers' reactions to nonnative speech that views listeners, speakers, and language from a variety of perspectives using both objective and subjective research paradigms is reviewed. Studies of error gravity, relative intelligibility of language samples, the role of accent, speakers' characteristics, and context in which…
Speech Production in Hearing-Impaired Children.

ERIC Educational Resources Information Center

Gold, Toni

1980-01-01

Investigations in recent years have indicated that only about 20% of the speech output of the deaf is understood by the "person on the street." This lack of intelligibility has been associated with some frequently occurring segmental and suprasegmental errors. Journal Availability: Elsevier North Holland, Inc., 52 Vanderbilt Avenue, New York, NY…

Enhancing Computer-Based Lessons for Effective Speech Education.

ERIC Educational Resources Information Center

Hemphill, Michael R.; Standerfer, Christina C.

1987-01-01

Assesses the advantages of computer-based instruction on speech education. Concludes that, while it offers tremendous flexibility to the instructor--especially in dynamic lesson design, feedback, graphics, and artificial intelligence--there is no inherent advantage to the use of computer technology in the classroom, unless the student interacts…
Hidden Hearing Loss and Computational Models of the Auditory Pathway: Predicting Speech Intelligibility Decline

DTIC Science & Technology

2016-11-28

of low spontaneous rate auditory nerve fibers (ANFs) and reduction of auditory brainstem response wave-I amplitudes. The goal of this research is...auditory nerve (AN) responses to speech stimuli under a variety of difficult listening conditions. The resulting cochlear neurogram, a spectrogram
Effect of Three Classroom Listening Conditions on Speech Intelligibility

ERIC Educational Resources Information Center

Ross, Mark; Giolas, Thomas G.

1971-01-01

Speech discrimination scores for 13 deaf children were obtained in a classroom under: usual listening condition (hearing aid or not), binaural listening situation using auditory trainer/FM receiver with wireless microphone transmitter turned off, and binaural condition with inputs from auditory trainer/FM receiver and wireless microphone/FM…
Adults Who Read Like Children: The Psycholinguistic Bases. Final Report.

ERIC Educational Resources Information Center

Read, Charles

A study examined basic reading skills among men in prison, comparing poor and adequate readers with respect to comprehension, decoding, short-term memory, and speech perception. Subjects, 88 inmates of normal intelligence, normal hearing, and no significant speech abnormalities, at a minimum-security prison, were given reading comprehension tests…
Contribution of Binaural Masking Release to Improved Speech Intelligibility for different Masker types.

PubMed

Sutojo, Sarinah; van de Par, Steven; Schoenmaker, Esther

2018-06-01

In situations with competing talkers or in the presence of masking noise, speech intelligibility can be improved by spatially separating the target speaker from the interferers. This advantage is generally referred to as spatial release from masking (SRM) and different mechanisms have been suggested to explain it. One proposed mechanism to benefit from spatial cues is the binaural masking release, which is purely stimulus driven. According to this mechanism, the spatial benefit results from differences in the binaural cues of target and masker, which need to appear simultaneously in time and frequency to improve the signal detection. In an alternative proposed mechanism, the differences in the interaural cues improve the segregation of auditory streams, a process, which involves top-down processing rather than being purely stimulus driven. Other than the cues that produce binaural masking release, the interaural cue differences between target and interferer required to improve stream segregation do not have to appear simultaneously in time and frequency. This study is concerned with the contribution of binaural masking release to SRM for three masker types that differ with respect to the amount of energetic masking they exert. Speech intelligibility was measured, employing a stimulus manipulation that inhibits binaural masking release, and analyzed with a metric to account for the number of better-ear glimpses. Results indicate that the contribution of the stimulus-driven binaural masking release plays a minor role while binaural stream segregation and the availability of glimpses in the better ear had a stronger influence on improving the speech intelligibility. This article is protected by copyright. All rights reserved. This article is protected by copyright. All rights reserved.
Blind speech separation system for humanoid robot with FastICA for audio filtering and separation

NASA Astrophysics Data System (ADS)

Budiharto, Widodo; Santoso Gunawan, Alexander Agung

2016-07-01

Nowadays, there are many developments in building intelligent humanoid robot, mainly in order to handle voice and image. In this research, we propose blind speech separation system using FastICA for audio filtering and separation that can be used in education or entertainment. Our main problem is to separate the multi speech sources and also to filter irrelevant noises. After speech separation step, the results will be integrated with our previous speech and face recognition system which is based on Bioloid GP robot and Raspberry Pi 2 as controller. The experimental results show the accuracy of our blind speech separation system is about 88% in command and query recognition cases.
Influence of timing of delayed hard palate closure on articulation skills in 3-year-old Danish children with unilateral cleft lip and palate.

PubMed

Willadsen, Elisabeth; Boers, Maria; Schöps, Antje; Kisling-Møller, Mia; Nielsen, Joan Bogh; Jørgensen, Line Dahl; Andersen, Mikael; Bolund, Stig; Andersen, Helene Søgaard

2018-01-01

Differing results regarding articulation skills in young children with cleft palate (CP) have been reported and often interpreted as a consequence of different surgical protocols. To assess the influence of different timing of hard palate closure in a two-stage procedure on articulation skills in 3-year-olds born with unilateral cleft lip and palate (UCLP). Secondary aims were to compare results with peers without CP, and to investigate if there are gender differences in articulation skills. Furthermore, burden of treatment was to be estimated in terms of secondary surgery, hearing and speech therapy. A randomized controlled trial (RCT). Early hard palate closure (EHPC) at 12 months versus late hard palate closure (LHPC) at 36 months in a two-stage procedure was tested in a cohort of 126 Danish-speaking children born with non-syndromic UCLP. All participants had the lip and soft palate closed around 4 months of age. Audio and video recordings of a naming test were available from 113 children (32 girls and 81 boys) and were transcribed phonetically. Recordings were obtained prior to hard palate closure in the LHPC group. The main outcome measures were percentage consonants correct adjusted (PCC-A) and consonant errors from blinded assessments. Results from 36 Danish-speaking children without CP obtained previously by Willadsen in 2012 were used for comparison. Children with EHPC produced significantly more target consonants correctly (83%) than children with LHPC (48%; p < .001). In addition, children with LHPC produced significantly more active cleft speech characteristics than children with EHPC (p < .001). Boys achieved significantly lower PCC-A scores than girls (p = .04) and produced significantly more consonant errors than girls (p = .02). No significant differences were found between groups regarding burden of treatment. The control group performed significantly better than the EHPC and LHPC groups on all compared variables. © 2017 Royal College of Speech and Language Therapists.
Virtual personal assistance

NASA Astrophysics Data System (ADS)

Aditya, K.; Biswadeep, G.; Kedar, S.; Sundar, S.

2017-11-01

Human computer communication has growing demand recent days. The new generation of autonomous technology aspires to give computer interfaces emotional states that relate and consider user as well as system environment considerations. In the existing computational model is based an artificial intelligent and externally by multi-modal expression augmented with semi human characteristics. But the main problem with is multi-model expression is that the hardware control given to the Artificial Intelligence (AI) is very limited. So, in our project we are trying to give the Artificial Intelligence (AI) more control on the hardware. There are two main parts such as Speech to Text (STT) and Text to Speech (TTS) engines are used accomplish the requirement. In this work, we are using a raspberry pi 3, a speaker and a mic as hardware and for the programing part, we are using python scripting.
Five Lectures on Artificial Intelligence

DTIC Science & Technology

1974-09-01

large systems The current projects on speech understanding (which I will describe iater) are an exception to this, dealing explicitly with the problem...learns that "Fred lives in Sydney", we must find some new fact to resolve the tension — 1 SPEECH UNDERSTANDING SYSTEMS perhaps he lives in a zco It is...possible Speech Understanding Systems Most of the problems described above might be characterized as relating to the chunking of knowledge. Such ideas are
Objective speech quality evaluation of real-time speech coders

NASA Astrophysics Data System (ADS)

Viswanathan, V. R.; Russell, W. H.; Huggins, A. W. F.

1984-02-01

This report describes the work performed in two areas: subjective testing of a real-time 16 kbit/s adaptive predictive coder (APC) and objective speech quality evaluation of real-time coders. The speech intelligibility of the APC coder was tested using the Diagnostic Rhyme Test (DRT), and the speech quality was tested using the Diagnostic Acceptability Measure (DAM) test, under eight operating conditions involving channel error, acoustic background noise, and tandem link with two other coders. The test results showed that the DRT and DAM scores of the APC coder equalled or exceeded the corresponding test scores fo the 32 kbit/s CVSD coder. In the area of objective speech quality evaluation, the report describes the development, testing, and validation of a procedure for automatically computing several objective speech quality measures, given only the tape-recordings of the input speech and the corresponding output speech of a real-time speech coder.
Successful and rapid response of speech bulb reduction program combined with speech therapy in velopharyngeal dysfunction: a case report.

PubMed

Shin, Yu-Jeong; Ko, Seung-O

2015-12-01

Velopharyngeal dysfunction in cleft palate patients following the primary palate repair may result in nasal air emission, hypernasality, articulation disorder and poor intelligibility of speech. Among conservative treatment methods, speech aid prosthesis combined with speech therapy is widely used method. However because of its long time of treatment more than a year and low predictability, some clinicians prefer a surgical intervention. Thus, the purpose of this report was to increase an attention on the effectiveness of speech aid prosthesis by introducing a case that was successfully treated. In this clinical report, speech bulb reduction program with intensive speech therapy was applied for a patient with velopharyngeal dysfunction and it was rapidly treated by 5months which was unusually short period for speech aid therapy. Furthermore, advantages of pre-operative speech aid therapy were discussed.
The Galker test of speech reception in noise; associations with background variables, middle ear status, hearing, and language in Danish preschool children.

PubMed

Lauritsen, Maj-Britt Glenn; Söderström, Margareta; Kreiner, Svend; Dørup, Jens; Lous, Jørgen

2016-01-01

We tested "the Galker test", a speech reception in noise test developed for primary care for Danish preschool children, to explore if the children's ability to hear and understand speech was associated with gender, age, middle ear status, and the level of background noise. The Galker test is a 35-item audio-visual, computerized word discrimination test in background noise. Included were 370 normally developed children attending day care center. The children were examined with the Galker test, tympanometry, audiometry, and the Reynell test of verbal comprehension. Parents and daycare teachers completed questionnaires on the children's ability to hear and understand speech. As most of the variables were not assessed using interval scales, non-parametric statistics (Goodman-Kruskal's gamma) were used for analyzing associations with the Galker test score. For comparisons, analysis of variance (ANOVA) was used. Interrelations were adjusted for using a non-parametric graphic model. In unadjusted analyses, the Galker test was associated with gender, age group, language development (Reynell revised scale), audiometry, and tympanometry. The Galker score was also associated with the parents' and day care teachers' reports on the children's vocabulary, sentence construction, and pronunciation. Type B tympanograms were associated with a mean hearing 5-6dB below that of than type A, C1, or C2. In the graphic analysis, Galker scores were closely and significantly related to Reynell test scores (Gamma (G)=0.35), the children's age group (G=0.33), and the day care teachers' assessment of the children's vocabulary (G=0.26). The Galker test of speech reception in noise appears promising as an easy and quick tool for evaluating preschool children's understanding of spoken words in noise, and it correlated well with the day care teachers' reports and less with the parents' reports. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
The Relationship Between Speech Production and Speech Perception Deficits in Parkinson's Disease.

PubMed

De Keyser, Kim; Santens, Patrick; Bockstael, Annelies; Botteldooren, Dick; Talsma, Durk; De Vos, Stefanie; Van Cauwenberghe, Mieke; Verheugen, Femke; Corthals, Paul; De Letter, Miet

2016-10-01

This study investigated the possible relationship between hypokinetic speech production and speech intensity perception in patients with Parkinson's disease (PD). Participants included 14 patients with idiopathic PD and 14 matched healthy controls (HCs) with normal hearing and cognition. First, speech production was objectified through a standardized speech intelligibility assessment, acoustic analysis, and speech intensity measurements. Second, an overall estimation task and an intensity estimation task were addressed to evaluate overall speech perception and speech intensity perception, respectively. Finally, correlation analysis was performed between the speech characteristics of the overall estimation task and the corresponding acoustic analysis. The interaction between speech production and speech intensity perception was investigated by an intensity imitation task. Acoustic analysis and speech intensity measurements demonstrated significant differences in speech production between patients with PD and the HCs. A different pattern in the auditory perception of speech and speech intensity was found in the PD group. Auditory perceptual deficits may influence speech production in patients with PD. The present results suggest a disturbed auditory perception related to an automatic monitoring deficit in PD.
Effects of a denture adhesive in edentulous patients after maxillectomy.

PubMed

Sumita, Yuka I; Otomaru, Takafumi; Taniguchi, Hisashi

2012-06-01

The objective of this study is to evaluate the usefulness of a denture adhesive in edentulous patients after maxillectomy. Maxillectomy patients suffer from functional impairments. Denture adhesives (DAs) are the solution in such patients. However, little is known about DAs in maxillectomy patients. Eight edentulous patients who had undergone maxillectomy were included and divided into three groups. Group 1 (half ≤ remaining residual maxilla), Group 2 (quarter < remaining residual maxilla < half) and Group 3 (remaining residual maxilla ≤ quarter). They were evaluated by a speech intelligibility test and a mixing ability test, respectively. A cream-type DA called New Poligrip(®) (GlaxoSmithKline, Tokyo, Japan) was used. Applying the DA, speech intelligibility showed a higher score than the data without DA. The effects of using a DA depend on the amount of the remaining residual maxilla. Our study showed that if the remaining residual maxilla is less than a quarter (Group 3), it is difficult to have confidence in the effectiveness of the DA to improve masticatory function. On the other hand, the use a DA showed improved speech intelligibility test values in all groups. © 2011 The Gerodontology Society and John Wiley & Sons A/S.
Strategic Computing. New-Generation Computing Technology: A Strategic Plan for Its Development and Application to Critical Problems in Defense

DTIC Science & Technology

1983-10-28

Computing. By seizing an opportunity to leverage recent advances in artificial intelligence, computer science, and microelectronics, the Agency plans...occurred in many separated areas of artificial intelligence, computer science, and microelectronics. Advances in "expert system" technology now...and expert knowledge o Advances in Artificial Intelligence: Mechanization of speech recognition, vision, and natural language understanding. o
Speech perception in noise with a harmonic complex excited vocoder.

PubMed

Churchill, Tyler H; Kan, Alan; Goupell, Matthew J; Ihlefeld, Antje; Litovsky, Ruth Y

2014-04-01

A cochlear implant (CI) presents band-pass-filtered acoustic envelope information by modulating current pulse train levels. Similarly, a vocoder presents envelope information by modulating an acoustic carrier. By studying how normal hearing (NH) listeners are able to understand degraded speech signals with a vocoder, the parameters that best simulate electric hearing and factors that might contribute to the NH-CI performance difference may be better understood. A vocoder with harmonic complex carriers (fundamental frequency, f0 = 100 Hz) was used to study the effect of carrier phase dispersion on speech envelopes and intelligibility. The starting phases of the harmonic components were randomly dispersed to varying degrees prior to carrier filtering and modulation. NH listeners were tested on recognition of a closed set of vocoded words in background noise. Two sets of synthesis filters simulated different amounts of current spread in CIs. Results showed that the speech vocoded with carriers whose starting phases were maximally dispersed was the most intelligible. Superior speech understanding may have been a result of the flattening of the dispersed-phase carrier's intrinsic temporal envelopes produced by the large number of interacting components in the high-frequency channels. Cross-correlogram analyses of auditory nerve model simulations confirmed that randomly dispersing the carrier's component starting phases resulted in better neural envelope representation. However, neural metrics extracted from these analyses were not found to accurately predict speech recognition scores for all vocoded speech conditions. It is possible that central speech understanding mechanisms are insensitive to the envelope-fine structure dichotomy exploited by vocoders.
Auditory-Phonetic Projection and Lexical Structure in the Recognition of Sine-Wave Words

ERIC Educational Resources Information Center

Remez, Robert E.; Dubowski, Kathryn R.; Broder, Robin S.; Davids, Morgana L.; Grossman, Yael S.; Moskalenko, Marina; Pardo, Jennifer S.; Hasbun, Sara Maria

2011-01-01

Speech remains intelligible despite the elimination of canonical acoustic correlates of phonemes from the spectrum. A portion of this perceptual flexibility can be attributed to modulation sensitivity in the auditory-to-phonetic projection, although signal-independent properties of lexical neighborhoods also affect intelligibility in utterances…
Mathematical Modelling for the Evaluation of Automated Speech Recognition Systems--Research Area 3.3.1 (c)

DTIC Science & Technology

2016-01-07

news. Both of these resemble typical activities of intelligence analysts in OSINT processing and production applications. We assessed two task...intelligence analysts in a number of OSINT processing and production applications. (5) Summary of the most important results In both settings
Improving Intelligibility: Guided Reflective Journals in Action

ERIC Educational Resources Information Center

Lear, Emmaline L.

2014-01-01

This study explores the effectiveness of guided reflective journals to improve intelligibility in a Japanese higher educational context. Based on qualitative and quantitative methods, the paper evaluates changes in speech over the duration of one semester. In particular, this study focuses on changes in prosodic features such as stress, intonation…
Audiologic features of Norrie disease.

PubMed

Halpin, Chris; Owen, Grace; Gutiérrez-Espeleta, Gustavo A; Sims, Katherine; Rehm, Heidi L

2005-07-01

Norrie disease is an X-linked recessive disorder in which patients are born blind and develop sensory hearing loss in adolescence. The hearing loss associated with Norrie disease has been shown in a genetically altered knockout mouse to involve dysfunction of the stria vascularis; most other structures are preserved until the later stages of the disease. The objective of this study was to characterize the audiologic phenotype of Norrie disease for comparison with the pathophysiologic mechanism. The design combined two series of clinical audiologic evaluations, with special attention to speech intelligibility. The audiologic results for 12 affected individuals and 10 carriers show that patients with Norrie disease retain high speech intelligibility scores even when the threshold loss is severe. The cochlear mechanism-- failure of the stria vascularis-- accounts for some of the higher values in the wide distribution of speech scores in cases with similar pure tone audiograms.

Longitudinal development of communication in children with cerebral palsy between 24 and 53 months: Predicting speech outcomes.

PubMed

Hustad, Katherine C; Allison, Kristen M; Sakash, Ashley; McFadd, Emily; Broman, Aimee Teo; Rathouz, Paul J

2017-08-01

To determine whether communication at 2 years predicted communication at 4 years in children with cerebral palsy (CP); and whether the age a child first produces words imitatively predicts change in speech production. 30 children (15 males) with CP participated and were seen 5 times at 6-month intervals between 24 and 53 months (mean age at time 1 = 26.9 months (SD 1.9)). Variables were communication classification at 24 and 53 months, age that children were first able to produce words imitatively, single-word intelligibility, and longest utterance produced. Communication at 24 months was highly predictive of abilities at 53 months. Speaking earlier led to faster gains in intelligibility and length of utterance and better outcomes at 53 months than speaking later. Inability to speak at 24 months indicates greater speech and language difficulty at 53 months and a strong need for early communication intervention.
Developmental Variables and Speech-Language in a Special Education Intervention Model.

ERIC Educational Resources Information Center

Cruz, Maria del C.; Ayala, Myrna

Case studies of eight children with speech and language impairments are presented in a review of the intervention efforts at the Demonstration Center for Preschool Special Education (DCPSE) in Puerto Rico. Five components of the intervention model are examined: social medical history, intelligence, motor development, socio-emotional development,…
Impact of Aberrant Acoustic Properties on the Perception of Sound Quality in Electrolarynx Speech

ERIC Educational Resources Information Center

Meltzner, Geoffrey S.; Hillman, Robert E.

2005-01-01

A large percentage of patients who have undergone laryngectomy to treat advanced laryngeal cancer rely on an electrolarynx (EL) to communicate verbally. Although serviceable, EL speech is plagued by shortcomings in both sound quality and intelligibility. This study sought to better quantify the relative contributions of previously identified…
Intelligibility as a Clinical Outcome Measure Following Intervention with Children with Phonologically Based Speech-Sound Disorders

ERIC Educational Resources Information Center

Lousada, M.; Jesus, Luis M. T.; Hall, A.; Joffe, V.

2014-01-01

Background: The effectiveness of two treatment approaches (phonological therapy and articulation therapy) for treatment of 14 children, aged 4;0-6;7 years, with phonologically based speech-sound disorder (SSD) has been previously analysed with severity outcome measures (percentage of consonants correct score, percentage occurrence of phonological…
The Role of Somatosensory Information in Speech Perception: Imitation Improves Recognition of Disordered Speech

ERIC Educational Resources Information Center

Borrie, Stephanie A.; Schäfer, Martina C. M.

2015-01-01

Purpose: Perceptual learning paradigms involving written feedback appear to be a viable clinical tool to reduce the intelligibility burden of dysarthria. The underlying theoretical assumption is that pairing the degraded acoustics with the intended lexical targets facilitates a remapping of existing mental representations in the lexicon. This…
Inferior Frontal Sensitivity to Common Speech Sounds Is Amplified by Increasing Word Intelligibility

ERIC Educational Resources Information Center

Vaden, Kenneth I., Jr.; Kuchinsky, Stefanie E.; Keren, Noam I.; Harris, Kelly C.; Ahlstrom, Jayne B.; Dubno, Judy R.; Eckert, Mark A.

2011-01-01

The left inferior frontal gyrus (LIFG) exhibits increased responsiveness when people listen to words composed of speech sounds that frequently co-occur in the English language (Vaden, Piquado, & Hickok, 2011), termed high phonotactic frequency (Vitevitch & Luce, 1998). The current experiment aimed to further characterize the relation of…
Social Biases toward Children with Speech and Language Impairments: A Correlative Causal Model of Language Limitations.

ERIC Educational Resources Information Center

Rice, Mabel L.; And Others

1993-01-01

In a study of adults' attitudes toward children with limited linguistic competence, four groups of judges listened to audiotaped samples of preschool children's speech and responded to questionnaire items addressing child attributes (e.g., intelligence, social maturity). Systemic biases were revealed toward children with limited communication…
Evaluation of Core Vocabulary Therapy for Deaf Children: Four Treatment Case Studies

ERIC Educational Resources Information Center

Herman, Rosalind; Ford, Katie; Thomas, Jane; Oyebade, Natalie; Bennett, Danita; Dodd, Barbara

2015-01-01

This study evaluated whether core vocabulary intervention (CVT) improved single word speech accuracy, consistency and intelligibility in four 9-11-year-old children with profound sensori-neural deafness fitted with cochlear implants and/or digital hearing aids. Their speech was characterized by inconsistent production of different error forms for…
On the balance of envelope and temporal fine structure in the encoding of speech in the early auditory system.

PubMed

Shamma, Shihab; Lorenzi, Christian

2013-05-01

There is much debate on how the spectrotemporal modulations of speech (or its spectrogram) are encoded in the responses of the auditory nerve, and whether speech intelligibility is best conveyed via the "envelope" (E) or "temporal fine-structure" (TFS) of the neural responses. Wide use of vocoders to resolve this question has commonly assumed that manipulating the amplitude-modulation and frequency-modulation components of the vocoded signal alters the relative importance of E or TFS encoding on the nerve, thus facilitating assessment of their relative importance to intelligibility. Here we argue that this assumption is incorrect, and that the vocoder approach is ineffective in differentially altering the neural E and TFS. In fact, we demonstrate using a simplified model of early auditory processing that both neural E and TFS encode the speech spectrogram with constant and comparable relative effectiveness regardless of the vocoder manipulations. However, we also show that neural TFS cues are less vulnerable than their E counterparts under severe noisy conditions, and hence should play a more prominent role in cochlear stimulation strategies.
Reaction times of normal listeners to laryngeal, alaryngeal, and synthetic speech.

PubMed

Evitts, Paul M; Searl, Jeff

2006-12-01

The purpose of this study was to compare listener processing demands when decoding alaryngeal compared to laryngeal speech. Fifty-six listeners were presented with single words produced by 1 proficient speaker from 5 different modes of speech: normal, tracheosophageal (TE), esophageal (ES), electrolaryngeal (EL), and synthetic speech (SS). Cognitive processing load was indexed by listener reaction time (RT). To account for significant durational differences among the modes of speech, an RT ratio was calculated (stimulus duration divided by RT). Results indicated that the cognitive processing load was greater for ES and EL relative to normal speech. TE and normal speech did not differ in terms of RT ratio, suggesting fairly comparable cognitive demands placed on the listener. SS required greater cognitive processing load than normal and alaryngeal speech. The results are discussed relative to alaryngeal speech intelligibility and the role of the listener. Potential clinical applications and directions for future research are also presented.
An integrated approach to improving noisy speech perception

NASA Astrophysics Data System (ADS)

Koval, Serguei; Stolbov, Mikhail; Smirnova, Natalia; Khitrov, Mikhail

2002-05-01

For a number of practical purposes and tasks, experts have to decode speech recordings of very poor quality. A combination of techniques is proposed to improve intelligibility and quality of distorted speech messages and thus facilitate their comprehension. Along with the application of noise cancellation and speech signal enhancement techniques removing and/or reducing various kinds of distortions and interference (primarily unmasking and normalization in time and frequency fields), the approach incorporates optimal listener expert tactics based on selective listening, nonstandard binaural listening, accounting for short-term and long-term human ear adaptation to noisy speech, as well as some methods of speech signal enhancement to support speech decoding during listening. The approach integrating the suggested techniques ensures high-quality ultimate results and has successfully been applied by Speech Technology Center experts and by numerous other users, mainly forensic institutions, to perform noisy speech records decoding for courts, law enforcement and emergency services, accident investigation bodies, etc.
CTC Sentinel. Volume 8, Issue 9, September 2015

DTIC Science & Technology

2015-09-01

without com- promise, complacency, equivocation, or circumvention.”12 13 The speech caused concern across the Syrian opposition, many mem- bers of which...Bayda, slide 8. n Hamza bin Ladin gave an audio speech that was released on August 14 calling for lone wolf attacks against the United States and...the West, for example. “Al-Qaeda’s as-Sahab Media Releases Audio Speech from Hamza bin Laden,” SITE Intelligence Group, August 14, 2015. SEP TEMBER
Consequences of Stimulus Type on Higher-Order Processing in Single-Sided Deaf Cochlear Implant Users.

PubMed

Finke, Mareike; Sandmann, Pascale; Bönitz, Hanna; Kral, Andrej; Büchner, Andreas

2016-01-01

Single-sided deaf subjects with a cochlear implant (CI) provide the unique opportunity to compare central auditory processing of the electrical input (CI ear) and the acoustic input (normal-hearing, NH, ear) within the same individual. In these individuals, sensory processing differs between their two ears, while cognitive abilities are the same irrespectively of the sensory input. To better understand perceptual-cognitive factors modulating speech intelligibility with a CI, this electroencephalography study examined the central-auditory processing of words, the cognitive abilities, and the speech intelligibility in 10 postlingually single-sided deaf CI users. We found lower hit rates and prolonged response times for word classification during an oddball task for the CI ear when compared with the NH ear. Also, event-related potentials reflecting sensory (N1) and higher-order processing (N2/N4) were prolonged for word classification (targets versus nontargets) with the CI ear compared with the NH ear. Our results suggest that speech processing via the CI ear and the NH ear differs both at sensory (N1) and cognitive (N2/N4) processing stages, thereby affecting the behavioral performance for speech discrimination. These results provide objective evidence for cognition to be a key factor for speech perception under adverse listening conditions, such as the degraded speech signal provided from the CI. © 2016 S. Karger AG, Basel.
Development and evaluation of the British English coordinate response measure speech-in-noise test as an occupational hearing assessment tool.

PubMed

Semeraro, Hannah D; Rowan, Daniel; van Besouw, Rachel M; Allsopp, Adrian A

2017-10-01

The studies described in this article outline the design and development of a British English version of the coordinate response measure (CRM) speech-in-noise (SiN) test. Our interest in the CRM is as a SiN test with high face validity for occupational auditory fitness for duty (AFFD) assessment. Study 1 used the method of constant stimuli to measure and adjust the psychometric functions of each target word, producing a speech corpus with equal intelligibility. After ensuring all the target words had similar intelligibility, for Studies 2 and 3, the CRM was presented in an adaptive procedure in stationary speech-spectrum noise to measure speech reception thresholds and evaluate the test-retest reliability of the CRM SiN test. Studies 1 (n = 20) and 2 (n = 30) were completed by normal-hearing civilians. Study 3 (n = 22) was completed by hearing impaired military personnel. The results display good test-retest reliability (95% confidence interval (CI) < 2.1 dB) and concurrent validity when compared to the triple-digit test (r ≤ 0.65), and the CRM is sensitive to hearing impairment. The British English CRM using stationary speech-spectrum noise is a "ready to use" SiN test, suitable for investigation as an AFFD assessment tool for military personnel.
Audio-visual speech intelligibility benefits with bilateral cochlear implants when talker location varies.

PubMed

van Hoesel, Richard J M

2015-04-01

One of the key benefits of using cochlear implants (CIs) in both ears rather than just one is improved localization. It is likely that in complex listening scenes, improved localization allows bilateral CI users to orient toward talkers to improve signal-to-noise ratios and gain access to visual cues, but to date, that conjecture has not been tested. To obtain an objective measure of that benefit, seven bilateral CI users were assessed for both auditory-only and audio-visual speech intelligibility in noise using a novel dynamic spatial audio-visual test paradigm. For each trial conducted in spatially distributed noise, first, an auditory-only cueing phrase that was spoken by one of four talkers was selected and presented from one of four locations. Shortly afterward, a target sentence was presented that was either audio-visual or, in another test configuration, audio-only and was spoken by the same talker and from the same location as the cueing phrase. During the target presentation, visual distractors were added at other spatial locations. Results showed that in terms of speech reception thresholds (SRTs), the average improvement for bilateral listening over the better performing ear alone was 9 dB for the audio-visual mode, and 3 dB for audition-alone. Comparison of bilateral performance for audio-visual and audition-alone showed that inclusion of visual cues led to an average SRT improvement of 5 dB. For unilateral device use, no such benefit arose, presumably due to the greatly reduced ability to localize the target talker to acquire visual information. The bilateral CI speech intelligibility advantage over the better ear in the present study is much larger than that previously reported for static talker locations and indicates greater everyday speech benefits and improved cost-benefit than estimated to date.
Effects of the rate of formant-frequency variation on the grouping of formants in speech perception.

PubMed

Summers, Robert J; Bailey, Peter J; Roberts, Brian

2012-04-01

How speech is separated perceptually from other speech remains poorly understood. Recent research suggests that the ability of an extraneous formant to impair intelligibility depends on the modulation of its frequency, but not its amplitude, contour. This study further examined the effect of formant-frequency variation on intelligibility by manipulating the rate of formant-frequency change. Target sentences were synthetic three-formant (F1 + F2 + F3) analogues of natural utterances. Perceptual organization was probed by presenting stimuli dichotically (F1 + F2C + F3C; F2 + F3), where F2C + F3C constitute a competitor for F2 and F3 that listeners must reject to optimize recognition. Competitors were derived using formant-frequency contours extracted from extended passages spoken by the same talker and processed to alter the rate of formant-frequency variation, such that rate scale factors relative to the target sentences were 0, 0.25, 0.5, 1, 2, and 4 (0 = constant frequencies). Competitor amplitude contours were either constant, or time-reversed and rate-adjusted in parallel with the frequency contour. Adding a competitor typically reduced intelligibility; this reduction increased with competitor rate until the rate was at least twice that of the target sentences. Similarity in the results for the two amplitude conditions confirmed that formant amplitude contours do not influence across-formant grouping. The findings indicate that competitor efficacy is not tuned to the rate of the target sentences; most probably, it depends primarily on the overall rate of frequency variation in the competitor formants. This suggests that, when segregating the speech of concurrent talkers, differences in speech rate may not be a significant cue for across-frequency grouping of formants.
Multipath/RFI/modulation study for DRSS-RFI problem: Voice coding and intelligibility testing for a satellite-based air traffic control system

NASA Technical Reports Server (NTRS)

Birch, J. N.; Getzin, N.

1971-01-01

Analog and digital voice coding techniques for application to an L-band satellite-basedair traffic control (ATC) system for over ocean deployment are examined. In addition to performance, the techniques are compared on the basis of cost, size, weight, power consumption, availability, reliability, and multiplexing features. Candidate systems are chosen on the bases of minimum required RF bandwidth and received carrier-to-noise density ratios. A detailed survey of automated and nonautomated intelligibility testing methods and devices is presented and comparisons given. Subjective evaluation of speech system by preference tests is considered. Conclusion and recommendations are developed regarding the selection of the voice system. Likewise, conclusions and recommendations are developed for the appropriate use of intelligibility tests, speech quality measurements, and preference tests with the framework of the proposed ATC system.
Effects of music therapy in the treatment of children with delayed speech development - results of a pilot study.

PubMed

Gross, Wibke; Linden, Ulrike; Ostermann, Thomas

2010-07-21

Language development is one of the most significant processes of early childhood development. Children with delayed speech development are more at risk of acquiring other cognitive, social-emotional, and school-related problems. Music therapy appears to facilitate speech development in children, even within a short period of time. The aim of this pilot study is to explore the effects of music therapy in children with delayed speech development. A total of 18 children aged 3.5 to 6 years with delayed speech development took part in this observational study in which music therapy and no treatment were compared to demonstrate effectiveness. Individual music therapy was provided on an outpatient basis. An ABAB reversal design with alternations between music therapy and no treatment with an interval of approximately eight weeks between the blocks was chosen. Before and after each study period, a speech development test, a non-verbal intelligence test for children, and music therapy assessment scales were used to evaluate the speech development of the children. Compared to the baseline, we found a positive development in the study group after receiving music therapy. Both phonological capacity and the children's understanding of speech increased under treatment, as well as their cognitive structures, action patterns, and level of intelligence. Throughout the study period, developmental age converged with their biological age. Ratings according to the Nordoff-Robbins scales showed clinically significant changes in the children, namely in the areas of client-therapist relationship and communication. This study suggests that music therapy may have a measurable effect on the speech development of children through the treatment's interactions with fundamental aspects of speech development, including the ability to form and maintain relationships and prosodic abilities. Thus, music therapy may provide a basic and supportive therapy for children with delayed speech development. Further studies should be conducted to investigate the mechanisms of these interactions in greater depth. The trial is registered in the German clinical trials register; Trial-No.: DRKS00000343.
A method for determining internal noise criteria based on practical speech communication applied to helicopters

NASA Technical Reports Server (NTRS)

Sternfeld, H., Jr.; Doyle, L. B.

1978-01-01

The relationship between the internal noise environment of helicopters and the ability of personnel to understand commands and instructions was studied. A test program was conducted to relate speech intelligibility to a standard measurement called Articulation Index. An acoustical simulator was used to provide noise environments typical of Army helicopters. Speech material (command sentences and phonetically balanced word lists) were presented at several voice levels in each helicopter environment. Recommended helicopter internal noise criteria, based on speech communication, were derived and the effectiveness of hearing protection devices were evaluated.
Coding strategies for cochlear implants under adverse environments

NASA Astrophysics Data System (ADS)

Tahmina, Qudsia

Cochlear implants are electronic prosthetic devices that restores partial hearing in patients with severe to profound hearing loss. Although most coding strategies have significantly improved the perception of speech in quite listening conditions, there remains limitations on speech perception under adverse environments such as in background noise, reverberation and band-limited channels, and we propose strategies that improve the intelligibility of speech transmitted over the telephone networks, reverberated speech and speech in the presence of background noise. For telephone processed speech, we propose to examine the effects of adding low-frequency and high- frequency information to the band-limited telephone speech. Four listening conditions were designed to simulate the receiving frequency characteristics of telephone handsets. Results indicated improvement in cochlear implant and bimodal listening when telephone speech was augmented with high frequency information and therefore this study provides support for design of algorithms to extend the bandwidth towards higher frequencies. The results also indicated added benefit from hearing aids for bimodal listeners in all four types of listening conditions. Speech understanding in acoustically reverberant environments is always a difficult task for hearing impaired listeners. Reverberated sounds consists of direct sound, early reflections and late reflections. Late reflections are known to be detrimental to speech intelligibility. In this study, we propose a reverberation suppression strategy based on spectral subtraction to suppress the reverberant energies from late reflections. Results from listening tests for two reverberant conditions (RT60 = 0.3s and 1.0s) indicated significant improvement when stimuli was processed with SS strategy. The proposed strategy operates with little to no prior information on the signal and the room characteristics and therefore, can potentially be implemented in real-time CI speech processors. For speech in background noise, we propose a mechanism underlying the contribution of harmonics to the benefit of electroacoustic stimulations in cochlear implants. The proposed strategy is based on harmonic modeling and uses synthesis driven approach to synthesize the harmonics in voiced segments of speech. Based on objective measures, results indicated improvement in speech quality. This study warrants further work into development of algorithms to regenerate harmonics of voiced segments in the presence of noise.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.