Sample records for Danish speech intelligibility

  1. Speech intelligibility after glossectomy and speech rehabilitation.

    PubMed

    Furia, C L; Kowalski, L P; Latorre, M R; Angelis, E C; Martins, N M; Barros, A P; Ribeiro, K C

    2001-07-01

Oral tumor resections cause articulation deficiencies, depending on the site, extent of resection, type of reconstruction, and tongue stump mobility. To evaluate the speech intelligibility of patients undergoing total, subtotal, or partial glossectomy, before and after speech therapy. Twenty-seven patients (24 men and 3 women), aged 34 to 77 years (mean age, 56.5 years), underwent glossectomy. Tumor stages were T1 in 3 patients, T2 in 4, T3 in 8, T4 in 11, and TX in 1; node stages, N0 in 15 patients, N1 in 5, N2a-c in 6, and N3 in 1. No patient had metastases (M0). Patients were divided into 3 groups by extent of tongue resection, i.e., total (group 1; n = 6), subtotal (group 2; n = 9), and partial (group 3; n = 12). Several phonological tasks were recorded and analyzed by 3 experienced judges: the 7 sustained oral vowels, a vowel in a syllable, and the vowel-consonant-vowel (VCV) sequence. The intelligibility of spontaneous speech (sequence story) was scored from 1 to 4 by consensus. All patients underwent a therapeutic program to activate articulatory adaptations, compensations, and maximization of the remaining structures for 3 to 6 months. The tasks were recorded after speech therapy. To compare mean changes, analyses of variance and Wilcoxon tests were used. Patients of groups 1 and 2 significantly improved their speech intelligibility (P<.05). Group 1 improved vowels, VCV, and spontaneous speech; group 2, syllable, VCV, and spontaneous speech. Group 3 demonstrated better intelligibility in the pretherapy phase, but the improvement after therapy was not significant. Speech therapy was effective in improving speech intelligibility of patients undergoing glossectomy, even after major resection. Different pretherapy ability between groups was seen, with improvement of speech intelligibility in groups 1 and 2. The improvement of speech intelligibility in group 3 was not statistically significant, possibly because of the small and heterogeneous sample.

  2. Relationship between speech motor control and speech intelligibility in children with speech sound disorders.

    PubMed

    Namasivayam, Aravind Kumar; Pukonen, Margit; Goshulak, Debra; Yu, Vickie Y; Kadis, Darren S; Kroll, Robert; Pang, Elizabeth W; De Nil, Luc F

    2013-01-01

The current study was undertaken to investigate the impact of speech motor issues on the speech intelligibility of children with moderate to severe speech sound disorders (SSD) within the context of the PROMPT intervention approach. The word-level Children's Speech Intelligibility Measure (CSIM), the sentence-level Beginner's Intelligibility Test (BIT) and tests of speech motor control and articulation proficiency were administered to 12 children (3:11 to 6:7 years) before and after PROMPT therapy. PROMPT treatment was provided for 45 min twice a week for 8 weeks. Twenty-four naïve adult listeners aged 22-46 years judged the intelligibility of the words and sentences. For the CSIM, each time a recorded word was played, the listeners were asked to look at a list of 12 words (multiple-choice format) and circle the word they heard; for BIT sentences, the listeners were asked to write down everything they heard. Words correctly circled (CSIM) or transcribed (BIT) were averaged across three naïve judges to calculate percentage speech intelligibility. Speech intelligibility at both the word and sentence level was significantly correlated with speech motor control, but not articulatory proficiency. Further, the severity of speech motor planning and sequencing issues may be a limiting factor in connected speech intelligibility, which highlights the need to target these issues early and directly in treatment. The reader will be able to: (1) outline the advantages and disadvantages of using word- and sentence-level speech intelligibility tests; (2) describe the impact of speech motor control and articulatory proficiency on speech intelligibility; and (3) describe how speech motor control and speech intelligibility data may provide critical information to aid treatment planning. Copyright © 2013 Elsevier Inc. All rights reserved.

  3. Speech intelligibility in hospitals.

    PubMed

    Ryherd, Erica E; Moeller, Michael; Hsu, Timothy

    2013-07-01

Effective communication between staff members is key to patient safety in hospitals. A variety of patient care activities including admittance, evaluation, and treatment rely on oral communication. Surprisingly, published information on speech intelligibility in hospitals is extremely limited. In this study, speech intelligibility measurements and occupant evaluations were conducted in 20 units of five different U.S. hospitals. A variety of unit types and locations were studied. Results show that overall, no unit had "good" intelligibility based on the speech intelligibility index (SII > 0.75) and several locations were found to have "poor" intelligibility (SII < 0.45). Further, occupied spaces were found to have 10%-15% lower SII than unoccupied spaces on average. Additionally, staff perception of communication problems at nurse stations was significantly correlated with SII ratings. In a targeted second phase, a unit treated with sound absorption had higher SII ratings for a larger percentage of time as compared to an identical untreated unit. Taken as a whole, the study provides an extensive baseline evaluation of speech intelligibility across a variety of hospitals and unit types, offers some evidence of the positive impact of absorption on intelligibility, and identifies areas for future research.

  4. Intelligibility for Binaural Speech with Discarded Low-SNR Speech Components.

    PubMed

    Schoenmaker, Esther; van de Par, Steven

    2016-01-01

    Speech intelligibility in multitalker settings improves when the target speaker is spatially separated from the interfering speakers. A factor that may contribute to this improvement is the improved detectability of target-speech components due to binaural interaction in analogy to the Binaural Masking Level Difference (BMLD). This would allow listeners to hear target speech components within specific time-frequency intervals that have a negative SNR, similar to the improvement in the detectability of a tone in noise when these contain disparate interaural difference cues. To investigate whether these negative-SNR target-speech components indeed contribute to speech intelligibility, a stimulus manipulation was performed where all target components were removed when local SNRs were smaller than a certain criterion value. It can be expected that for sufficiently high criterion values target speech components will be removed that do contribute to speech intelligibility. For spatially separated speakers, assuming that a BMLD-like detection advantage contributes to intelligibility, degradation in intelligibility is expected already at criterion values below 0 dB SNR. However, for collocated speakers it is expected that higher criterion values can be applied without impairing speech intelligibility. Results show that degradation of intelligibility for separated speakers is only seen for criterion values of 0 dB and above, indicating a negligible contribution of a BMLD-like detection advantage in multitalker settings. These results show that the spatial benefit is related to a spatial separation of speech components at positive local SNRs rather than to a BMLD-like detection improvement for speech components at negative local SNRs.
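
    A minimal sketch of the kind of stimulus manipulation described above: zero out target-speech time-frequency components whose local SNR falls below a criterion, then resynthesize. The STFT settings and function names are illustrative assumptions, not the authors' processing chain.

    ```python
    # Remove target time-frequency components with local SNR below a criterion.
    # Illustrative sketch only; the study's analysis parameters are not given here.
    import numpy as np
    from scipy.signal import stft, istft

    def discard_low_snr_components(target, masker, fs, criterion_db=0.0):
        """Zero target STFT bins where local SNR < criterion_db, then resynthesize."""
        f, t, T = stft(target, fs, nperseg=512)
        _, _, M = stft(masker, fs, nperseg=512)
        eps = 1e-12
        local_snr_db = 10 * np.log10((np.abs(T) ** 2 + eps) / (np.abs(M) ** 2 + eps))
        T_kept = np.where(local_snr_db >= criterion_db, T, 0.0)
        _, target_mod = istft(T_kept, fs, nperseg=512)
        return target_mod  # mix with the masker afterwards for presentation
    ```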

  5. Intelligibility of clear speech: effect of instruction.

    PubMed

    Lam, Jennifer; Tjaden, Kris

    2013-10-01

    The authors investigated how clear speech instructions influence sentence intelligibility. Twelve speakers produced sentences in habitual, clear, hearing impaired, and overenunciate conditions. Stimuli were amplitude normalized and mixed with multitalker babble for orthographic transcription by 40 listeners. The main analysis investigated percentage-correct intelligibility scores as a function of the 4 conditions and speaker sex. Additional analyses included listener response variability, individual speaker trends, and an alternate intelligibility measure: proportion of content words correct. Relative to the habitual condition, the overenunciate condition was associated with the greatest intelligibility benefit, followed by the hearing impaired and clear conditions. Ten speakers followed this trend. The results indicated different patterns of clear speech benefit for male and female speakers. Greater listener variability was observed for speakers with inherently low habitual intelligibility compared to speakers with inherently high habitual intelligibility. Stable proportions of content words were observed across conditions. Clear speech instructions affected the magnitude of the intelligibility benefit. The instruction to overenunciate may be most effective in clear speech training programs. The findings may help explain the range of clear speech intelligibility benefit previously reported. Listener variability analyses suggested the importance of obtaining multiple listener judgments of intelligibility, especially for speakers with inherently low habitual intelligibility.

  6. Measuring up to speech intelligibility.

    PubMed

    Miller, Nick

    2013-01-01

    Improvement or maintenance of speech intelligibility is a central aim in a whole range of conditions in speech-language therapy, both developmental and acquired. Best clinical practice and pursuance of the evidence base for interventions would suggest measurement of intelligibility forms a vital role in clinical decision-making and monitoring. However, what should be measured to gauge intelligibility and how this is achieved and relates to clinical planning continues to be a topic of debate. This review considers the strengths and weaknesses of selected clinical approaches to intelligibility assessment, stressing the importance of explanatory, diagnostic testing as both a more sensitive and a clinically informative method. The worth of this, and any approach, is predicated, though, on awareness and control of key design, elicitation, transcription and listening/listener variables to maximize validity and reliability of assessments. These are discussed. A distinction is drawn between signal-dependent and -independent factors in intelligibility evaluation. Discussion broaches how these different perspectives might be reconciled to deliver comprehensive insights into intelligibility levels and their clinical/educational significance. The paper ends with a call for wider implementation of best practice around intelligibility assessment. © 2013 Royal College of Speech and Language Therapists.

  7. Speech Intelligibility Predicted from Neural Entrainment of the Speech Envelope.

    PubMed

    Vanthornhout, Jonas; Decruy, Lien; Wouters, Jan; Simon, Jonathan Z; Francart, Tom

    2018-04-01

Speech intelligibility is currently measured by scoring how well a person can identify a speech signal. The results of such behavioral measures reflect neural processing of the speech signal, but are also influenced by language processing, motivation, and memory. Very often, electrophysiological measures of hearing give insight into the neural processing of sound. However, in most methods, non-speech stimuli are used, making it hard to relate the results to behavioral measures of speech intelligibility. The use of natural running speech as a stimulus in electrophysiological measures of hearing is a paradigm shift which makes it possible to bridge the gap between behavioral and electrophysiological measures. Here, by decoding the speech envelope from the electroencephalogram, and correlating it with the stimulus envelope, we demonstrate an electrophysiological measure of neural processing of running speech. We show that behaviorally measured speech intelligibility is strongly correlated with our electrophysiological measure. Our results pave the way towards an objective and automatic way of assessing neural processing of speech presented through auditory prostheses, reducing confounds such as attention and cognitive capabilities. We anticipate that our electrophysiological measure will allow better differential diagnosis of the auditory system, and will allow the development of closed-loop auditory prostheses that automatically adapt to individual users.
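
    The measure described here amounts to reconstructing the speech envelope from multichannel EEG with a linear (backward) decoder and correlating the reconstruction with the true stimulus envelope. A minimal sketch follows, assuming a ridge-regularized time-lagged decoder; shapes, lag count, and regularizer are illustrative, not the authors' exact pipeline.

    ```python
    # Backward decoding: map time-lagged EEG to the speech envelope, then score
    # the reconstruction by Pearson correlation with the true envelope.
    import numpy as np

    def lagged(eeg, n_lags):
        """Stack time-lagged copies of EEG (time x channels) as decoder inputs."""
        T, C = eeg.shape
        X = np.zeros((T, C * n_lags))
        for k in range(n_lags):
            X[k:, k * C:(k + 1) * C] = eeg[:T - k]
        return X

    def decode_and_correlate(eeg, envelope, n_lags=32, ridge=1e3):
        X = lagged(eeg, n_lags)
        # Ridge-regularized least squares: w = (X'X + aI)^-1 X'y
        w = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ envelope)
        reconstructed = X @ w
        r = np.corrcoef(reconstructed, envelope)[0, 1]
        return r  # higher r ~ stronger neural tracking of the envelope
    ```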

  8. Optimizing acoustical conditions for speech intelligibility in classrooms

    NASA Astrophysics Data System (ADS)

    Yang, Wonyoung

High speech intelligibility is imperative in classrooms where verbal communication is critical. However, the optimal acoustical conditions to achieve a high degree of speech intelligibility have previously been investigated with inconsistent results, and practical room-acoustical solutions to optimize the acoustical conditions for speech intelligibility have not been developed. This experimental study validated auralization for speech-intelligibility testing, investigated the optimal reverberation for speech intelligibility for both normal and hearing-impaired listeners using more realistic room-acoustical models, and proposed an optimal sound-control design for speech intelligibility based on the findings. The auralization technique was used to perform subjective speech-intelligibility tests. The validation study, comparing auralization results with those of real classroom speech-intelligibility tests, found that if the room to be auralized is not very absorptive or noisy, speech-intelligibility tests using auralization are valid. The speech-intelligibility tests were done in two different auralized sound fields---approximately diffuse and non-diffuse---using the Modified Rhyme Test and both normal and hearing-impaired listeners. A hybrid room-acoustical prediction program was used throughout the work, and it and a 1/8 scale-model classroom were used to evaluate the effects of ceiling barriers and reflectors. For both subject groups, in approximately diffuse sound fields, when the speech source was closer to the listener than the noise source, the optimal reverberation time was zero. When the noise source was closer to the listener than the speech source, the optimal reverberation time was 0.4 s (with another peak at 0.0 s) with a relative output power level of the speech and noise sources of S/N = 5 dB, and 0.8 s with S/N = 0 dB. In non-diffuse sound fields, when the noise source was between the speaker and the listener, the optimal reverberation time was 0.6 s with

  9. Predicting speech intelligibility with a multiple speech subsystems approach in children with cerebral palsy.

    PubMed

    Lee, Jimin; Hustad, Katherine C; Weismer, Gary

    2014-10-01

    Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Nine acoustic variables reflecting different subsystems, and speech intelligibility, were measured in 22 children with CP. These children included 13 with a clinical diagnosis of dysarthria (speech motor impairment [SMI] group) and 9 judged to be free of dysarthria (no SMI [NSMI] group). Data from children with CP were compared to data from age-matched typically developing children. Multiple acoustic variables reflecting the articulatory subsystem were different in the SMI group, compared to the NSMI and typically developing groups. A significant speech intelligibility prediction model was obtained with all variables entered into the model (adjusted R2 = .801). The articulatory subsystem showed the most substantial independent contribution (58%) to speech intelligibility. Incremental R2 analyses revealed that any single variable explained less than 9% of speech intelligibility variability. Children in the SMI group had articulatory subsystem problems as indexed by acoustic measures. As in the adult literature, the articulatory subsystem makes the primary contribution to speech intelligibility variance in dysarthria, with minimal or no contribution from other systems.
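
    The reported statistics (adjusted R² for the full model and each variable's incremental R²) can be reproduced with ordinary least squares. A minimal sketch, on placeholder data rather than the study's measurements:

    ```python
    # Fit intelligibility on acoustic predictors; compute adjusted R^2 for the
    # full model and each predictor's incremental R^2 (gain when entered last).
    import numpy as np

    def adjusted_r2(y, X):
        X1 = np.column_stack([np.ones(len(y)), X])
        beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
        r2 = 1 - (y - X1 @ beta).var() / y.var()
        n, p = X.shape
        return 1 - (1 - r2) * (n - 1) / (n - p - 1)

    def incremental_r2(y, X):
        """R^2 lost when each predictor is dropped from the full model."""
        def plain_r2(Xs):
            X1 = np.column_stack([np.ones(len(y)), Xs])
            beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
            return 1 - (y - X1 @ beta).var() / y.var()
        full = plain_r2(X)
        return [full - plain_r2(np.delete(X, j, axis=1)) for j in range(X.shape[1])]
    ```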

  10. Chinese speech intelligibility and its relationship with the speech transmission index for children in elementary school classrooms.

    PubMed

    Peng, Jianxin; Yan, Nanjie; Wang, Dan

    2015-01-01

    The present study investigated Chinese speech intelligibility in 28 classrooms from nine different elementary schools in Guangzhou, China. The subjective Chinese speech intelligibility in the classrooms was evaluated with children in grades 2, 4, and 6 (7 to 12 years old). Acoustical measurements were also performed in these classrooms. Subjective Chinese speech intelligibility scores and objective speech intelligibility parameters, such as speech transmission index (STI), were obtained at each listening position for all tests. The relationship between subjective Chinese speech intelligibility scores and STI was revealed and analyzed. The effects of age on Chinese speech intelligibility scores were compared. Results indicate high correlations between subjective Chinese speech intelligibility scores and STI for grades 2, 4, and 6 children. Chinese speech intelligibility scores increase with increase of age under the same STI condition. The differences in scores among different age groups decrease as STI increases. To achieve 95% Chinese speech intelligibility scores, the STIs required for grades 2, 4, and 6 children are 0.75, 0.69, and 0.63, respectively.

  11. Predicting Speech Intelligibility with A Multiple Speech Subsystems Approach in Children with Cerebral Palsy

    PubMed Central

    Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

    2014-01-01

    Purpose Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystem approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method Nine acoustic variables reflecting different subsystems, and speech intelligibility, were measured in 22 children with CP. These children included 13 with a clinical diagnosis of dysarthria (SMI), and nine judged to be free of dysarthria (NSMI). Data from children with CP were compared to data from age-matched typically developing children (TD). Results Multiple acoustic variables reflecting the articulatory subsystem were different in the SMI group, compared to the NSMI and TD groups. A significant speech intelligibility prediction model was obtained with all variables entered into the model (Adjusted R-squared = .801). The articulatory subsystem showed the most substantial independent contribution (58%) to speech intelligibility. Incremental R-squared analyses revealed that any single variable explained less than 9% of speech intelligibility variability. Conclusions Children in the SMI group have articulatory subsystem problems as indexed by acoustic measures. As in the adult literature, the articulatory subsystem makes the primary contribution to speech intelligibility variance in dysarthria, with minimal or no contribution from other systems. PMID:24824584

  12. Relationship Between Speech Intelligibility and Speech Comprehension in Babble Noise.

    PubMed

    Fontan, Lionel; Tardieu, Julien; Gaillard, Pascal; Woisard, Virginie; Ruiz, Robert

    2015-06-01

    The authors investigated the relationship between the intelligibility and comprehension of speech presented in babble noise. Forty participants listened to French imperative sentences (commands for moving objects) in a multitalker babble background for which intensity was experimentally controlled. Participants were instructed to transcribe what they heard and obey the commands in an interactive environment set up for this purpose. The former test provided intelligibility scores and the latter provided comprehension scores. Collected data revealed a globally weak correlation between intelligibility and comprehension scores (r = .35, p < .001). The discrepancy tended to grow as noise level increased. An analysis of standard deviations showed that variability in comprehension scores increased linearly with noise level, whereas higher variability in intelligibility scores was found for moderate noise level conditions. These results support the hypothesis that intelligibility scores are poor predictors of listeners' comprehension in real communication situations. Intelligibility and comprehension scores appear to provide different insights, the first measure being centered on speech signal transfer and the second on communicative performance. Both theoretical and practical implications for the use of speech intelligibility tests as indicators of speakers' performances are discussed.

  13. Speech Intelligibility

    NASA Astrophysics Data System (ADS)

    Brand, Thomas

Speech intelligibility (SI) is important for different fields of research, engineering, and diagnostics in order to quantify very different phenomena, such as the quality of recordings, communication and playback devices, the reverberation of auditoria, characteristics of hearing impairment, the benefit of using hearing aids, or combinations of these.

  14. Segmental intelligibility of synthetic speech produced by rule.

    PubMed

    Logan, J S; Greene, B G; Pisoni, D B

    1989-08-01

    This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk--Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener's processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener.

  15. Segmental intelligibility of synthetic speech produced by rule

    PubMed Central

    Logan, John S.; Greene, Beth G.; Pisoni, David B.

    2012-01-01

    This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk—Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener’s processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener. PMID:2527884

  16. Effects of interior aircraft noise on speech intelligibility and annoyance

    NASA Technical Reports Server (NTRS)

    Pearsons, K. S.; Bennett, R. L.

    1977-01-01

Recordings of the aircraft ambiance from ten different types of aircraft were used in conjunction with four distinct speech interference tests as stimuli to determine the effects of interior aircraft background levels and speech intelligibility on perceived annoyance in 36 subjects. Both speech intelligibility and background level significantly affected judged annoyance. However, the interaction between the two variables showed that above an 85 dB background level the speech intelligibility results had a minimal effect on annoyance ratings. Below this level, people rated the background as less annoying if there was adequate speech intelligibility.

  17. Speech Intelligibility and Hearing Protector Selection

    DTIC Science & Technology

    2016-08-29

Another nonstandardized speech intelligibility test relevant to military environments is the Coordinate Response Measure (CRM) ... developed by the U.S. Air Force Research Laboratory (Bolia, Nelson, Ericson, and Simpson, 2000). The phrases in the CRM are comprised of a call ... detections and the percentage of correctly identified color-number combinations. The CRM is particularly useful in evaluating speech intelligibility over

  18. Speech Intelligibility Advantages using an Acoustic Beamformer Display

    NASA Technical Reports Server (NTRS)

    Begault, Durand R.; Sunder, Kaushik; Godfroy, Martine; Otto, Peter

    2015-01-01

    A speech intelligibility test conforming to the Modified Rhyme Test of ANSI S3.2 "Method for Measuring the Intelligibility of Speech Over Communication Systems" was conducted using a prototype 12-channel acoustic beamformer system. The target speech material (signal) was identified against speech babble (noise), with calculated signal-noise ratios of 0, 5 and 10 dB. The signal was delivered at a fixed beam orientation of 135 deg (re 90 deg as the frontal direction of the array) and the noise at 135 deg (co-located) and 0 deg (separated). A significant improvement in intelligibility from 57% to 73% was found for spatial separation for the same signal-noise ratio (0 dB). Significant effects for improved intelligibility due to spatial separation were also found for higher signal-noise ratios (5 and 10 dB).

  19. The Pathways for Intelligible Speech: Multivariate and Univariate Perspectives

    PubMed Central

    Evans, S.; Kyong, J.S.; Rosen, S.; Golestani, N.; Warren, J.E.; McGettigan, C.; Mourão-Miranda, J.; Wise, R.J.S.; Scott, S.K.

    2014-01-01

An anterior pathway, concerned with extracting meaning from sound, has been identified in nonhuman primates. An analogous pathway has been suggested in humans, but controversy exists concerning the degree of lateralization and the precise location where responses to intelligible speech emerge. We have demonstrated that the left anterior superior temporal sulcus (STS) responds preferentially to intelligible speech (Scott SK, Blank CC, Rosen S, Wise RJS. 2000. Identification of a pathway for intelligible speech in the left temporal lobe. Brain. 123:2400–2406.). A functional magnetic resonance imaging study in Cerebral Cortex used equivalent stimuli and univariate and multivariate analyses to argue for the greater importance of bilateral posterior when compared with the left anterior STS in responding to intelligible speech (Okada K, Rong F, Venezia J, Matchin W, Hsieh IH, Saberi K, Serences JT, Hickok G. 2010. Hierarchical organization of human auditory cortex: evidence from acoustic invariance in the response to intelligible speech. Cereb Cortex. 20:2486–2495.). Here, we also replicate our original study, demonstrating that the left anterior STS exhibits the strongest univariate response and, in decoding using the bilateral temporal cortex, contains the most informative voxels showing an increased response to intelligible speech. In contrast, in classifications using local “searchlights” and a whole brain analysis, we find greater classification accuracy in posterior rather than anterior temporal regions. Thus, we show that the precise nature of the multivariate analysis used will emphasize different response profiles associated with complex sound to speech processing. PMID:23585519

  20. Effects of intelligibility on working memory demand for speech perception.

    PubMed

    Francis, Alexander L; Nusbaum, Howard C

    2009-08-01

    Understanding low-intelligibility speech is effortful. In three experiments, we examined the effects of intelligibility on working memory (WM) demands imposed by perception of synthetic speech. In all three experiments, a primary speeded word recognition task was paired with a secondary WM-load task designed to vary the availability of WM capacity during speech perception. Speech intelligibility was varied either by training listeners to use available acoustic cues in a more diagnostic manner (as in Experiment 1) or by providing listeners with more informative acoustic cues (i.e., better speech quality, as in Experiments 2 and 3). In the first experiment, training significantly improved intelligibility and recognition speed; increasing WM load significantly slowed recognition. A significant interaction between training and load indicated that the benefit of training on recognition speed was observed only under low memory load. In subsequent experiments, listeners received no training; intelligibility was manipulated by changing synthesizers. Improving intelligibility without training improved recognition accuracy, and increasing memory load still decreased it, but more intelligible speech did not produce more efficient use of available WM capacity. This suggests that perceptual learning modifies the way available capacity is used, perhaps by increasing the use of more phonetically informative features and/or by decreasing use of less informative ones.

  1. Effects of linear and nonlinear speech rate changes on speech intelligibility in stationary and fluctuating maskers

    PubMed Central

    Cooke, Martin; Aubanel, Vincent

    2017-01-01

    Algorithmic modifications to the durational structure of speech designed to avoid intervals of intense masking lead to increases in intelligibility, but the basis for such gains is not clear. The current study addressed the possibility that the reduced information load produced by speech rate slowing might explain some or all of the benefits of durational modifications. The study also investigated the influence of masker stationarity on the effectiveness of durational changes. Listeners identified keywords in sentences that had undergone linear and nonlinear speech rate changes resulting in overall temporal lengthening in the presence of stationary and fluctuating maskers. Relative to unmodified speech, a slower speech rate produced no intelligibility gains for the stationary masker, suggesting that a reduction in information rate does not underlie intelligibility benefits of durationally modified speech. However, both linear and nonlinear modifications led to substantial intelligibility increases in fluctuating noise. One possibility is that overall increases in speech duration provide no new phonetic information in stationary masking conditions, but that temporal fluctuations in the background increase the likelihood of glimpsing additional salient speech cues. Alternatively, listeners may have benefitted from an increase in the difference in speech rates between the target and background. PMID:28618803

  2. Aircraft noise and speech intelligibility in an outdoor living space.

    PubMed

    Alvarsson, Jesper J; Nordström, Henrik; Lundén, Peter; Nilsson, Mats E

    2014-06-01

Studies of effects on speech intelligibility from aircraft noise in outdoor places are currently lacking. To explore these effects, first-order ambisonic recordings of aircraft noise were reproduced outdoors in a pergola. The average background level was 47 dB LAeq. Lists of phonetically balanced words (LASmax,word = 54 dB) were reproduced simultaneously with aircraft passage noise (LASmax,noise = 72-84 dB). Twenty individually tested listeners wrote down each presented word while seated in the pergola. The main results were (i) aircraft noise negatively affects speech intelligibility at sound pressure levels that exceed those of the speech sound (signal-to-noise ratio, S/N < 0), and (ii) the simple A-weighted S/N ratio was nearly as good an indicator of speech intelligibility as were two more advanced models, the Speech Intelligibility Index and Glasberg and Moore's [J. Audio Eng. Soc. 53, 906-918 (2005)] partial loudness model. This suggests that any of these indicators is applicable for predicting effects of aircraft noise on speech intelligibility outdoors.

  3. Improving the speech intelligibility in classrooms

    NASA Astrophysics Data System (ADS)

    Lam, Choi Ling Coriolanus

One of the major acoustical concerns in classrooms is the establishment of effective verbal communication between teachers and students. Non-optimal acoustical conditions, resulting in reduced verbal communication, cause two main problems. First, they can lead to reduced learning efficiency. Second, they can cause fatigue, stress, vocal strain, and health problems, such as headaches and sore throats, among teachers who are forced to compensate for poor acoustical conditions by raising their voices. In addition, inadequate acoustical conditions can encourage the use of public address systems, and improper use of such amplifiers or loudspeakers can impair students' hearing. The social costs of poor classroom acoustics, in terms of impaired learning among children, are large. This invisible problem has far-reaching implications for learning, but is easily solved. Many studies have been carried out, and their findings on classroom acoustics have been accurately and concisely summarized. However, a number of challenging questions remain unanswered. Most objective indices for speech intelligibility are essentially based on studies of western languages. Although several studies of tonal languages such as Mandarin have been conducted, there is much less work on Cantonese. In this research, measurements were done in unoccupied rooms to investigate the acoustical parameters and characteristics of the classrooms. Speech intelligibility tests based on English, Mandarin, and Cantonese, together with a survey, were carried out on students aged from 5 to 22 years. The research aims to investigate the differences in intelligibility between English, Mandarin, and Cantonese in Hong Kong classrooms. The relationship between the speech transmission index (STI) and Phonetically Balanced (PB) word scores is further developed, together with an empirical relationship between the speech intelligibility in classrooms and the variations

  4. Acceptable noise level (ANL) with Danish and non-semantic speech materials in adult hearing-aid users.

    PubMed

    Olsen, Steen Østergaard; Lantz, Johannes; Nielsen, Lars Holme; Brännström, K Jonas

    2012-09-01

The acceptable noise level (ANL) test is used for quantification of the amount of background noise subjects accept when listening to speech. This study investigates Danish hearing-aid users' ANL performance using Danish and non-semantic speech signals, the repeatability of ANL, and the association between ANL and outcome of the international outcome inventory for hearing aids (IOI-HA). ANL was measured in three conditions in both ears at two test sessions. Subjects completed the IOI-HA and the ANL questionnaire. Sixty-three Danish hearing-aid users participated; 57 were full-time users and 6 were part-time/non-users of hearing aids according to the ANL questionnaire. ANLs were similar to results with American English speech material. The coefficient of repeatability (CR) was 6.5-8.8 dB. IOI-HA scores were not associated with ANL. Danish and non-semantic ANL versions yield results similar to the American English version. The magnitude of the CR indicates that ANL with Danish and non-semantic speech materials is not suitable for prediction of individual patterns of future hearing-aid use or evaluation of individual benefit from hearing-aid features. The ANL with Danish and non-semantic speech materials is not related to IOI-HA outcome.

  5. Characterizing Speech Intelligibility in Noise After Wide Dynamic Range Compression.

    PubMed

    Rhebergen, Koenraad S; Maalderink, Thijs H; Dreschler, Wouter A

The effects of nonlinear signal processing on speech intelligibility in noise are difficult to evaluate. Often, the effects are examined by comparing speech intelligibility scores with and without processing measured at fixed signal to noise ratios (SNRs) or by comparing the adaptive measured speech reception thresholds corresponding to 50% intelligibility (SRT50) with and without processing. These outcome measures might not be optimal. Measuring at fixed SNRs can be affected by ceiling or floor effects, because the range of relevant SNRs is not known in advance. The SRT50 is less time consuming and has a fixed performance level (i.e., 50% correct), but the SRT50 could give a limited view, because we hypothesize that the effect of most nonlinear signal processing algorithms at the SRT50 cannot be generalized to other points of the psychometric function. In this article, we tested the value of estimating the entire psychometric function. We studied the effect of wide dynamic range compression (WDRC) on speech intelligibility in stationary, and interrupted speech-shaped noise in normal-hearing subjects, using a fast method based on a local linear fitting approach and two adaptive procedures. The measured performance differences for conditions with and without WDRC for the psychometric functions in stationary noise and interrupted speech-shaped noise show that the effects of WDRC on speech intelligibility are SNR dependent. We conclude that favorable and unfavorable effects of WDRC on speech intelligibility can be missed if the results are presented in terms of SRT50 values only.
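
    The contrast the authors draw between a single SRT50 and the entire psychometric function can be made concrete by fitting percent-correct scores across SNRs; the SRT50 and the slope at the 50% point then fall out of the fit. A hedged sketch with a logistic form and hypothetical data (the paper's local linear fitting method is not reproduced here):

    ```python
    # Fit a logistic psychometric function to percent-correct vs SNR and read off
    # the SRT50 and the slope at the 50% point. Data values are illustrative.
    import numpy as np
    from scipy.optimize import curve_fit

    def logistic(snr, srt50, slope):
        # slope is in proportion-correct per dB at the 50% point
        return 1.0 / (1.0 + np.exp(-4.0 * slope * (snr - srt50)))

    snr = np.array([-12., -9., -6., -3., 0., 3.])
    p_correct = np.array([0.05, 0.18, 0.45, 0.71, 0.90, 0.97])  # hypothetical
    (srt50, slope), _ = curve_fit(logistic, snr, p_correct, p0=[-6.0, 0.1])
    print(f"SRT50 = {srt50:.1f} dB SNR, slope at 50% = {100 * slope:.1f} %/dB")
    ```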

  6. Variability and Diagnostic Accuracy of Speech Intelligibility Scores in Children

    ERIC Educational Resources Information Center

    Hustad, Katherine C.; Oakes, Ashley; Allison, Kristen

    2015-01-01

    Purpose: We examined variability of speech intelligibility scores and how well intelligibility scores predicted group membership among 5-year-old children with speech motor impairment (SMI) secondary to cerebral palsy and an age-matched group of typically developing (TD) children. Method: Speech samples varying in length from 1-4 words were…

  7. Quantifying the intelligibility of speech in noise for non-native listeners.

    PubMed

    van Wijngaarden, Sander J; Steeneken, Herman J M; Houtgast, Tammo

    2002-04-01

When listening to languages learned at a later age, speech intelligibility is generally lower than when listening to one's native language. The main purpose of this study is to quantify speech intelligibility in noise for specific populations of non-native listeners, only broadly addressing the underlying perceptual and linguistic processing. An easy method is sought to extend these quantitative findings to other listener populations. Dutch subjects listening to German and English speech, ranging from reasonable to excellent proficiency in these languages, were found to require a 1-7 dB better speech-to-noise ratio to obtain 50% sentence intelligibility than native listeners. Also, the psychometric function for sentence recognition in noise was found to be shallower for non-native than for native listeners (worst-case slope around the 50% point of 7.5%/dB, compared to 12.6%/dB for native listeners). Differences between native and non-native speech intelligibility are largely predicted by linguistic entropy estimates as derived from a letter guessing task. Less effective use of context effects (especially semantic redundancy) explains the reduced speech intelligibility for non-native listeners. While measuring speech intelligibility for many different populations of listeners (languages, linguistic experience) may be prohibitively time consuming, obtaining predictions of non-native intelligibility from linguistic entropy may help to extend the results of this study to other listener populations.

  8. Quantifying the intelligibility of speech in noise for non-native listeners

    NASA Astrophysics Data System (ADS)

    van Wijngaarden, Sander J.; Steeneken, Herman J. M.; Houtgast, Tammo

    2002-04-01

When listening to languages learned at a later age, speech intelligibility is generally lower than when listening to one's native language. The main purpose of this study is to quantify speech intelligibility in noise for specific populations of non-native listeners, only broadly addressing the underlying perceptual and linguistic processing. An easy method is sought to extend these quantitative findings to other listener populations. Dutch subjects listening to German and English speech, ranging from reasonable to excellent proficiency in these languages, were found to require a 1-7 dB better speech-to-noise ratio to obtain 50% sentence intelligibility than native listeners. Also, the psychometric function for sentence recognition in noise was found to be shallower for non-native than for native listeners (worst-case slope around the 50% point of 7.5%/dB, compared to 12.6%/dB for native listeners). Differences between native and non-native speech intelligibility are largely predicted by linguistic entropy estimates as derived from a letter guessing task. Less effective use of context effects (especially semantic redundancy) explains the reduced speech intelligibility for non-native listeners. While measuring speech intelligibility for many different populations of listeners (languages, linguistic experience) may be prohibitively time consuming, obtaining predictions of non-native intelligibility from linguistic entropy may help to extend the results of this study to other listener populations.

  9. Automated Intelligibility Assessment of Pathological Speech Using Phonological Features

    NASA Astrophysics Data System (ADS)

    Middag, Catherine; Martens, Jean-Pierre; Van Nuffelen, Gwen; De Bodt, Marc

    2009-12-01

It is commonly acknowledged that word or phoneme intelligibility is an important criterion in the assessment of the communication efficiency of a pathological speaker. People have therefore put a lot of effort into the design of perceptual intelligibility rating tests. These tests usually have the drawback that they employ unnatural speech material (e.g., nonsense words) and that they cannot fully exclude errors due to listener bias. Therefore, there is a growing interest in the application of objective automatic speech recognition technology to automate the intelligibility assessment. Current research is headed towards the design of automated methods which can be shown to produce ratings that correspond well with those emerging from a well-designed and well-performed perceptual test. In this paper, a novel methodology that is built on previous work (Middag et al., 2008) is presented. It utilizes phonological features, automatic speech alignment based on acoustic models that were trained on normal speech, context-dependent speaker feature extraction, and intelligibility prediction based on a small model that can be trained on pathological speech samples. The experimental evaluation of the new system reveals that the root mean squared error of the discrepancies between perceived and computed intelligibilities can be as low as 8 on a scale of 0 to 100.
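
    The final stage of the described pipeline, a small model trained on pathological samples that maps per-speaker feature vectors to an intelligibility rating evaluated by RMSE, might look roughly like the following. The regressor choice and cross-validation setup are assumptions for illustration; feature extraction (alignment, acoustic models) is taken as given.

    ```python
    # Predict perceived intelligibility (0-100) from per-speaker phonological
    # feature vectors and report cross-validated RMSE.
    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_predict

    def evaluate_predictor(X, y_perceived):
        """X: speakers x phonological features; y: perceived scores (0-100)."""
        model = Ridge(alpha=1.0)  # small model, trainable on few samples
        y_pred = cross_val_predict(model, X, y_perceived, cv=5)
        rmse = np.sqrt(np.mean((y_pred - y_perceived) ** 2))
        return rmse  # the paper reports RMSE as low as ~8 on this scale
    ```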

  10. Predicting Speech Intelligibility Decline in Amyotrophic Lateral Sclerosis Based on the Deterioration of Individual Speech Subsystems

    PubMed Central

    Yunusova, Yana; Wang, Jun; Zinman, Lorne; Pattee, Gary L.; Berry, James D.; Perry, Bridget; Green, Jordan R.

    2016-01-01

    Purpose To determine the mechanisms of speech intelligibility impairment due to neurologic impairments, intelligibility decline was modeled as a function of co-occurring changes in the articulatory, resonatory, phonatory, and respiratory subsystems. Method Sixty-six individuals diagnosed with amyotrophic lateral sclerosis (ALS) were studied longitudinally. The disease-related changes in articulatory, resonatory, phonatory, and respiratory subsystems were quantified using multiple instrumental measures, which were subjected to a principal component analysis and mixed effects models to derive a set of speech subsystem predictors. A stepwise approach was used to select the best set of subsystem predictors to model the overall decline in intelligibility. Results Intelligibility was modeled as a function of five predictors that corresponded to velocities of lip and jaw movements (articulatory), number of syllable repetitions in the alternating motion rate task (articulatory), nasal airflow (resonatory), maximum fundamental frequency (phonatory), and speech pauses (respiratory). The model accounted for 95.6% of the variance in intelligibility, among which the articulatory predictors showed the most substantial independent contribution (57.7%). Conclusion Articulatory impairments characterized by reduced velocities of lip and jaw movements and resonatory impairments characterized by increased nasal airflow served as the subsystem predictors of the longitudinal decline of speech intelligibility in ALS. Declines in maximum performance tasks such as the alternating motion rate preceded declines in intelligibility, thus serving as early predictors of bulbar dysfunction. Following the rapid decline in speech intelligibility, a precipitous decline in maximum performance tasks subsequently occurred. PMID:27148967
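
    The predictor-selection step described in the Method (subsystem predictors derived from instrumental measures, then a stepwise search for the best subset) can be sketched as follows. PCA plus forward selection scored by cross-validated R² stands in for the paper's principal components and mixed-effects machinery, which are not reproduced here.

    ```python
    # Derive candidate predictors via PCA, then greedily add the predictor that
    # most improves cross-validated R^2, up to a fixed subset size.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    def forward_stepwise(X, y, max_predictors=5):
        chosen, remaining = [], list(range(X.shape[1]))
        while remaining and len(chosen) < max_predictors:
            scores = [(cross_val_score(LinearRegression(), X[:, chosen + [j]],
                                       y, cv=5, scoring="r2").mean(), j)
                      for j in remaining]
            _, best_j = max(scores)
            chosen.append(best_j)
            remaining.remove(best_j)
        return chosen

    # Hypothetical usage:
    # components = PCA(n_components=9).fit_transform(instrumental_measures)
    # selected = forward_stepwise(components, intelligibility)
    ```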

  11. Predicting Speech Intelligibility Decline in Amyotrophic Lateral Sclerosis Based on the Deterioration of Individual Speech Subsystems.

    PubMed

    Rong, Panying; Yunusova, Yana; Wang, Jun; Zinman, Lorne; Pattee, Gary L; Berry, James D; Perry, Bridget; Green, Jordan R

    2016-01-01

    To determine the mechanisms of speech intelligibility impairment due to neurologic impairments, intelligibility decline was modeled as a function of co-occurring changes in the articulatory, resonatory, phonatory, and respiratory subsystems. Sixty-six individuals diagnosed with amyotrophic lateral sclerosis (ALS) were studied longitudinally. The disease-related changes in articulatory, resonatory, phonatory, and respiratory subsystems were quantified using multiple instrumental measures, which were subjected to a principal component analysis and mixed effects models to derive a set of speech subsystem predictors. A stepwise approach was used to select the best set of subsystem predictors to model the overall decline in intelligibility. Intelligibility was modeled as a function of five predictors that corresponded to velocities of lip and jaw movements (articulatory), number of syllable repetitions in the alternating motion rate task (articulatory), nasal airflow (resonatory), maximum fundamental frequency (phonatory), and speech pauses (respiratory). The model accounted for 95.6% of the variance in intelligibility, among which the articulatory predictors showed the most substantial independent contribution (57.7%). Articulatory impairments characterized by reduced velocities of lip and jaw movements and resonatory impairments characterized by increased nasal airflow served as the subsystem predictors of the longitudinal decline of speech intelligibility in ALS. Declines in maximum performance tasks such as the alternating motion rate preceded declines in intelligibility, thus serving as early predictors of bulbar dysfunction. Following the rapid decline in speech intelligibility, a precipitous decline in maximum performance tasks subsequently occurred.

  12. Speech intelligibility at high helium-oxygen pressures.

    PubMed

    Rothman, H B; Gelfand, R; Hollien, H; Lambertsen, C J

    1980-12-01

Word-list intelligibility scores of unprocessed speech (mean of 4 subjects) were recorded in helium-oxygen atmospheres at stable pressures equivalent to 1600, 1400, 1200, 1000, 860, 690, 560, 392, and 200 fsw during Predictive Studies IV-1975, using wide-bandwidth condenser microphones (frequency responses not degraded by increased gas density). Intelligibility scores were substantially lower in helium-oxygen at 200 fsw than in air at 1 ATA, but there was little difference between 200 fsw and 1600 fsw. A previously documented prominent decrease in intelligibility of speech between 200 and 600 fsw attributed to helium and pressure was probably due to degradation of microphone frequency response by high gas density.

  13. The Effect of Background Noise on Intelligibility of Dysphonic Speech

    ERIC Educational Resources Information Center

    Ishikawa, Keiko; Boyce, Suzanne; Kelchner, Lisa; Powell, Maria Golla; Schieve, Heidi; de Alarcon, Alessandro; Khosla, Sid

    2017-01-01

    Purpose: The aim of this study is to determine the effect of background noise on the intelligibility of dysphonic speech and to examine the relationship between intelligibility in noise and an acoustic measure of dysphonia--cepstral peak prominence (CPP). Method: A study of speech perception was conducted using speech samples from 6 adult speakers…

  14. Predicting Speech Intelligibility with a Multiple Speech Subsystems Approach in Children with Cerebral Palsy

    ERIC Educational Resources Information Center

    Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

    2014-01-01

    Purpose: Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method: Nine acoustic variables reflecting different subsystems, and…

  15. Motivation as a predictor of speech intelligibility after total laryngectomy.

    PubMed

    Singer, Susanne; Meyer, Alexandra; Fuchs, Michael; Schock, Juliane; Pabst, Friedemann; Vogel, Hans-Joachim; Oeken, Jens; Sandner, Annett; Koscielny, Sven; Hormes, Karl; Breitenstein, Kerstin; Dietz, Andreas

    2013-06-01

It has often been argued that if patients' success with speech rehabilitation after laryngectomy is limited, it is the result of a lack of motivation on their part. This project investigated the role of motivation in speech rehabilitation. In a multicenter prospective cohort study, 141 laryngectomees were interviewed at the beginning of rehabilitation and 1 year after laryngectomy. Speech intelligibility was measured with a standardized test, and patients self-assessed their own motivation shortly after the surgery. Logistic regression, adjusted for several theory-based confounding factors, was used to assess the impact of motivation on speech intelligibility. Speech intelligibility 1 year after laryngectomy was not significantly associated with the level of motivation at the beginning of rehabilitation (odds ratio [OR], 1.3; 95% confidence interval [CI], 0.7-2.3; p = .43) after adjusting for the effect of potential confounders (implantation of a voice prosthesis, patient's cognitive abilities, frustration tolerance, physical functioning, and type of rehabilitation). Motivation is not a strong predictor of speech intelligibility 1 year after laryngectomy. Copyright © 2012 Wiley Periodicals, Inc.

  16. Enhancing Speech Intelligibility: Interactions among Context, Modality, Speech Style, and Masker

    ERIC Educational Resources Information Center

    Van Engen, Kristin J.; Phelps, Jasmine E. B.; Smiljanic, Rajka; Chandrasekaran, Bharath

    2014-01-01

    Purpose: The authors sought to investigate interactions among intelligibility-enhancing speech cues (i.e., semantic context, clearly produced speech, and visual information) across a range of masking conditions. Method: Sentence recognition in noise was assessed for 29 normal-hearing listeners. Testing included semantically normal and anomalous…

  17. Effect of classroom acoustics on the speech intelligibility of students.

    PubMed

    Rabelo, Alessandra Terra Vasconcelos; Santos, Juliana Nunes; Oliveira, Rafaella Cristina; Magalhães, Max de Castro

    2014-01-01

    To analyze the acoustic parameters of classrooms and the relationship among equivalent sound pressure level (Leq), reverberation time (T₃₀), the Speech Transmission Index (STI), and the performance of students in speech intelligibility testing. A cross-sectional descriptive study, which analyzed the acoustic performance of 18 classrooms in 9 public schools in Belo Horizonte, Minas Gerais, Brazil, was conducted. The following acoustic parameters were measured: Leq, T₃₀, and the STI. In the schools evaluated, a speech intelligibility test was performed on 273 students, 45.4% of whom were boys, with an average age of 9.4 years. The results of the speech intelligibility test were compared to the values of the acoustic parameters with the help of Student's t-test. The Leq, T₃₀, and STI tests were conducted in empty and furnished classrooms. Children showed better results in speech intelligibility tests conducted in classrooms with less noise, a lower T₃₀, and greater STI values. The majority of classrooms did not meet the recommended regulatory standards for good acoustic performance. Acoustic parameters have a direct effect on the speech intelligibility of students. Noise contributes to a decrease in their understanding of information presented orally, which can lead to negative consequences in their education and their social integration as future professionals.

  18. Experimental comparison between speech transmission index, rapid speech transmission index, and speech intelligibility index.

    PubMed

    Larm, Petra; Hongisto, Valtteri

    2006-02-01

    During the acoustical design of, e.g., auditoria or open-plan offices, it is important to know how speech can be perceived in various parts of the room. Different objective methods have been developed to measure and predict speech intelligibility, and these have been extensively used in various spaces. In this study, two such methods were compared, the speech transmission index (STI) and the speech intelligibility index (SII). Also the simplification of the STI, the room acoustics speech transmission index (RASTI), was considered. These quantities are all based on determining an apparent speech-to-noise ratio on selected frequency bands and summing them using a specific weighting. For comparison, some data were needed on the possible differences of these methods resulting from the calculation scheme and also measuring equipment. Their prediction accuracy was also of interest. Measurements were made in a laboratory having adjustable noise level and absorption, and in a real auditorium. It was found that the measurement equipment, especially the selection of the loudspeaker, can greatly affect the accuracy of the results. The prediction accuracy of the RASTI was found acceptable, if the input values for the prediction are accurately known, even though the studied space was not ideally diffuse.
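
    All three indices share the core the abstract describes: an apparent per-band speech-to-noise ratio, clipped to an audibility range, rescaled, and summed with band-importance weights. A minimal sketch of that common core, with illustrative bands and flat weights rather than the standardized STI/SII values:

    ```python
    # Band-importance-weighted SNR index: clip apparent band SNRs to +/-15 dB,
    # map to [0, 1], and sum with normalized weights. Values are illustrative,
    # not the standardized STI/SII band definitions or weights.
    import numpy as np

    def band_weighted_index(speech_band_levels_db, noise_band_levels_db, weights):
        snr = np.asarray(speech_band_levels_db) - np.asarray(noise_band_levels_db)
        snr = np.clip(snr, -15.0, 15.0)        # audibility limits
        transfer = (snr + 15.0) / 30.0         # map [-15, 15] dB to [0, 1]
        w = np.asarray(weights) / np.sum(weights)
        return float(np.sum(w * transfer))     # 0 = unintelligible, 1 = ideal

    # Example with seven octave bands (125 Hz - 8 kHz) and flat weights:
    index = band_weighted_index([60, 62, 65, 63, 58, 50, 45],
                                [55, 54, 50, 48, 45, 40, 38],
                                np.ones(7))
    ```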

  19. Impact of clear, loud, and slow speech on scaled intelligibility and speech severity in Parkinson's disease and multiple sclerosis.

    PubMed

    Tjaden, Kris; Sussman, Joan E; Wilding, Gregory E

    2014-06-01

    The perceptual consequences of rate reduction, increased vocal intensity, and clear speech were studied in speakers with multiple sclerosis (MS), Parkinson's disease (PD), and healthy controls. Seventy-eight speakers read sentences in habitual, clear, loud, and slow conditions. Sentences were equated for peak amplitude and mixed with multitalker babble for presentation to listeners. Using a computerized visual analog scale, listeners judged intelligibility or speech severity as operationally defined in Sussman and Tjaden (2012). Loud and clear but not slow conditions improved intelligibility relative to the habitual condition. With the exception of the loud condition for the PD group, speech severity did not improve above habitual and was reduced relative to habitual in some instances. Intelligibility and speech severity were strongly related, but relationships for disordered speakers were weaker in clear and slow conditions versus habitual. Both clear and loud speech show promise for improving intelligibility and maintaining or improving speech severity in multitalker babble for speakers with mild dysarthria secondary to MS or PD, at least as these perceptual constructs were defined and measured in this study. Although scaled intelligibility and speech severity overlap, the metrics further appear to have some separate value in documenting treatment-related speech changes.

  20. Evaluation of the importance of time-frequency contributions to speech intelligibility in noise

    PubMed Central

    Yu, Chengzhu; Wójcicki, Kamil K.; Loizou, Philipos C.; Hansen, John H. L.; Johnson, Michael T.

    2014-01-01

    Recent studies on binary masking techniques make the assumption that each time-frequency (T-F) unit contributes an equal amount to the overall intelligibility of speech. The present study demonstrated that the importance of each T-F unit to speech intelligibility varies in accordance with speech content. Specifically, T-F units are categorized into two classes, speech-present T-F units and speech-absent T-F units. Results indicate that the importance of each speech-present T-F unit to speech intelligibility is highly related to the loudness of its target component, while the importance of each speech-absent T-F unit varies according to the loudness of its masker component. Two types of mask errors are also considered, which include miss and false alarm errors. Consistent with previous work, false alarm errors are shown to be more harmful to speech intelligibility than miss errors when the mixture signal-to-noise ratio (SNR) is below 0 dB. However, the relative importance between the two types of error is conditioned on the SNR level of the input speech signal. Based on these observations, a mask-based objective measure, the loudness weighted hit-false, is proposed for predicting speech intelligibility. The proposed objective measure shows significantly higher correlation with intelligibility compared to two existing mask-based objective measures. PMID:24815280
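
    The weighting idea behind the proposed measure can be sketched as below; per-unit signal powers stand in for loudness (the published measure uses a proper loudness model), and all names are ours:

```python
import numpy as np

def loudness_weighted_hit_false(ideal_mask, estimated_mask, target_pow, masker_pow):
    """Simplified loudness-weighted hit minus false-alarm score.

    ideal_mask / estimated_mask: 0/1 arrays over T-F units.
    target_pow / masker_pow: per-unit target and masker powers, used
    here as crude stand-ins for the loudness weights described above.
    Hits are weighted by the loudness of their target component,
    false alarms by the loudness of their masker component.
    """
    hits = (ideal_mask == 1) & (estimated_mask == 1)
    false_alarms = (ideal_mask == 0) & (estimated_mask == 1)
    hit_rate = target_pow[hits].sum() / max(target_pow[ideal_mask == 1].sum(), 1e-12)
    fa_rate = masker_pow[false_alarms].sum() / max(masker_pow[ideal_mask == 0].sum(), 1e-12)
    return hit_rate - fa_rate
```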

  1. Speech Intelligibility in Persian Hearing Impaired Children with Cochlear Implants and Hearing Aids.

    PubMed

    Rezaei, Mohammad; Emadi, Maryam; Zamani, Peyman; Farahani, Farhad; Lotfi, Gohar

    2017-04-01

    The aim of the present study was to evaluate and compare speech intelligibility in hearing-impaired children using cochlear implants (CI), hearing-impaired children using hearing aids (HA), and children with normal hearing (NH). The sample consisted of 45 Persian-speaking children aged 3 to 5 years in Hamadan, divided into three groups of 15: children with NH, children with CI, and children using HAs. Participants were evaluated with a test of speech intelligibility level. An ANOVA on the speech intelligibility test showed that the NH children performed significantly better than the hearing-impaired children with CI or HA. Post-hoc analysis using the Scheffé test indicated that the mean speech intelligibility score of the NH children was higher than those of the HA and CI groups, but the difference between the mean speech intelligibility of children using cochlear implants and those using HAs was not significant. It is clear that even with remarkable advances in HA technology, many hearing-impaired children continue to find speech production a challenging problem. Given that speech intelligibility is a key element in proper communication and social interaction, educational and rehabilitation programs are essential to improve the speech intelligibility of children with hearing loss.

  2. Assessment of Intelligibility Using Children's Spontaneous Speech: Methodological Aspects

    ERIC Educational Resources Information Center

    Lagerberg, Tove B.; Åsberg, Jakob; Hartelius, Lena; Persson, Christina

    2014-01-01

    Background: Intelligibility is a speaker's ability to convey a message to a listener. Including an assessment of intelligibility is essential in both research and clinical work relating to individuals with communication disorders due to speech impairment. Assessment of the intelligibility of spontaneous speech can be used as an overall…

  3. Sensitivity of the Speech Intelligibility Index to the Assumed Dynamic Range

    ERIC Educational Resources Information Center

    Jin, In-Ki; Kates, James M.; Arehart, Kathryn H.

    2017-01-01

    Purpose: This study aims to evaluate the sensitivity of the speech intelligibility index (SII) to the assumed speech dynamic range (DR) in different languages and with different types of stimuli. Method: Intelligibility prediction uses the absolute transfer function (ATF) to map the SII value to the predicted intelligibility for given stimuli.…

  4. Variability and Intelligibility of Clarified Speech to Different Listener Groups

    NASA Astrophysics Data System (ADS)

    Silber, Ronnie F.

    Two studies examined the modifications that adult speakers make in speech to disadvantaged listeners. Previous research focusing on speech to deaf individuals and to young children has shown that adults clarify speech when addressing these two populations. Acoustic measurements suggest that the signal undergoes similar changes for both populations. Perceptual tests corroborate these results for the deaf population, but are nonsystematic in developmental studies. The differences in the findings for these populations and the nonsystematic results in the developmental literature may be due to methodological factors. The present experiments addressed these methodological questions. Studies of speech to hearing-impaired listeners have used read nonsense sentences, for which speakers received explicit clarification instructions and feedback, while in the child literature, excerpts of real-time conversations were used; the linguistic samples were therefore not precisely matched. In this study, the experiments used varied linguistic materials: experiment 1 used a children's story; experiment 2, nonsense sentences. Four mothers read both types of material in four ways: (1) in "normal" adult speech, (2) in "babytalk," (3) under the clarification instructions used in the hearing-impaired studies (instructed clear speech), and (4) in (spontaneous) clear speech without instruction. No extra practice or feedback was given. Sentences were presented to 40 normal-hearing college students with and without simultaneous masking noise. Results were separately tabulated for content and function words and analyzed using standard statistical tests. The major finding in the study was individual variation in speaker intelligibility: "real world" speakers vary in their baseline intelligibility. The four speakers also showed unique patterns of intelligibility as a function of each independent variable. Results were as follows. Nonsense sentences were less intelligible than story

  5. Predicting Intelligibility Gains in Dysarthria through Automated Speech Feature Analysis

    ERIC Educational Resources Information Center

    Fletcher, Annalise R.; Wisler, Alan A.; McAuliffe, Megan J.; Lansford, Kaitlin L.; Liss, Julie M.

    2017-01-01

    Purpose: Behavioral speech modifications have variable effects on the intelligibility of speakers with dysarthria. In the companion article, a significant relationship was found between measures of speakers' baseline speech and their intelligibility gains following cues to speak louder and reduce rate (Fletcher, McAuliffe, Lansford, Sinex, &…

  6. Speech Characteristics and Intelligibility in Adults with Mild and Moderate Intellectual Disabilities

    PubMed Central

    Coppens-Hofman, Marjolein C.; Terband, Hayo; Snik, Ad F.M.; Maassen, Ben A.M.

    2017-01-01

    Purpose Adults with intellectual disabilities (ID) often show reduced speech intelligibility, which affects their social interaction skills. This study aims to establish the main predictors of this reduced intelligibility in order to ultimately optimise management. Method Spontaneous speech and picture naming tasks were recorded in 36 adults with mild or moderate ID. Twenty-five naïve listeners rated the intelligibility of the spontaneous speech samples. Performance on the picture-naming task was analysed by means of a phonological error analysis based on expert transcriptions. Results The transcription analyses showed that the phonemic and syllabic inventories of the speakers were complete. However, multiple errors at the phonemic and syllabic level were found. The frequencies of specific types of errors were related to intelligibility and quality ratings. Conclusions The development of the phonemic and syllabic repertoire appears to be completed in adults with mild-to-moderate ID. The charted speech difficulties can be interpreted to indicate speech motor control and planning difficulties. These findings may aid the development of diagnostic tests and speech therapies aimed at improving speech intelligibility in this specific group. PMID:28118637

  7. Influence of auditory fatigue on masked speech intelligibility

    NASA Technical Reports Server (NTRS)

    Parker, D. E.; Martens, W. L.; Johnston, P. A.

    1980-01-01

    Intelligibility of PB word lists embedded in simultaneous masking noise was evaluated before and after fatiguing-noise exposure, as determined by the number of words correctly repeated during a shadowing task. Both the speech signal and the masking noise were filtered to a 2825-3185-Hz band. Masking-noise levels were varied from 0- to 90-dB SL. Fatigue was produced by a 1500-3000-Hz octave band of noise at 115 dB (re 20 μPa) presented continuously for 5 min. The results of three experiments indicated that speech intelligibility was reduced when the speech was presented against a background of silence, but that the fatiguing-noise exposure had no effect on intelligibility when the speech was made more intense and embedded in masking noise of 40-90-dB SL. These observations are interpreted by considering the recruitment produced by fatigue and masking noise.

  8. Speech Intelligibility in Severe Adductor Spasmodic Dysphonia

    ERIC Educational Resources Information Center

    Bender, Brenda K.; Cannito, Michael P.; Murry, Thomas; Woodson, Gayle E.

    2004-01-01

    This study compared speech intelligibility in nondisabled speakers and speakers with adductor spasmodic dysphonia (ADSD) before and after botulinum toxin (Botox) injection. Standard speech samples were obtained from 10 speakers diagnosed with severe ADSD prior to and 1 month following Botox injection, as well as from 10 age- and gender-matched…

  9. Speech intelligibility in noise using throat and acoustic microphones.

    PubMed

    Acker-Mills, Barbara E; Houtsma, Adrianus J M; Ahroon, William A

    2006-01-01

    Helicopter cockpits are very noisy and this noise must be reduced for effective communication. The standard U.S. Army aviation helmet is equipped with a noise-canceling acoustic microphone, but some ambient noise still is transmitted. Throat microphones are not sensitive to air molecule vibrations and thus, transmittal of ambient noise is reduced. It is possible that throat microphones could enhance speech communication in helicopters, but speech intelligibility with the devices must first be assessed. In the current study, speech intelligibility of signals generated by an acoustic microphone, a throat microphone, and by the combined output of the two microphones was assessed using the Modified Rhyme Test (MRT). Stimulus words were recorded in a reverberant chamber with ambient broadband noise intensity at 90 and 106 dBA. Listeners completed the MRT task in the same settings, thus simulating the typical environment of a rotary-wing aircraft. Results show that speech intelligibility is significantly worse for the throat microphone (average percent correct = 55.97) than for the acoustic microphone (average percent correct = 69.70), particularly for the higher noise level. In addition, no benefit is gained by simultaneously using both microphones. A follow-up experiment evaluated different consonants using the Diagnostic Rhyme Test and replicated the MRT results. The current results show that intelligibility using throat microphones is poorer than with the use of boom microphones in noisy and in quiet environments. Therefore, throat microphones are not recommended for use in any situation where fast and accurate speech intelligibility is essential.

  10. Investigation of in-vehicle speech intelligibility metrics for normal hearing and hearing impaired listeners

    NASA Astrophysics Data System (ADS)

    Samardzic, Nikolina

    The effectiveness of in-vehicle speech communication can be a good indicator of the perception of overall vehicle quality and customer satisfaction. Currently available speech intelligibility metrics do not account in their procedures for essential parameters needed for a complete and accurate evaluation of in-vehicle speech intelligibility. These include the directivity and the distance of the talker with respect to the listener, binaural listening, the hearing profile of the listener, vocal effort, and multisensory hearing. In the first part of this research, the effectiveness of in-vehicle application of these metrics is investigated in a series of studies to reveal their shortcomings, including the wide range of scores resulting from each of the metrics for a given measurement configuration and vehicle operating condition. In addition, the nature of a possible correlation between the scores obtained from each metric is unknown, and the metrics have not been compared in the literature against the subjective perception of speech intelligibility using, for example, the same speech material. As a result, in the second part of this research, an alternative method for speech intelligibility evaluation is proposed for use in the automotive industry, utilizing a virtual reality driving environment for ultimately setting targets, including the associated statistical variability, for future in-vehicle speech intelligibility evaluation. The Speech Intelligibility Index (SII) was evaluated at the sentence Speech Reception Threshold (sSRT) for various listening situations and hearing profiles using acoustic perception jury testing and a variety of talker and listener configurations and background noise. In addition, the effect of individual sources and transfer paths of sound in an operating vehicle on the vehicle interior sound, specifically their effect on speech intelligibility, was quantified in the framework of the newly developed speech intelligibility evaluation method. Lastly

  11. Microscopic prediction of speech intelligibility in spatially distributed speech-shaped noise for normal-hearing listeners.

    PubMed

    Geravanchizadeh, Masoud; Fallah, Ali

    2015-12-01

    A binaural and psychoacoustically motivated intelligibility model, based on a well-known monaural microscopic model, is proposed. This model simulates a phoneme recognition task in the presence of spatially distributed speech-shaped noise in anechoic scenarios. In the proposed model, binaural advantage effects are considered by generating a feature vector for a dynamic-time-warping speech recognizer. This vector consists of three subvectors incorporating two monaural subvectors to model the better-ear hearing, and a binaural subvector to simulate the binaural unmasking effect. The binaural unit of the model is based on equalization-cancellation theory. This model operates blindly, which means separate recordings of speech and noise are not required for the predictions. Speech intelligibility tests were conducted with 12 normal-hearing listeners by collecting speech reception thresholds (SRTs) in the presence of single and multiple sources of speech-shaped noise. The comparison of the model predictions with the measured binaural SRTs, and with the predictions of a macroscopic binaural model called extended equalization-cancellation, shows that this approach predicts the intelligibility in anechoic scenarios with good precision. The square of the correlation coefficient (r²) and the mean absolute error between the model predictions and the measurements are 0.98 and 0.62 dB, respectively.

  12. Speech Intelligibility and Personality Peer-Ratings of Young Adults with Cochlear Implants

    ERIC Educational Resources Information Center

    Freeman, Valerie

    2018-01-01

    Speech intelligibility, or how well a speaker's words are understood by others, affects listeners' judgments of the speaker's competence and personality. Deaf cochlear implant (CI) users vary widely in speech intelligibility, and their speech may have a noticeable "deaf" quality, both of which could evoke negative stereotypes or…

  13. An algorithm that improves speech intelligibility in noise for normal-hearing listeners.

    PubMed

    Kim, Gibak; Lu, Yang; Hu, Yi; Loizou, Philipos C

    2009-09-01

    Traditional noise-suppression algorithms have been shown to improve speech quality, but not speech intelligibility. Motivated by prior intelligibility studies of speech synthesized using the ideal binary mask, an algorithm is proposed that decomposes the input signal into time-frequency (T-F) units and makes binary decisions, based on a Bayesian classifier, as to whether each T-F unit is dominated by the target or the masker. Speech corrupted at low signal-to-noise ratio (SNR) levels (-5 and 0 dB) using different types of maskers is synthesized by this algorithm and presented to normal-hearing listeners for identification. Results indicated substantial improvements in intelligibility (over 60 percentage points in -5 dB babble) over that attained by human listeners with unprocessed stimuli. The findings from this study suggest that algorithms that can reliably estimate the SNR in each T-F unit can improve speech intelligibility.
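
    A toy version of the per-unit Bayesian decision can be sketched as a Gaussian classifier over T-F features; the actual study uses richer features and listener tests, so everything below is a simplified stand-in:

```python
import numpy as np

class GaussianUnitClassifier:
    """Naive-Bayes-style classifier labeling each T-F unit as
    masker-dominated (0) or target-dominated (1)."""

    def fit(self, feats, labels):
        # Per-class feature mean, variance, and prior.
        self.stats = {}
        for c in (0, 1):
            x = feats[labels == c]
            self.stats[c] = (x.mean(axis=0), x.var(axis=0) + 1e-6, len(x) / len(feats))
        return self

    def predict(self, feats):
        def log_posterior(c):
            mu, var, prior = self.stats[c]
            ll = -0.5 * (((feats - mu) ** 2) / var + np.log(2 * np.pi * var))
            return ll.sum(axis=1) + np.log(prior)
        return (log_posterior(1) > log_posterior(0)).astype(float)

# Training labels come from the local SNR of premixed signals;
# synthesis then retains only the units predicted as target-dominated.
```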

  14. Automatic intelligibility classification of sentence-level pathological speech

    PubMed Central

    Kim, Jangwon; Kumar, Naveen; Tsiartas, Andreas; Li, Ming; Narayanan, Shrikanth S.

    2014-01-01

    Pathological speech usually refers to the condition of speech distortion resulting from atypicalities in voice and/or in the articulatory mechanisms owing to disease, illness or other physical or biological insult to the production system. Although automatic evaluation of speech intelligibility and quality could come in handy in these scenarios to assist experts in diagnosis and treatment design, the many sources and types of variability often make it a very challenging computational processing problem. In this work we propose novel sentence-level features to capture abnormal variation in the prosodic, voice quality and pronunciation aspects in pathological speech. In addition, we propose a post-classification posterior smoothing scheme which refines the posterior of a test sample based on the posteriors of other test samples. Finally, we perform feature-level fusions and subsystem decision fusion for arriving at a final intelligibility decision. The performances are tested on two pathological speech datasets, the NKI CCRT Speech Corpus (advanced head and neck cancer) and the TORGO database (cerebral palsy or amyotrophic lateral sclerosis), by evaluating classification accuracy without overlapping subjects’ data among training and test partitions. Results show that the feature sets of each of the voice quality subsystem, prosodic subsystem, and pronunciation subsystem, offer significant discriminating power for binary intelligibility classification. We observe that the proposed posterior smoothing in the acoustic space can further reduce classification errors. The smoothed posterior score fusion of subsystems shows the best classification performance (73.5% for unweighted, and 72.8% for weighted, average recalls of the binary classes). PMID:25414544

  15. Speech intelligibility enhancement after maxillary denture treatment and its impact on quality of life.

    PubMed

    Knipfer, Christian; Riemann, Max; Bocklet, Tobias; Noeth, Elmar; Schuster, Maria; Sokol, Biljana; Eitner, Stephan; Nkenke, Emeka; Stelzle, Florian

    2014-01-01

    Tooth loss and its prosthetic rehabilitation significantly affect speech intelligibility. However, little is known about the influence of speech deficiencies on oral health-related quality of life (OHRQoL). The aim of this study was to investigate whether speech intelligibility enhancement through prosthetic rehabilitation significantly influences OHRQoL in patients wearing complete maxillary dentures. Speech intelligibility by means of an automatic speech recognition system (ASR) was prospectively evaluated and compared with subjectively assessed Oral Health Impact Profile (OHIP) scores. Speech was recorded in 28 edentulous patients 1 week prior to the fabrication of new complete maxillary dentures and 6 months thereafter. Speech intelligibility was computed based on the word accuracy (WA) by means of an ASR and compared with a matched control group. One week before and 6 months after rehabilitation, patients assessed themselves for OHRQoL. Speech intelligibility improved significantly after 6 months. Subjects reported a significantly higher OHRQoL after maxillary rehabilitation with complete dentures. No significant correlation was found between the OHIP sum score or its subscales to the WA. Speech intelligibility enhancement achieved through the fabrication of new complete maxillary dentures might not be in the forefront of the patients' perception of their quality of life. For the improvement of OHRQoL in patients wearing complete maxillary dentures, food intake and mastication as well as freedom from pain play a more prominent role.

  16. Effects of Audio-Visual Information on the Intelligibility of Alaryngeal Speech

    ERIC Educational Resources Information Center

    Evitts, Paul M.; Portugal, Lindsay; Van Dine, Ami; Holler, Aline

    2010-01-01

    Background: There is minimal research on the contribution of visual information on speech intelligibility for individuals with a laryngectomy (IWL). Aims: The purpose of this project was to determine the effects of mode of presentation (audio-only, audio-visual) on alaryngeal speech intelligibility. Method: Twenty-three naive listeners were…

  17. Formant trajectory characteristics in speakers with dysarthria and homogeneous speech intelligibility scores: Further data

    NASA Astrophysics Data System (ADS)

    Kim, Yunjung; Weismer, Gary; Kent, Ray D.

    2005-09-01

    In previous work [J. Acoust. Soc. Am. 117, 2605 (2005)], we reported on formant trajectory characteristics of a relatively large number of speakers with dysarthria and near-normal speech intelligibility. The purpose of that analysis was to begin documenting the variability, within relatively homogeneous speech-severity groups, of acoustic measures commonly used to predict across-speaker variation in speech intelligibility. In that study we found that even with near-normal speech intelligibility (90%-100%), many speakers had reduced formant slopes for some words and distributional characteristics of acoustic measures that differed from values obtained from normal speakers. In the current report we extend those findings to a group of speakers with dysarthria with somewhat poorer speech intelligibility than the original group. Results are discussed in terms of the utility of certain acoustic measures as indices of speech intelligibility, and as explanatory data for theories of dysarthria. [Work supported by NIH Award R01 DC00319.]

  18. Speech enhancement based on neural networks improves speech intelligibility in noise for cochlear implant users.

    PubMed

    Goehring, Tobias; Bolner, Federico; Monaghan, Jessica J M; van Dijk, Bas; Zarowski, Andrzej; Bleeck, Stefan

    2017-02-01

    Speech understanding in noisy environments is still one of the major challenges for cochlear implant (CI) users in everyday life. We evaluated a speech enhancement algorithm based on neural networks (NNSE) for improving speech intelligibility in noise for CI users. The algorithm decomposes the noisy speech signal into time-frequency units, extracts a set of auditory-inspired features and feeds them to the neural network to produce an estimation of which frequency channels contain more perceptually important information (higher signal-to-noise ratio, SNR). This estimate is used to attenuate noise-dominated and retain speech-dominated CI channels for electrical stimulation, as in traditional n-of-m CI coding strategies. The proposed algorithm was evaluated by measuring the speech-in-noise performance of 14 CI users using three types of background noise. Two NNSE algorithms were compared: a speaker-dependent algorithm, that was trained on the target speaker used for testing, and a speaker-independent algorithm, that was trained on different speakers. Significant improvements in the intelligibility of speech in stationary and fluctuating noises were found relative to the unprocessed condition for the speaker-dependent algorithm in all noise types and for the speaker-independent algorithm in 2 out of 3 noise types. The NNSE algorithms used noise-specific neural networks that generalized to novel segments of the same noise type and worked over a range of SNRs. The proposed algorithm has the potential to improve the intelligibility of speech in noise for CI users while meeting the requirements of low computational complexity and processing delay for application in CI devices. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
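
    The channel-selection step, analogous to an n-of-m coding strategy, can be sketched as follows; the per-channel SNR estimates are assumed to come from the trained network, and the names and defaults are ours:

```python
import numpy as np

def select_channels(estimated_snr_db, n_of_m=8, floor_gain=0.0):
    """Retain, per frame, the n channels with the highest estimated SNR.

    estimated_snr_db: (frames, m_channels) array standing in for the
    neural network's per-channel SNR estimates. Channels that are not
    selected are attenuated to floor_gain before electrical stimulation,
    mirroring n-of-m CI coding strategies.
    """
    frames, m = estimated_snr_db.shape
    gains = np.full((frames, m), float(floor_gain))
    top = np.argsort(estimated_snr_db, axis=1)[:, -n_of_m:]
    np.put_along_axis(gains, top, 1.0, axis=1)
    return gains
```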

  19. Speech Intelligibility of Profoundly Deaf Pediatric Hearing Aid Users.

    ERIC Educational Resources Information Center

    Svirsky, Mario A.; Chin, Steven B.; Miyamoto, Richard T.; Sloan, Robert B.; Caldwell, Matthew D.

    2000-01-01

    A study examined the speech intelligibility of children (ages 1-15) with deafness who use hearing aids. Data revealed a strong significant trend toward higher intelligibility for children with more residual hearing, and a significant trend toward higher intelligibility for users of oral communication than those using total communication. (Contains…

  20. The relationship of speech intelligibility with hearing sensitivity, cognition, and perceived hearing difficulties varies for different speech perception tests

    PubMed Central

    Heinrich, Antje; Henshaw, Helen; Ferguson, Melanie A.

    2015-01-01

    Listeners vary in their ability to understand speech in noisy environments. Hearing sensitivity, as measured by pure-tone audiometry, can only partly explain these results, and cognition has emerged as another key concept. Although cognition relates to speech perception, the exact nature of the relationship remains to be fully understood. This study investigates how different aspects of cognition, particularly working memory and attention, relate to speech intelligibility for various tests. Perceptual accuracy of speech perception represents just one aspect of functioning in a listening environment. Activity and participation limits imposed by hearing loss, in addition to the demands of a listening environment, are also important and may be better captured by self-report questionnaires. Understanding how speech perception relates to self-reported aspects of listening forms the second focus of the study. Forty-four listeners aged between 50 and 74 years with mild sensorineural hearing loss were tested on speech perception tests differing in complexity from low (phoneme discrimination in quiet), to medium (digit triplet perception in speech-shaped noise) to high (sentence perception in modulated noise); cognitive tests of attention, memory, and non-verbal intelligence quotient; and self-report questionnaires of general health-related and hearing-specific quality of life. Hearing sensitivity and cognition related to intelligibility differently depending on the speech test: neither was important for phoneme discrimination, hearing sensitivity alone was important for digit triplet perception, and hearing and cognition together played a role in sentence perception. Self-reported aspects of auditory functioning were correlated with speech intelligibility to different degrees, with digit triplets in noise showing the richest pattern. The results suggest that intelligibility tests can vary in their auditory and cognitive demands and their sensitivity to the challenges that

  1. Relationship between Speech Intelligibility and Speech Comprehension in Babble Noise

    ERIC Educational Resources Information Center

    Fontan, Lionel; Tardieu, Julien; Gaillard, Pascal; Woisard, Virginie; Ruiz, Robert

    2015-01-01

    Purpose: The authors investigated the relationship between the intelligibility and comprehension of speech presented in babble noise. Method: Forty participants listened to French imperative sentences (commands for moving objects) in a multitalker babble background for which intensity was experimentally controlled. Participants were instructed to…

  2. Methods of Improving Speech Intelligibility for Listeners with Hearing Resolution Deficit

    PubMed Central

    2012-01-01

    Abstract: Methods developed for real-time time-scale modification (TSM) of the speech signal are presented. They are based on the non-uniform, speech-rate-dependent SOLA (synchronous overlap-add) algorithm. The influence of the proposed methods on the intelligibility of speech was investigated for two separate groups of listeners, i.e., hearing-impaired children and elderly listeners. It was shown that for speech with an average rate equal to or higher than 6.48 vowels/s, all of the proposed methods have a statistically significant impact on the improvement of speech intelligibility for hearing-impaired children with reduced hearing resolution, and one of the proposed methods significantly improves comprehension of speech in the group of elderly listeners with reduced hearing resolution. PMID:23009662
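
    For readers unfamiliar with SOLA, a uniform-rate sketch of the overlap-add idea is given below; the paper's method additionally varies the stretch factor with the local speech rate, and all parameter choices here are illustrative:

```python
import numpy as np

def sola_stretch(x, alpha, frame=1024, sa=768, search=128):
    """Time-stretch x by factor alpha (>1 slows speech) with basic SOLA.

    Frames of length `frame` are read every `sa` samples and written
    every ss = round(alpha * sa) samples; each incoming frame is shifted
    by up to `search` samples to maximize correlation with the existing
    output in the overlap region, then cross-faded in.
    """
    x = np.asarray(x, dtype=float)
    ss = int(round(alpha * sa))
    assert ss < frame, "synthesis hop must stay smaller than the frame"
    out = list(x[:frame])
    n_frames = (len(x) - frame - search) // sa
    for i in range(1, n_frames):
        pos_out = i * ss
        ov = len(out) - pos_out                      # overlap length
        seg = x[i * sa: i * sa + frame + search]
        tail = np.array(out[pos_out:])
        # Shift maximizing cross-correlation over the overlap region.
        k = int(np.argmax([np.dot(tail, seg[j: j + ov]) for j in range(search)]))
        new = seg[k: k + frame]
        fade = np.linspace(0.0, 1.0, ov)
        out[pos_out:] = list((1.0 - fade) * tail + fade * new[:ov])
        out.extend(new[ov:])
    return np.array(out)
```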

  3. Effect of Whole-Body Vibration on Speech. Part 2; Effect on Intelligibility

    NASA Technical Reports Server (NTRS)

    Begault, Durand R.

    2011-01-01

    The effect on speech intelligibility was measured for speech where talkers reading Diagnostic Rhyme Test material were exposed to 0.7 g whole-body vibration to simulate space vehicle launch. Across all talkers, the effect of vibration was to degrade the percentage of correctly transcribed words from 83% to 74%. The magnitude of the effect of vibration on speech communication varies between individuals, for both talkers and listeners. A worst-case scenario for intelligibility would be the most sensitive listener hearing the most sensitive talker; one participant's intelligibility was reduced by 26 percentage points (97% to 71%) for one of the talkers.

  4. Identification of a pathway for intelligible speech in the left temporal lobe

    PubMed Central

    Scott, Sophie K.; Blank, C. Catrin; Rosen, Stuart; Wise, Richard J. S.

    2017-01-01

    Summary It has been proposed that the identification of sounds, including species-specific vocalizations, by primates depends on anterior projections from the primary auditory cortex, an auditory pathway analogous to the ventral route proposed for the visual identification of objects. We have identified a similar route in the human for understanding intelligible speech. Using PET imaging to identify separable neural subsystems within the human auditory cortex, we used a variety of speech and speech-like stimuli with equivalent acoustic complexity but varying intelligibility. We have demonstrated that the left superior temporal sulcus responds to the presence of phonetic information, but its anterior part only responds if the stimulus is also intelligible. This novel observation demonstrates a left anterior temporal pathway for speech comprehension. PMID:11099443

  5. Effects of speech intelligibility level on concurrent visual task performance.

    PubMed

    Payne, D G; Peters, L J; Birkmire, D P; Bonto, M A; Anastasi, J S; Wenger, M J

    1994-09-01

    Four experiments were performed to determine if changes in the level of speech intelligibility in an auditory task have an impact on performance in concurrent visual tasks. The auditory task used in each experiment was a memory search task in which subjects memorized a set of words and then decided whether auditorily presented probe items were members of the memorized set. The visual tasks used were an unstable tracking task, a spatial decision-making task, a mathematical reasoning task, and a probability monitoring task. Results showed that performance on the unstable tracking and probability monitoring tasks was unaffected by the level of speech intelligibility on the auditory task, whereas accuracy in the spatial decision-making and mathematical processing tasks was significantly worse at low speech intelligibility levels. The findings are interpreted within the framework of multiple resource theory.

  6. Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems.

    PubMed

    Greene, Beth G; Logan, John S; Pisoni, David B

    1986-03-01

    We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered.

  7. Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems

    PubMed Central

    GREENE, BETH G.; LOGAN, JOHN S.; PISONI, DAVID B.

    2012-01-01

    We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered. PMID:23225916

  8. Speech Intelligibility of Aircrew Mask Communication Configurations in High-Noise Environments

    DTIC Science & Technology

    2017-09-28

    ARL-TR-8168 ● Sep 2017 ● US Army Research Laboratory. Speech Intelligibility of Aircrew Mask Communication Configurations in High-Noise Environments, by Kimberly A Pollard and Lamar Garrett.

  9. The speech intelligibility at the opera singing.

    PubMed

    Novák, A; Vokrál, J

    2000-01-01

    The authors investigated speech intelligibility in opera singing. They analysed several arias sung by soprano voices from the Czech opera "Rusalka" (The Water Nymph) and from Puccini's "La Bohème", as well as several parts of arias from "Il barbiere di Siviglia" sung by baritone, tenor, bass, and soprano. Sonographic pictures of the selected arias were compared with a subjective evaluation; the evaluations of the two authors differed by about 70%. The authors' opinion is that the singer's formant is not the only factor affecting the speech intelligibility of opera singing. An important role is also played by the ability to change the shape of the vocal tract and by the capacity for rapid and exact articulatory movements, which shapes the transients that are important in normal speech and in singing alike.

  10. Comparing the information conveyed by envelope modulation for speech intelligibility, speech quality, and music quality.

    PubMed

    Kates, James M; Arehart, Kathryn H

    2015-10-01

    This paper uses mutual information to quantify the relationship between envelope modulation fidelity and perceptual responses. Data from several previous experiments that measured speech intelligibility, speech quality, and music quality are evaluated for normal-hearing and hearing-impaired listeners. A model of the auditory periphery is used to generate envelope signals, and envelope modulation fidelity is calculated using the normalized cross-covariance of the degraded signal envelope with that of a reference signal. Two procedures are used to describe the envelope modulation: (1) modulation within each auditory frequency band and (2) spectro-temporal processing that analyzes the modulation of spectral ripple components fit to successive short-time spectra. The results indicate that low modulation rates provide the highest information for intelligibility, while high modulation rates provide the highest information for speech and music quality. The low-to-mid auditory frequencies are most important for intelligibility, while mid frequencies are most important for speech quality and high frequencies are most important for music quality. Differences between the spectral ripple components used for the spectro-temporal analysis were not significant in five of the six experimental conditions evaluated. The results indicate that different modulation-rate and auditory-frequency weights may be appropriate for indices designed to predict different types of perceptual relationships.
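
    The core fidelity measure, a normalized cross-covariance between reference and degraded envelopes within one auditory band, can be sketched as below (band filtering and the auditory-periphery model are omitted; names are ours):

```python
import numpy as np

def envelope_fidelity(ref_env, deg_env):
    """Normalized cross-covariance of two envelope signals.

    Values near 1 indicate that the degraded envelope preserves the
    modulation pattern of the reference; the full model computes this
    per auditory band and per modulation rate.
    """
    r = ref_env - ref_env.mean()
    d = deg_env - deg_env.mean()
    denom = np.sqrt((r ** 2).sum() * (d ** 2).sum())
    return float((r * d).sum() / max(denom, 1e-12))
```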

  11. Comparing the information conveyed by envelope modulation for speech intelligibility, speech quality, and music quality

    PubMed Central

    Kates, James M.; Arehart, Kathryn H.

    2015-01-01

    This paper uses mutual information to quantify the relationship between envelope modulation fidelity and perceptual responses. Data from several previous experiments that measured speech intelligibility, speech quality, and music quality are evaluated for normal-hearing and hearing-impaired listeners. A model of the auditory periphery is used to generate envelope signals, and envelope modulation fidelity is calculated using the normalized cross-covariance of the degraded signal envelope with that of a reference signal. Two procedures are used to describe the envelope modulation: (1) modulation within each auditory frequency band and (2) spectro-temporal processing that analyzes the modulation of spectral ripple components fit to successive short-time spectra. The results indicate that low modulation rates provide the highest information for intelligibility, while high modulation rates provide the highest information for speech and music quality. The low-to-mid auditory frequencies are most important for intelligibility, while mid frequencies are most important for speech quality and high frequencies are most important for music quality. Differences between the spectral ripple components used for the spectro-temporal analysis were not significant in five of the six experimental conditions evaluated. The results indicate that different modulation-rate and auditory-frequency weights may be appropriate for indices designed to predict different types of perceptual relationships. PMID:26520329

  12. Prior exposure to a reverberant listening environment improves speech intelligibility in adult cochlear implant listeners.

    PubMed

    Srinivasan, Nirmal Kumar; Tobey, Emily A; Loizou, Philipos C

    2016-01-01

    The goal of this study is to investigate whether prior exposure to a reverberant listening environment improves the speech intelligibility of adult cochlear implant (CI) users. Six adult CI users participated in this study. Speech intelligibility was measured in five different simulated reverberant listening environments with two different speech corpuses. Within each listening environment, prior exposure was varied by either keeping the environment the same across all trials (blocked presentation) or changing it from trial to trial (unblocked). Speech intelligibility decreased as reverberation time increased. Although substantial individual variability was observed, all CI listeners showed higher intelligibility in the blocked presentation condition than in the unblocked presentation condition for both speech corpuses. Prior listening exposure to a reverberant listening environment improves speech intelligibility in adult CI listeners. Further research is required to understand the underlying mechanism of adaptation to the listening environment.

  13. Talker differences in clear and conversational speech: Vowel intelligibility for normal-hearing listeners

    NASA Astrophysics Data System (ADS)

    Hargus Ferguson, Sarah

    2004-10-01

    Several studies have shown that when a talker is instructed to speak as though talking to a hearing-impaired person, the resulting "clear" speech is significantly more intelligible than typical conversational speech. While variability among talkers during speech production is well known, only one study to date [Gagné et al., J. Acad. Rehab. Audiol. 27, 135-158 (1994)] has directly examined differences among talkers producing clear and conversational speech. Data from that study, which utilized ten talkers, suggested that talkers vary in the extent to which they improve their intelligibility by speaking clearly. Similar variability can also be seen in studies using smaller groups of talkers [e.g., Picheny, Durlach, and Braida, J. Speech Hear. Res. 28, 96-103 (1985)]. In the current paper, clear and conversational speech materials were recorded from 41 male and female talkers aged 18 to 45 years. A listening experiment demonstrated that for normal-hearing listeners in noise, vowel intelligibility varied widely among the 41 talkers for both speaking styles, as did the magnitude of the speaking style effect. While female talkers showed a larger clear speech vowel intelligibility benefit than male talkers, neither talker age nor prior experience communicating with hearing-impaired listeners significantly affected the speaking style effect.

  14. Envelope and intensity based prediction of psychoacoustic masking and speech intelligibility.

    PubMed

    Biberger, Thomas; Ewert, Stephan D

    2016-08-01

    Human auditory perception and speech intelligibility have been successfully described based on the two concepts of spectral masking and amplitude modulation (AM) masking. The power-spectrum model (PSM) [Patterson and Moore (1986). Frequency Selectivity in Hearing, pp. 123-177] accounts for effects of spectral masking and critical bandwidth, while the envelope power-spectrum model (EPSM) [Ewert and Dau (2000). J. Acoust. Soc. Am. 108, 1181-1196] has been successfully applied to AM masking and discrimination. Both models extract the long-term (envelope) power to calculate signal-to-noise ratios (SNR). Recently, the EPSM has been applied to speech intelligibility (SI) considering the short-term envelope SNR on various time scales (multi-resolution speech-based envelope power-spectrum model; mr-sEPSM) to account for SI in fluctuating noise [Jørgensen, Ewert, and Dau (2013). J. Acoust. Soc. Am. 134, 436-446]. Here, a generalized auditory model is suggested combining the classical PSM and the mr-sEPSM to jointly account for psychoacoustics and speech intelligibility. The model was extended to consider the local AM depth in conditions with slowly varying signal levels, and the relative role of long-term and short-term SNR was assessed. The suggested generalized power-spectrum model is shown to account for a large variety of psychoacoustic data and to predict speech intelligibility in various types of background noise.

  15. Application of artificial intelligence principles to the analysis of "crazy" speech.

    PubMed

    Garfield, D A; Rapp, C

    1994-04-01

    Artificial intelligence computer simulation methods can be used to investigate psychotic or "crazy" speech. Here, symbolic reasoning algorithms establish semantic networks that schematize speech. These semantic networks consist of two main structures: case frames and object taxonomies. Node-based reasoning rules apply to object taxonomies and pathway-based reasoning rules apply to case frames. Normal listeners may recognize speech as "crazy talk" based on violations of node- and pathway-based reasoning rules. In this article, three separate segments of schizophrenic speech illustrate violations of these rules. This artificial intelligence approach is compared and contrasted with other neurolinguistic approaches and is discussed as a conceptual link between neurobiological and psychodynamic understandings of psychopathology.

  16. Three Factors Are Critical in Order to Synthesize Intelligible Noise-Vocoded Japanese Speech

    PubMed Central

    Kishida, Takuya; Nakajima, Yoshitaka; Ueda, Kazuo; Remijn, Gerard B.

    2016-01-01

    Factor analysis (principal component analysis followed by varimax rotation) had shown that 3 common factors appear across 20 critical-band power fluctuations derived from spoken sentences of eight different languages [Ueda et al. (2010). Fechner Day 2010, Padua]. The present study investigated the contributions of such power-fluctuation factors to speech intelligibility. The method of factor analysis was modified to obtain factors suitable for resynthesizing speech sounds as 20-critical-band noise-vocoded speech. The resynthesized speech sounds were used for an intelligibility test. The modification of factor analysis ensured that the resynthesized speech sounds were not accompanied by a steady background noise caused by the data reduction procedure. Spoken sentences of British English, Japanese, and Mandarin Chinese were subjected to this modified analysis. Confirming the earlier analysis, indeed 3–4 factors were common to these languages. The number of power-fluctuation factors needed to make noise-vocoded speech intelligible was then examined. Critical-band power fluctuations of the Japanese spoken sentences were resynthesized from the obtained factors, resulting in noise-vocoded-speech stimuli, and the intelligibility of these speech stimuli was tested by 12 native Japanese speakers. Japanese mora (syllable-like phonological unit) identification performances were measured when the number of factors was 1–9. Statistically significant improvement in intelligibility was observed when the number of factors was increased stepwise up to 6. The 12 listeners identified 92.1% of the morae correctly on average in the 6-factor condition. The intelligibility improved sharply when the number of factors changed from 2 to 3. In this step, the cumulative contribution ratio of factors improved only by 10.6%, from 37.3 to 47.9%, but the average mora identification leaped from 6.9 to 69.2%. The results indicated that, if the number of factors is 3 or more, elementary
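
    The resynthesis step, noise vocoding, can be illustrated with a minimal broad-band version; the study itself uses 20 critical bands with factor-derived power fluctuations, so the band layout and names below are only assumptions:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(x, fs, band_edges):
    """Minimal noise vocoder: in each band, replace the fine structure
    with band-limited noise while keeping the band envelope."""
    rng = np.random.default_rng(0)
    out = np.zeros(len(x))
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)
        env = np.abs(hilbert(band))                    # band envelope
        carrier = sosfiltfilt(sos, rng.standard_normal(len(x)))
        out += env * carrier
    return out

# e.g. vocoded = noise_vocode(signal, 16000, [100, 300, 700, 1500, 3000, 6000])
```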

  17. Factors influencing relative speech intelligibility in patients with oral squamous cell carcinoma: a prospective study using automatic, computer-based speech analysis.

    PubMed

    Stelzle, F; Knipfer, C; Schuster, M; Bocklet, T; Nöth, E; Adler, W; Schempf, L; Vieler, P; Riemann, M; Neukam, F W; Nkenke, E

    2013-11-01

    Oral squamous cell carcinoma (OSCC) and its treatment impair speech intelligibility by alteration of the vocal tract. The aim of this study was to identify the factors of oral cancer treatment that influence speech intelligibility by means of an automatic, standardized speech-recognition system. The study group comprised 71 patients (mean age 59.89, range 35-82 years) with OSCC ranging from stage T1 to T4 (TNM staging). Tumours were located on the tongue (n=23), lower alveolar crest (n=27), and floor of the mouth (n=21). Reconstruction was conducted through local tissue plasty or microvascular transplants. Adjuvant radiotherapy was performed in 49 patients. Speech intelligibility was evaluated before, and at 3, 6, and 12 months after tumour resection, and compared to that of a healthy control group (n=40). Postoperatively, significant influences on speech intelligibility were tumour localization (P=0.010) and resection volume (P=0.019). Additionally, adjuvant radiotherapy (P=0.049) influenced intelligibility at 3 months after surgery. At 6 months after surgery, influences were resection volume (P=0.028) and adjuvant radiotherapy (P=0.034). The influence of tumour localization (P=0.001) and adjuvant radiotherapy (P=0.022) persisted after 12 months. Tumour localization, resection volume, and radiotherapy are crucial factors for speech intelligibility. Radiotherapy significantly impaired word recognition rate (WR) values with a progression of the impairment for up to 12 months after surgery. Copyright © 2013 International Association of Oral and Maxillofacial Surgeons. Published by Elsevier Ltd. All rights reserved.

  18. The Effect of Noise on Relationships Between Speech Intelligibility and Self-Reported Communication Measures in Tracheoesophageal Speakers.

    PubMed

    Eadie, Tanya L; Otero, Devon Sawin; Bolt, Susan; Kapsner-Smith, Mara; Sullivan, Jessica R

    2016-08-01

    The purpose of this study was to examine how sentence intelligibility relates to self-reported communication in tracheoesophageal speakers when speech intelligibility is measured in quiet and noise. Twenty-four tracheoesophageal speakers who were at least 1 year postlaryngectomy provided audio recordings of 5 sentences from the Sentence Intelligibility Test. Speakers also completed self-reported measures of communication: the Voice Handicap Index-10 and the Communicative Participation Item Bank short form. Speech recordings were presented to 2 groups of inexperienced listeners who heard sentences in quiet or noise. Listeners transcribed the sentences to yield speech intelligibility scores. Very weak relationships were found between intelligibility in quiet and measures of voice handicap and communicative participation. Slightly stronger, but still weak and nonsignificant, relationships were observed between measures of intelligibility in noise and both self-reported measures. However, 12 speakers who were more than 65% intelligible in noise showed strong and statistically significant relationships with both self-reported measures (R² = .76-.79). Speech intelligibility in quiet is a weak predictor of self-reported communication measures in tracheoesophageal speakers. Speech intelligibility in noise may be a better metric of self-reported communicative function for speakers who demonstrate higher speech intelligibility in noise.

  19. Spectrotemporal Modulation Sensitivity as a Predictor of Speech Intelligibility for Hearing-Impaired Listeners

    PubMed Central

    Bernstein, Joshua G.W.; Mehraei, Golbarg; Shamma, Shihab; Gallun, Frederick J.; Theodoroff, Sarah M.; Leek, Marjorie R.

    2014-01-01

    Background A model that can accurately predict speech intelligibility for a given hearing-impaired (HI) listener would be an important tool for hearing-aid fitting or hearing-aid algorithm development. Existing speech-intelligibility models do not incorporate variability in suprathreshold deficits that are not well predicted by classical audiometric measures. One possible approach to the incorporation of such deficits is to base intelligibility predictions on sensitivity to simultaneously spectrally and temporally modulated signals. Purpose The likelihood of success of this approach was evaluated by comparing estimates of spectrotemporal modulation (STM) sensitivity to speech intelligibility and to psychoacoustic estimates of frequency selectivity and temporal fine-structure (TFS) sensitivity across a group of HI listeners. Research Design The minimum modulation depth required to detect STM applied to an 86 dB SPL four-octave noise carrier was measured for combinations of temporal modulation rate (4, 12, or 32 Hz) and spectral modulation density (0.5, 1, 2, or 4 cycles/octave). STM sensitivity estimates for individual HI listeners were compared to estimates of frequency selectivity (measured using the notched-noise method at 500, 1000, 2000, and 4000 Hz), TFS processing ability (2 Hz frequency-modulation detection thresholds for 500, 1000, 2000, and 4000 Hz carriers) and sentence intelligibility in noise (at a 0 dB signal-to-noise ratio) that were measured for the same listeners in a separate study. Study Sample Eight normal-hearing (NH) listeners and 12 listeners with a diagnosis of bilateral sensorineural hearing loss participated. Data Collection and Analysis STM sensitivity was compared between NH and HI listener groups using a repeated-measures analysis of variance. A stepwise regression analysis compared STM sensitivity for individual HI listeners to

  20. The benefit of combining a deep neural network architecture with ideal ratio mask estimation in computational speech segregation to improve speech intelligibility.

    PubMed

    Bentsen, Thomas; May, Tobias; Kressner, Abigail A; Dau, Torsten

    2018-01-01

    Computational speech segregation attempts to automatically separate speech from noise. This is challenging in conditions with interfering talkers and low signal-to-noise ratios. Recent approaches have adopted deep neural networks and successfully demonstrated speech intelligibility improvements. A selection of components may be responsible for the success with these state-of-the-art approaches: the system architecture, a time frame concatenation technique and the learning objective. The aim of this study was to explore the roles and the relative contributions of these components by measuring speech intelligibility in normal-hearing listeners. A substantial improvement of 25.4 percentage points in speech intelligibility scores was found going from a subband-based architecture, in which a Gaussian Mixture Model-based classifier predicts the distributions of speech and noise for each frequency channel, to a state-of-the-art deep neural network-based architecture. Another improvement of 13.9 percentage points was obtained by changing the learning objective from the ideal binary mask, in which individual time-frequency units are labeled as either speech- or noise-dominated, to the ideal ratio mask, where the units are assigned a continuous value between zero and one. Therefore, both components play significant roles and by combining them, speech intelligibility improvements were obtained in a six-talker condition at a low signal-to-noise ratio.
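
    The two learning objectives contrasted above differ only in the target the network is trained to predict, as the following sketch shows (the beta exponent and the 0 dB criterion are common conventions, assumed here rather than taken from the paper):

```python
import numpy as np

def ideal_ratio_mask(speech_pow, noise_pow, beta=0.5):
    """IRM target: a continuous gain in [0, 1] for each T-F unit."""
    return (speech_pow / (speech_pow + noise_pow + 1e-12)) ** beta

def ideal_binary_mask(speech_pow, noise_pow, lc_db=0.0):
    """IBM target: a hard 0/1 label from the local SNR criterion."""
    snr_db = 10.0 * np.log10(speech_pow / np.maximum(noise_pow, 1e-12))
    return (snr_db > lc_db).astype(float)
```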

  1. The interlanguage speech intelligibility benefit for native speakers of Mandarin: Production and perception of English word-final voicing contrasts

    PubMed Central

    Hayes-Harb, Rachel; Smith, Bruce L.; Bent, Tessa; Bradlow, Ann R.

    2009-01-01

    This study investigated the intelligibility of native and Mandarin-accented English speech for native English and native Mandarin listeners. The word-final voicing contrast was considered (as in minimal pairs such as 'cub' and 'cup') in a forced-choice word identification task. For these particular talkers and listeners, there was evidence of an interlanguage speech intelligibility benefit for listeners (i.e., native Mandarin listeners were more accurate than native English listeners at identifying Mandarin-accented English words). However, there was no evidence of an interlanguage speech intelligibility benefit for talkers (i.e., native Mandarin listeners did not find Mandarin-accented English speech more intelligible than native English speech). When listener and talker phonological proficiency (operationalized as accentedness) was taken into account, it was found that the interlanguage speech intelligibility benefit for listeners held only for the low phonological proficiency listeners and low phonological proficiency speech. The intelligibility data were also considered in relation to various temporal-acoustic properties of native English and Mandarin-accented English speech in an effort to better understand the properties of speech that may contribute to the interlanguage speech intelligibility benefit. PMID:19606271

  2. Intelligibility of Clear Speech: Effect of Instruction

    ERIC Educational Resources Information Center

    Lam, Jennifer; Tjaden, Kris

    2013-01-01

    Purpose: The authors investigated how clear speech instructions influence sentence intelligibility. Method: Twelve speakers produced sentences in habitual, clear, hearing impaired, and overenunciate conditions. Stimuli were amplitude normalized and mixed with multitalker babble for orthographic transcription by 40 listeners. The main analysis…

  3. Speech intelligibility in complex acoustic environments in young children

    NASA Astrophysics Data System (ADS)

    Litovsky, Ruth

    2003-04-01

    While the auditory system undergoes tremendous maturation during the first few years of life, it has become clear that in complex scenarios, when multiple sounds occur and echoes are present, children's performance is significantly worse than that of their adult counterparts. The ability of children (3-7 years of age) to understand speech in a simulated multi-talker environment and to benefit from spatial separation of the target and competing sounds was investigated. In these studies, competing sources varied in number, location, and content (speech, modulated or unmodulated speech-shaped noise, and time-reversed speech). The acoustic spaces were also varied in size and amount of reverberation. Finally, children with chronic otitis media who received binaural training were tested pre- and post-training on a subset of conditions. Results indicated the following. (1) Children experienced significantly more masking than adults, even in the simplest conditions tested. (2) When the target and competing sounds were spatially separated, speech intelligibility improved, but the amount varied with age, type of competing sound, and number of competitors. (3) In a large reverberant classroom there was no benefit of spatial separation. (4) Binaural training improved speech intelligibility performance in children with otitis media. Future work includes similar studies in children with unilateral and bilateral cochlear implants. [Work supported by NIDCD, DRF, and NOHR.]

  4. Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulation-frequency selective processing.

    PubMed

    Jørgensen, Søren; Dau, Torsten

    2011-09-01

    A model for predicting the intelligibility of processed noisy speech is proposed. The speech-based envelope power spectrum model has a structure similar to that of the model of Ewert and Dau [(2000). J. Acoust. Soc. Am. 108, 1181-1196], developed to account for modulation detection and masking data. The model estimates the speech-to-noise envelope power ratio, SNR(env), at the output of a modulation filterbank and relates this metric to speech intelligibility using the concept of an ideal observer. Predictions were compared to data on the intelligibility of speech presented in stationary speech-shaped noise. The model was further tested in conditions with noisy speech subjected to reverberation and spectral subtraction. Good agreement between predictions and data was found in all cases. For spectral subtraction, an analysis of the model's internal representation of the stimuli revealed that the predicted decrease in intelligibility was caused by the estimated noise envelope power exceeding that of the speech. The classical concept of the speech transmission index fails in this condition. The results strongly suggest that the signal-to-noise ratio at the output of a modulation-frequency-selective process provides a key measure of speech intelligibility. © 2011 Acoustical Society of America.
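
    The core SNR(env) computation can be sketched roughly as follows. This toy version omits the peripheral (audio-frequency) filterbank and the ideal-observer back end of the model, and the modulation band edges and the across-band combination rule are simplified assumptions rather than the published configuration.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfilt

def envelope_power(x, fs, f_lo, f_hi):
    """DC-removed envelope power of x within one modulation band."""
    env = np.abs(hilbert(x))   # temporal envelope
    env -= env.mean()          # discard the DC component
    sos = butter(2, [f_lo, f_hi], btype="band", fs=fs, output="sos")
    return np.mean(sosfilt(sos, env) ** 2)

def snr_env(noisy_speech, noise, fs,
            bands=((1, 2), (2, 4), (4, 8), (8, 16), (16, 32))):
    """Rough SNRenv: excess envelope power of the noisy speech over the
    noise alone, per modulation band, combined across bands."""
    eps = 1e-12
    ratios = []
    for f_lo, f_hi in bands:
        p_sn = envelope_power(noisy_speech, fs, f_lo, f_hi)
        p_n = envelope_power(noise, fs, f_lo, f_hi)
        ratios.append(max(p_sn - p_n, eps) / (p_n + eps))
    return np.sqrt(np.sum(np.square(ratios)))  # combined across bands
```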

  5. A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments

    PubMed Central

    Colburn, H. Steven

    2016-01-01

    Spatially separating speech maskers from target speech often leads to a large intelligibility improvement. Modeling this phenomenon has long been of interest to binaural-hearing researchers for uncovering brain mechanisms and for improving signal-processing algorithms in hearing-assistive devices. Much of the previous binaural modeling work focused on the unmasking enabled by binaural cues at the periphery, and little quantitative modeling has been directed toward the grouping or source-separation benefits of binaural processing. In this article, we propose a binaural model that focuses on grouping, specifically on the selection of time-frequency units that are dominated by signals from the direction of the target. The proposed model uses Equalization-Cancellation (EC) processing with a binary decision rule to estimate a time-frequency binary mask. EC processing is carried out to cancel the target signal and the energy change between the EC input and output is used as a feature that reflects target dominance in each time-frequency unit. The processing in the proposed model requires little computational resources and is straightforward to implement. In combination with the Coherence-based Speech Intelligibility Index, the model is applied to predict the speech intelligibility data measured by Marrone et al. The predicted speech reception threshold matches the pattern of the measured data well, even though the predicted intelligibility improvements relative to the colocated condition are larger than some of the measured data, which may reflect the lack of internal noise in this initial version of the model. PMID:27698261

  6. A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments.

    PubMed

    Mi, Jing; Colburn, H Steven

    2016-10-03

    Spatially separating speech maskers from target speech often leads to a large intelligibility improvement. Modeling this phenomenon has long been of interest to binaural-hearing researchers for uncovering brain mechanisms and for improving signal-processing algorithms in hearing-assistive devices. Much of the previous binaural modeling work focused on the unmasking enabled by binaural cues at the periphery, and little quantitative modeling has been directed toward the grouping or source-separation benefits of binaural processing. In this article, we propose a binaural model that focuses on grouping, specifically on the selection of time-frequency units that are dominated by signals from the direction of the target. The proposed model uses Equalization-Cancellation (EC) processing with a binary decision rule to estimate a time-frequency binary mask. EC processing is carried out to cancel the target signal and the energy change between the EC input and output is used as a feature that reflects target dominance in each time-frequency unit. The processing in the proposed model requires little computational resources and is straightforward to implement. In combination with the Coherence-based Speech Intelligibility Index, the model is applied to predict the speech intelligibility data measured by Marrone et al. The predicted speech reception threshold matches the pattern of the measured data well, even though the predicted intelligibility improvements relative to the colocated condition are larger than some of the measured data, which may reflect the lack of internal noise in this initial version of the model. © The Author(s) 2016.
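
    A toy version of the EC-based mask estimation described in these two records might look as follows. The STFT parameters, the frontal target (an ITD of 0 s handled by a per-frequency phase shift), and the fixed decision threshold are assumptions for illustration, not the model's actual configuration.

```python
import numpy as np
from scipy.signal import stft

def ec_binary_mask(left, right, fs, target_itd=0.0, thresh_db=6.0, nperseg=512):
    """Estimate a time-frequency binary mask via Equalization-Cancellation.

    The right-ear signal is phase-shifted per frequency to equalize the
    target's interaural delay, then subtracted from the left ear to cancel
    the target. Units where cancellation removes much energy are taken as
    target-dominated and kept (mask = 1)."""
    f, _, L = stft(left, fs, nperseg=nperseg)
    _, _, R = stft(right, fs, nperseg=nperseg)
    R_eq = R * np.exp(2j * np.pi * f[:, None] * target_itd)  # equalization
    out = L - R_eq                                           # cancellation
    eps = 1e-12
    drop_db = 10 * np.log10((np.abs(L) ** 2 + eps) / (np.abs(out) ** 2 + eps))
    return (drop_db > thresh_db).astype(float)
```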

  7. Exploring the roles of spectral detail and intonation contour in speech intelligibility: an FMRI study.

    PubMed

    Kyong, Jeong S; Scott, Sophie K; Rosen, Stuart; Howe, Timothy B; Agnew, Zarinah K; McGettigan, Carolyn

    2014-08-01

    The melodic contour of speech forms an important perceptual aspect of tonal and nontonal languages and an important limiting factor on the intelligibility of speech heard through a cochlear implant. Previous work exploring the neural correlates of speech comprehension identified a left-dominant pathway in the temporal lobes supporting the extraction of an intelligible linguistic message, whereas the right anterior temporal lobe showed an overall preference for signals clearly conveying dynamic pitch information [Johnsrude, I. S., Penhune, V. B., & Zatorre, R. J. Functional specificity in the right human auditory cortex for perceiving pitch direction. Brain, 123, 155-163, 2000; Scott, S. K., Blank, C. C., Rosen, S., & Wise, R. J. Identification of a pathway for intelligible speech in the left temporal lobe. Brain, 123, 2400-2406, 2000]. The current study combined modulations of overall intelligibility (through vocoding and spectral inversion) with a manipulation of pitch contour (normal vs. falling) to investigate the processing of spoken sentences in functional MRI. Our overall findings replicate and extend those of Scott et al. [Scott, S. K., Blank, C. C., Rosen, S., & Wise, R. J. Identification of a pathway for intelligible speech in the left temporal lobe. Brain, 123, 2400-2406, 2000], where greater sentence intelligibility was predominately associated with increased activity in the left STS, and the greatest response to normal sentence melody was found in right superior temporal gyrus. These data suggest a spatial distinction between brain areas associated with intelligibility and those involved in the processing of dynamic pitch information in speech. By including a set of complexity-matched unintelligible conditions created by spectral inversion, this is additionally the first study reporting a fully factorial exploration of spectrotemporal complexity and spectral inversion as they relate to the neural processing of speech intelligibility. Perhaps surprisingly, there was no evidence for an interaction between the two factors.

  8. Accent, intelligibility, and comprehensibility in the perception of foreign-accented Lombard speech

    NASA Astrophysics Data System (ADS)

    Li, Chi-Nin

    2003-10-01

    Speech produced in noise (Lombard speech) has been reported to be more intelligible than speech produced in quiet (normal speech). This study examined the perception of non-native Lombard speech in terms of intelligibility, comprehensibility, and degree of foreign accent. Twelve Cantonese speakers and a comparison group of English speakers read simple true and false English statements in quiet and in 70 dB of masking noise. Lombard and normal utterances were mixed with noise at a constant signal-to-noise ratio, and presented along with noise-free stimuli to eight new English listeners who provided transcription scores, comprehensibility ratings, and accent ratings. Analyses showed that, as expected, utterances presented in noise were less well perceived than were noise-free sentences, and that the Cantonese speakers' productions were more accented, but less intelligible and less comprehensible than those of the English speakers. For both groups of speakers, the Lombard sentences were correctly transcribed more often than their normal utterances in noisy conditions. However, the Cantonese-accented Lombard sentences were not rated as easier to understand than was the normal speech in all conditions. The assigned accent ratings were similar throughout all listening conditions. Implications of these findings will be discussed.
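
    Mixing utterances with noise at a constant signal-to-noise ratio, as done here, reduces to scaling the masker before addition; a small sketch (the length-matching by tiling is an implementation choice, not part of the study's description):

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Return speech + noise scaled so the mixture has the requested SNR."""
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]  # length-match the masker
    p_s, p_n = np.mean(speech ** 2), np.mean(noise ** 2)
    gain = np.sqrt(p_s / (p_n * 10 ** (snr_db / 10.0)))
    return speech + gain * noise
```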

  9. The effect of intensive speech rate and intonation therapy on intelligibility in Parkinson's disease.

    PubMed

    Martens, Heidi; Van Nuffelen, Gwen; Dekens, Tomas; Hernández-Díaz Huici, Maria; Kairuz Hernández-Díaz, Hector Arturo; De Letter, Miet; De Bodt, Marc

    2015-01-01

    Most studies on treatment of prosody in individuals with dysarthria due to Parkinson's disease are based on intensive treatment of loudness. The present study investigates the effect of intensive treatment of speech rate and intonation on the intelligibility of individuals with dysarthria due to Parkinson's disease. A one-group pretest-posttest design was used to compare intelligibility, speech rate, and intonation before and after treatment. Participants included eleven Dutch-speaking individuals with predominantly moderate dysarthria due to Parkinson's disease, who received five one-hour treatment sessions per week for three weeks. Treatment focused on lowering speech rate and magnifying the phrase-final intonation contrast between statements and questions. Intelligibility was perceptually assessed using a standardized sentence intelligibility test. Speech rate was automatically assessed during the sentence intelligibility test as well as during a passage reading task and a storytelling task. Intonation was perceptually assessed using a sentence reading task and a sentence repetition task, and also acoustically analyzed in terms of maximum fundamental frequency. After treatment, there was a significant improvement in sentence intelligibility (effect size .83), a significant increase in pause frequency during the passage reading task, a significant improvement in correct listener identification of statements and questions, and a significant increase in the maximum fundamental frequency in the final syllable of questions during both intonation tasks. The findings suggest that participants were more intelligible and more able to manipulate pause frequency and statement-question intonation after treatment. However, the relationship between the change in intelligibility on the one hand and the changes in speech rate and intonation on the other is not yet fully understood. Results are nuanced in light of the research design used. The reader will be able

  10. Proximate factors associated with speech intelligibility in children with cochlear implants: A preliminary study.

    PubMed

    Chin, Steven B; Kuhns, Matthew J

    2014-01-01

    The purpose of this descriptive pilot study was to examine possible relationships between speech intelligibility and structural characteristics of speech in children who use cochlear implants. The Beginner's Intelligibility Test (BIT) was administered to 10 children with cochlear implants, and the intelligibility of the words in the sentences was judged by panels of naïve adult listeners. Additionally, several qualitative and quantitative measures of word omission, segment correctness, duration, and intonation variability were applied to the sentences used to assess intelligibility. Correlational analyses were conducted to determine if BIT scores and the other speech parameters were related. There was a significant correlation between BIT score and percent words omitted, but no other variables correlated significantly with BIT score. The correlation between intelligibility and word omission may be task-specific as well as reflective of memory limitations.

  11. Influence of Visual Information on the Intelligibility of Dysarthric Speech

    ERIC Educational Resources Information Center

    Keintz, Connie K.; Bunton, Kate; Hoit, Jeannette D.

    2007-01-01

    Purpose: To examine the influence of visual information on speech intelligibility for a group of speakers with dysarthria associated with Parkinson's disease. Method: Eight speakers with Parkinson's disease and dysarthria were recorded while they read sentences. Speakers performed a concurrent manual task to facilitate typical speech production.…

  12. Acoustic richness modulates the neural networks supporting intelligible speech processing.

    PubMed

    Lee, Yune-Sang; Min, Nam Eun; Wingfield, Arthur; Grossman, Murray; Peelle, Jonathan E

    2016-03-01

    The information contained in a sensory signal plays a critical role in determining what neural processes are engaged. Here we used interleaved silent steady-state (ISSS) functional magnetic resonance imaging (fMRI) to explore how human listeners cope with different degrees of acoustic richness during auditory sentence comprehension. Twenty-six healthy young adults underwent scanning while hearing sentences that varied in acoustic richness (high vs. low spectral detail) and syntactic complexity (subject-relative vs. object-relative center-embedded clause structures). We manipulated acoustic richness by presenting the stimuli as unprocessed full-spectrum speech, or noise-vocoded with 24 channels. Importantly, although the vocoded sentences were spectrally impoverished, all sentences were highly intelligible. These manipulations allowed us to test how intelligible speech processing was affected by orthogonal linguistic and acoustic demands. Acoustically rich speech showed stronger activation than acoustically less-detailed speech in a bilateral temporoparietal network with more pronounced activity in the right hemisphere. By contrast, listening to sentences with greater syntactic complexity resulted in increased activation of a left-lateralized network including left posterior lateral temporal cortex, left inferior frontal gyrus, and left dorsolateral prefrontal cortex. Significant interactions between acoustic richness and syntactic complexity occurred in left supramarginal gyrus, right superior temporal gyrus, and right inferior frontal gyrus, indicating that the regions recruited for syntactic challenge differed as a function of acoustic properties of the speech. Our findings suggest that the neural systems involved in speech perception are finely tuned to the type of information available, and that reducing the richness of the acoustic signal dramatically alters the brain's response to spoken language, even when intelligibility is high. Copyright © 2015 Elsevier

  13. Intelligibility as a clinical outcome measure following intervention with children with phonologically based speech-sound disorders.

    PubMed

    Lousada, M; Jesus, Luis M T; Hall, A; Joffe, V

    2014-01-01

    The effectiveness of two treatment approaches (phonological therapy and articulation therapy) for treatment of 14 children, aged 4;0-6;7 years, with phonologically based speech-sound disorder (SSD) has been previously analysed with severity outcome measures (percentage of consonants correct score, percentage occurrence of phonological processes and phonetic inventory). Considering that the ultimate goal of intervention for children with phonologically based SSD is to improve intelligibility, it is curious that intervention studies focusing on children's phonology do not routinely use intelligibility as an outcome measure. It is therefore important that the impact of interventions on speech intelligibility is explored. This paper investigates the effectiveness of the two treatment approaches (phonological therapy and articulation therapy) using intelligibility measures, both in single words and in continuous speech, as the primary outcome. Fourteen children with phonologically based SSD participated in the intervention. The children were randomly assigned to phonological therapy or articulation therapy (seven children in each group). Two assessment methods were used for measuring intelligibility: a word identification task (for single words) and a rating scale (for continuous speech). Twenty-one unfamiliar adults listened and judged the children's intelligibility. Reliability analyses showed overall high agreement between listeners across both methods. Significant improvements were noted in intelligibility in both single words (paired t(6)=4.409, p=0.005) and continuous speech (asymptotic Z=2.371, p=0.018) for the group receiving phonology therapy pre- to post-treatment, but no differences in intelligibility were found for those receiving the articulation therapy pre- to post-treatment, either for single words (paired t(6)=1.763, p=0.128) or continuous speech (asymptotic Z=1.442, p=0.149). Intelligibility measures were sensitive enough to show changes in the

  14. Speech intelligibility and speech quality of modified loudspeaker announcements examined in a simulated aircraft cabin.

    PubMed

    Pennig, Sibylle; Quehl, Julia; Wittkowski, Martin

    2014-01-01

    Acoustic modifications of loudspeaker announcements were investigated in a simulated aircraft cabin to improve passengers' speech intelligibility and quality of communication in this specific setting. Four experiments with 278 participants in total were conducted in an acoustic laboratory using a standardised speech test and subjective rating scales. In experiments 1 and 2 the sound pressure level (SPL) of the announcements was varied (ranging from 70 to 85 dB(A)). Experiments 3 and 4 focused on frequency modification (octave bands) of the announcements. All studies used a background noise with the same SPL (74 dB(A)), but recorded at different seat positions in the aircraft cabin (front, rear). The results quantify speech intelligibility improvements with increasing signal-to-noise ratio and amplification of particular octave bands, especially the 2 kHz and the 4 kHz band. Thus, loudspeaker power in an aircraft cabin can be reduced by using appropriate filter settings in the loudspeaker system.
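
    The frequency modification studied here amounts to boosting selected octave bands; a hedged sketch of one way to do that (the Butterworth design and the band edges at center/√2 and center·√2 are illustrative choices):

```python
import numpy as np
from scipy.signal import butter, sosfilt

def boost_octave_band(x, fs, center_hz, gain_db):
    """Add a scaled copy of one octave band so that band is boosted by
    roughly gain_db while the rest of the spectrum is left untouched."""
    lo, hi = center_hz / np.sqrt(2.0), center_hz * np.sqrt(2.0)
    sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
    band = sosfilt(sos, x)
    return x + (10 ** (gain_db / 20.0) - 1.0) * band

# e.g., emphasizing the 2 kHz and 4 kHz octave bands of an announcement:
# y = boost_octave_band(boost_octave_band(x, fs, 2000, 6.0), fs, 4000, 6.0)
```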

  15. Predicting Intelligibility Gains in Individuals with Dysarthria from Baseline Speech Features

    ERIC Educational Resources Information Center

    Fletcher, Annalise R.; McAuliffe, Megan J.; Lansford, Kaitlin L.; Sinex, Donal G.; Liss, Julie M.

    2017-01-01

    Purpose: Across the treatment literature, behavioral speech modifications have produced variable intelligibility changes in speakers with dysarthria. This study is the first of two articles exploring whether measurements of baseline speech features can predict speakers' responses to these modifications. Method: Fifty speakers (7 older individuals…

  16. Sound System Engineering & Optimization: The effects of multiple arrivals on the intelligibility of reinforced speech

    NASA Astrophysics Data System (ADS)

    Ryan, Timothy James

    The effects of multiple arrivals on the intelligibility of speech produced by live-sound reinforcement systems are examined. The intent is to determine whether correlations exist between the manipulation of sound system optimization parameters and the subjective attribute of speech intelligibility. Given the number and wide range of variables involved, this exploratory research project attempts to narrow the focus of further studies. Investigated variables are the delay time between signals arriving from multiple elements of a loudspeaker array, array type and geometry, and the two-way interactions of speech-to-noise ratio and array geometry with delay time. Intelligibility scores were obtained through subjective evaluation of binaural recordings, reproduced via headphones, using the Modified Rhyme Test. These word-score results are compared with objective measurements of the Speech Transmission Index (STI). Results indicate that both variables, delay time and array geometry, have significant effects on intelligibility. Additionally, all three of the possible two-way interactions have significant effects. Results further reveal that the STI measurement method overestimates the decrease in intelligibility due to short delay times between multiple arrivals.
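
    The delay-time variable can be simulated by summing a delayed copy of the reinforced signal, which produces the comb filtering underlying these intelligibility effects; a small sketch (the default unity gain for the second arrival is an assumption):

```python
import numpy as np

def add_delayed_arrival(x, fs, delay_ms, gain_db=0.0):
    """Sum a delayed, scaled copy of the signal onto itself, simulating a
    second loudspeaker arrival (yields comb filtering)."""
    d = int(round(fs * delay_ms / 1000.0))
    g = 10 ** (gain_db / 20.0)
    if d == 0:
        return (1.0 + g) * x
    y = x.astype(float).copy()
    y[d:] += g * x[:-d]
    return y
```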

  17. Relationships between Speech Intelligibility and Word Articulation Scores in Children with Hearing Loss

    PubMed Central

    Ertmer, David J.

    2012-01-01

    Purpose This investigation sought to determine whether scores from a commonly used word-based articulation test are closely associated with speech intelligibility in children with hearing loss. If the scores are closely related, articulation testing results might be used to estimate intelligibility. If not, the importance of direct assessment of intelligibility would be reinforced. Methods Forty-four children with hearing losses produced words from the Goldman-Fristoe Test of Articulation-2 and sets of 10 short sentences. Correlation analyses were conducted between scores for seven word-based predictor variables and percent-intelligible scores derived from listener judgments of stimulus sentences. Results Six of seven predictor variables were significantly correlated with percent-intelligible scores. However, regression analysis revealed that no single predictor variable or multi- variable model accounted for more than 25% of the variability in intelligibility scores. Implications The findings confirm the importance of assessing connected speech intelligibility directly. PMID:20220022

  18. The Influence of Noise Reduction on Speech Intelligibility, Response Times to Speech, and Perceived Listening Effort in Normal-Hearing Listeners.

    PubMed

    van den Tillaart-Haverkate, Maj; de Ronde-Brons, Inge; Dreschler, Wouter A; Houben, Rolph

    2017-01-01

    Single-microphone noise reduction leads to subjective benefit, but not to objective improvements in speech intelligibility. We investigated whether response times (RTs) provide an objective measure of the benefit of noise reduction and whether the effect of noise reduction is reflected in rated listening effort. Twelve normal-hearing participants listened to digit triplets that were either unprocessed or processed with one of two noise-reduction algorithms: an ideal binary mask (IBM) and a more realistic minimum mean square error estimator (MMSE). For each of these three processing conditions, we measured (a) speech intelligibility, (b) RTs on two different tasks (identification of the last digit and arithmetic summation of the first and last digit), and (c) subjective listening effort ratings. All measurements were performed at four signal-to-noise ratios (SNRs): -5, 0, +5, and +∞ dB. Speech intelligibility was high (>97% correct) for all conditions. A significant decrease in response time, relative to the unprocessed condition, was found for both IBM and MMSE for the arithmetic but not the identification task. Listening effort ratings were significantly lower for IBM than for MMSE and unprocessed speech in noise. We conclude that RT for an arithmetic task can provide an objective measure of the benefit of noise reduction. For young normal-hearing listeners, both ideal and realistic noise reduction can reduce RTs at SNRs where speech intelligibility is close to 100%. Ideal noise reduction can also reduce perceived listening effort.

  19. Speech intelligibility in cerebral palsy children attending an art therapy program.

    PubMed

    Wilk, Magdalena; Pachalska, Maria; Lipowska, Małgorzata; Herman-Sucharska, Izabela; Makarowski, Ryszard; Mirski, Andrzej; Jastrzebowska, Grazyna

    2010-05-01

    Dysarthria is a common sequela of cerebral palsy (CP), directly affecting both the intelligibility of speech and the child's psycho-social adjustment. Speech therapy focused exclusively on the articulatory organs does not always help CP children to speak more intelligibly. The program of art therapy described here has proven to be helpful for these children. From among all the CP children enrolled in our art therapy program from 2005 to 2009, we selected a group of 14 boys and girls (average age 15.3) with severe dysarthria at baseline but no other language or cognitive disturbances. Our retrospective study was based on results from the Auditory Dysarthria Scale and neuropsychological tests for fluency, administered routinely over the 4 months of art therapy. All 14 children in the study group showed some degree of improvement after art therapy in all tested parameters. On the Auditory Dysarthria Scale, highly significant improvements were noted in overall intelligibility (p<0.0001), with significant improvement (p<0.001) in volume, tempo, and control of pauses. The least improvement was noted in the most purely motor parameters. All 14 children also exhibited significant improvement in fluency. Art therapy improves the intelligibility of speech in children with cerebral palsy, even when language functions are not as such the object of therapeutic intervention.

  20. Horizontal localization and speech intelligibility with bilateral and unilateral hearing aid amplification.

    PubMed

    Köbler, S; Rosenhall, U

    2002-10-01

    Speech intelligibility and horizontal localization of 19 subjects with mild-to-moderate hearing loss were studied in order to evaluate the advantages and disadvantages of bilateral and unilateral hearing aid (HA) fittings. Eight loudspeakers were arranged in a circular array covering the horizontal plane around the subjects. Speech signals of a sentence test were delivered by one, randomly chosen, loudspeaker. At the same time, the other seven loudspeakers emitted noise with the same long-term average spectrum as the speech signals. The subjects were asked to repeat the speech signal and to point out the corresponding loudspeaker. Speech intelligibility was significantly improved by HAs, bilateral amplification being superior to unilateral. Horizontal localization could not be improved by HA amplification. However, bilateral HAs preserved the subjects' horizontal localization, whereas unilateral amplification decreased their horizontal localization abilities. Front-back confusions were common in the horizontal localization test. The results indicate that bilateral HA amplification has advantages compared with unilateral amplification.

  1. Exploring the Roles of Spectral Detail and Intonation Contour in Speech Intelligibility: An fMRI Study

    PubMed Central

    Kyong, Jeong S.; Scott, Sophie K.; Rosen, Stuart; Howe, Timothy B.; Agnew, Zarinah K.; McGettigan, Carolyn

    2014-01-01

    The melodic contour of speech forms an important perceptual aspect of tonal and nontonal languages and an important limiting factor on the intelligibility of speech heard through a cochlear implant. Previous work exploring the neural correlates of speech comprehension identified a left-dominant pathway in the temporal lobes supporting the extraction of an intelligible linguistic message, whereas the right anterior temporal lobe showed an overall preference for signals clearly conveying dynamic pitch information. The current study combined modulations of overall intelligibility (through vocoding and spectral inversion) with a manipulation of pitch contour (normal vs. falling) to investigate the processing of spoken sentences in functional MRI. Our overall findings replicate and extend those of Scott et al., where greater sentence intelligibility was predominately associated with increased activity in the left STS, and the greatest response to normal sentence melody was found in the right superior temporal gyrus. These data suggest a spatial distinction between brain areas associated with intelligibility and those involved in the processing of dynamic pitch information in speech. By including a set of complexity-matched unintelligible conditions created by spectral inversion, this is additionally the first study reporting a fully factorial exploration of spectrotemporal complexity and spectral inversion as they relate to the neural processing of speech intelligibility. Perhaps surprisingly, there was no evidence for an interaction between the two factors—we discuss the implications for the processing of sound and speech in the dorsolateral temporal lobes. PMID:24568205

  2. The influence of visual speech information on the intelligibility of English consonants produced by non-native speakers.

    PubMed

    Kawase, Saya; Hannah, Beverly; Wang, Yue

    2014-09-01

    This study examines how visual speech information affects native judgments of the intelligibility of speech sounds produced by non-native (L2) speakers. Native Canadian English perceivers as judges perceived three English phonemic contrasts (/b-v, θ-s, l-ɹ/) produced by native Japanese speakers as well as native Canadian English speakers as controls. These stimuli were presented under audio-visual (AV, with speaker voice and face), audio-only (AO), and visual-only (VO) conditions. The results showed that, across conditions, the overall intelligibility of Japanese productions of the native (Japanese)-like phonemes (/b, s, l/) was significantly higher than the non-Japanese phonemes (/v, θ, ɹ/). In terms of visual effects, the more visually salient non-Japanese phonemes /v, θ/ were perceived as significantly more intelligible when presented in the AV compared to the AO condition, indicating enhanced intelligibility when visual speech information is available. However, the non-Japanese phoneme /ɹ/ was perceived as less intelligible in the AV compared to the AO condition. Further analysis revealed that, unlike the native English productions, the Japanese speakers produced /ɹ/ without visible lip-rounding, indicating that non-native speakers' incorrect articulatory configurations may decrease the degree of intelligibility. These results suggest that visual speech information may either positively or negatively affect L2 speech intelligibility.

  3. Intelligibility of Noise-Adapted and Clear Speech in Child, Young Adult, and Older Adult Talkers

    ERIC Educational Resources Information Center

    Smiljanic, Rajka; Gilbert, Rachael C.

    2017-01-01

    Purpose: This study examined intelligibility of conversational and clear speech sentences produced in quiet and in noise by children, young adults, and older adults. Relative talker intelligibility was assessed across speaking styles. Method: Sixty-one young adult participants listened to sentences mixed with speech-shaped noise at -5 dB…

  4. Intelligibility Evaluation of Pathological Speech through Multigranularity Feature Extraction and Optimization.

    PubMed

    Fang, Chunying; Li, Haifeng; Ma, Lin; Zhang, Mancai

    2017-01-01

    Pathological speech usually refers to speech distortion resulting from illness or other biological insults. The assessment of pathological speech plays an important role in assisting experts, but automatic evaluation of speech intelligibility is difficult because such speech is usually nonstationary and mutational. In this paper, we carry out an independent innovation of feature extraction and reduction, and we describe a multigranularity combined feature scheme that is optimized by a hierarchical visual method. A novel method of generating the feature set based on the S-transform and chaotic analysis is proposed. The scheme comprises basic acoustic features (BAFS, 430 dimensions), local spectral characteristics (MSCC, 84 Mel S-transform cepstrum coefficients), and chaotic features (12 dimensions). Finally, radar charts and F-scores are used to optimize the features through hierarchical visual fusion. The feature set could be reduced from 526 to 96 dimensions on the NKI-CCRT corpus and to 104 dimensions on the SVD corpus. The experimental results show that the new features, classified with a support vector machine (SVM), achieve the best performance, with a recognition rate of 84.4% on the NKI-CCRT corpus and 78.7% on the SVD corpus. The proposed method is thus shown to be effective and reliable for pathological speech intelligibility evaluation.
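
    The F-score mentioned here is a standard two-class separability measure for feature ranking; a sketch under that assumption (the paper's full optimization also involves radar charts and hierarchical visual fusion, which are not reproduced):

```python
import numpy as np

def f_score(pos, neg):
    """Per-feature F-score: between-class separation divided by the summed
    within-class variances. pos and neg are (n_samples, n_features) arrays
    holding one class each; larger scores mark more discriminative features."""
    eps = 1e-12
    mean_all = np.vstack([pos, neg]).mean(axis=0)
    num = (pos.mean(0) - mean_all) ** 2 + (neg.mean(0) - mean_all) ** 2
    den = pos.var(0, ddof=1) + neg.var(0, ddof=1)
    return num / (den + eps)

# Keeping the top-k features, e.g. reducing 526 dimensions to 96:
# keep = np.argsort(f_score(pos, neg))[::-1][:96]
```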

  5. Speech Intelligibility and Prosody Production in Children with Cochlear Implants

    PubMed Central

    Chin, Steven B.; Bergeson, Tonya R.; Phan, Jennifer

    2012-01-01

    Objectives The purpose of the current study was to examine the relation between speech intelligibility and prosody production in children who use cochlear implants. Methods The Beginner's Intelligibility Test (BIT) and Prosodic Utterance Production (PUP) task were administered to 15 children who use cochlear implants and 10 children with normal hearing. Adult listeners with normal hearing judged the intelligibility of the words in the BIT sentences, identified the PUP sentences as one of four grammatical or emotional moods (i.e., declarative, interrogative, happy, or sad), and rated the PUP sentences according to how well they thought the child conveyed the designated mood. Results Percent correct scores were higher for intelligibility than for prosody and higher for children with normal hearing than for children with cochlear implants. Declarative sentences were most readily identified and received the highest ratings by adult listeners; interrogative sentences were least readily identified and received the lowest ratings. Correlations between intelligibility and all mood identification and rating scores except declarative were not significant. Discussion The findings suggest that the development of speech intelligibility progresses ahead of prosody in both children with cochlear implants and children with normal hearing; however, children with normal hearing still perform better than children with cochlear implants on measures of intelligibility and prosody even after accounting for hearing age. Problems with interrogative intonation may be related to more general restrictions on rising intonation, and the correlation results indicate that intelligibility and sentence intonation may be relatively dissociated at these ages. PMID:22717120

  6. Reference-Free Assessment of Speech Intelligibility Using Bispectrum of an Auditory Neurogram.

    PubMed

    Hossain, Mohammad E; Jassim, Wissam A; Zilany, Muhammad S A

    2016-01-01

    Sensorineural hearing loss occurs due to damage to the inner and outer hair cells of the peripheral auditory system. Hearing loss can cause decreases in audibility, dynamic range, frequency and temporal resolution of the auditory system, and all of these effects are known to affect speech intelligibility. In this study, a new reference-free speech intelligibility metric is proposed using 2-D neurograms constructed from the output of a computational model of the auditory periphery. The responses of the auditory-nerve fibers with a wide range of characteristic frequencies were simulated to construct neurograms. The features of the neurograms were extracted using third-order statistics referred to as bispectrum. The phase coupling of neurogram bispectrum provides a unique insight for the presence (or deficit) of supra-threshold nonlinearities beyond audibility for listeners with normal hearing (or hearing loss). The speech intelligibility scores predicted by the proposed method were compared to the behavioral scores for listeners with normal hearing and hearing loss both in quiet and under noisy background conditions. The results were also compared to the performance of some existing methods. The predicted results showed a good fit with a small error suggesting that the subjective scores can be estimated reliably using the proposed neural-response-based metric. The proposed metric also had a wide dynamic range, and the predicted scores were well-separated as a function of hearing loss. The proposed metric successfully captures the effects of hearing loss and supra-threshold nonlinearities on speech intelligibility. This metric could be applied to evaluate the performance of various speech-processing algorithms designed for hearing aids and cochlear implants.

  7. Reference-Free Assessment of Speech Intelligibility Using Bispectrum of an Auditory Neurogram

    PubMed Central

    Hossain, Mohammad E.; Jassim, Wissam A.; Zilany, Muhammad S. A.

    2016-01-01

    Sensorineural hearing loss occurs due to damage to the inner and outer hair cells of the peripheral auditory system. Hearing loss can cause decreases in audibility, dynamic range, frequency and temporal resolution of the auditory system, and all of these effects are known to affect speech intelligibility. In this study, a new reference-free speech intelligibility metric is proposed using 2-D neurograms constructed from the output of a computational model of the auditory periphery. The responses of the auditory-nerve fibers with a wide range of characteristic frequencies were simulated to construct neurograms. The features of the neurograms were extracted using third-order statistics referred to as bispectrum. The phase coupling of neurogram bispectrum provides a unique insight for the presence (or deficit) of supra-threshold nonlinearities beyond audibility for listeners with normal hearing (or hearing loss). The speech intelligibility scores predicted by the proposed method were compared to the behavioral scores for listeners with normal hearing and hearing loss both in quiet and under noisy background conditions. The results were also compared to the performance of some existing methods. The predicted results showed a good fit with a small error suggesting that the subjective scores can be estimated reliably using the proposed neural-response-based metric. The proposed metric also had a wide dynamic range, and the predicted scores were well-separated as a function of hearing loss. The proposed metric successfully captures the effects of hearing loss and supra-threshold nonlinearities on speech intelligibility. This metric could be applied to evaluate the performance of various speech-processing algorithms designed for hearing aids and cochlear implants. PMID:26967160
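
    The third-order statistic at the heart of this method is the bispectrum, B(f1, f2) = E[X(f1) X(f2) X*(f1 + f2)]. A direct FFT-based estimate for a 1-D signal is sketched below; note it operates on a raw sequence rather than the paper's 2-D neurogram, and the segment length is an arbitrary choice.

```python
import numpy as np

def bispectrum(x, nfft=256):
    """Direct bispectrum estimate B(f1, f2) = E[X(f1) X(f2) conj(X(f1+f2))],
    averaged over consecutive windowed segments of x."""
    n_segs = len(x) // nfft
    idx = np.arange(nfft // 2)
    B = np.zeros((nfft // 2, nfft // 2), dtype=complex)
    win = np.hanning(nfft)
    for s in range(n_segs):
        seg = x[s * nfft:(s + 1) * nfft]
        X = np.fft.fft((seg - seg.mean()) * win)
        # Outer product gives X(f1) X(f2); the conjugate term closes the triple.
        B += np.outer(X[idx], X[idx]) * np.conj(X[idx[:, None] + idx[None, :]])
    return B / max(n_segs, 1)
```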

  8. Vowels in clear and conversational speech: Talker differences in acoustic characteristics and intelligibility for normal-hearing listeners

    NASA Astrophysics Data System (ADS)

    Hargus Ferguson, Sarah; Kewley-Port, Diane

    2002-05-01

    Several studies have shown that when a talker is instructed to speak as though talking to a hearing-impaired person, the resulting "clear" speech is significantly more intelligible than typical conversational speech. Recent work in this lab suggests that talkers vary in how much their intelligibility improves when they are instructed to speak clearly. The few studies examining acoustic characteristics of clear and conversational speech suggest that these differing clear speech effects result from different acoustic strategies on the part of individual talkers. However, only two studies to date have directly examined differences among talkers producing clear versus conversational speech, and neither included acoustic analysis. In this project, clear and conversational speech was recorded from 41 male and female talkers aged 18-45 years. A listening experiment demonstrated that for normal-hearing listeners in noise, vowel intelligibility varied widely among the 41 talkers for both speaking styles, as did the magnitude of the speaking style effect. Acoustic analyses using stimuli from a subgroup of talkers shown to have a range of speaking style effects will be used to assess specific acoustic correlates of vowel intelligibility in clear and conversational speech. [Work supported by NIHDCD-02229.]

  9. [Communication and noise. Speech intelligibility of airplane pilots with and without active noise compensation].

    PubMed

    Matschke, R G

    1994-08-01

    Noise exposure measurements were performed with pilots of the German Federal Navy during flight situations. Ambient noise levels during regular flight remained above 90 dB(A), an intensity that requires wearing ear protection to avoid sound-induced hearing loss. To understand radio communication (ATC) in spite of a noisy environment, headphone volume must be raised above the noise of the engines. The use of ear plugs in addition to headsets and flight helmets is of only limited value, because personal ear protection affects the intelligibility of ATC. Whereas the speech intelligibility of pilots with normal hearing is affected to only a small degree, pilots with pre-existing high-frequency hearing losses show substantial impairments of speech intelligibility that vary in proportion to the hearing deficit present. Communication abilities can be reduced drastically, which in turn can affect air traffic security. The development of active noise compensation (ANC) devices that make use of the "anti-noise" principle may be a solution to this dilemma. To evaluate the effectiveness of an ANC system and its influence on speech intelligibility, speech audiometry was performed with a German standardized test during simulated flight conditions with helicopter pilots. The results demonstrate a helpful effect on speech understanding, especially for pilots with noise-induced hearing losses. This may help to avoid pre-retirement professional disability.

  10. Effects of Instantaneous Multiband Dynamic Compression on Speech Intelligibility

    NASA Astrophysics Data System (ADS)

    Herzke, Tobias; Hohmann, Volker

    2005-12-01

    The recruitment phenomenon, that is, the reduced dynamic range between threshold and uncomfortable level, is attributed to the loss of instantaneous dynamic compression on the basilar membrane. Despite this, hearing aids commonly use slow-acting dynamic compression for its compensation, because this was found to be the most successful strategy in terms of speech quality and intelligibility rehabilitation. Former attempts to use fast-acting compression gave ambiguous results, raising the question as to whether auditory-based recruitment compensation by instantaneous compression is in principle applicable in hearing aids. This study thus investigates instantaneous multiband dynamic compression based on an auditory filterbank. Instantaneous envelope compression is performed in each frequency band of a gammatone filterbank, which provides a combination of time and frequency resolution comparable to the normal healthy cochlea. The gain characteristics used for dynamic compression are deduced from categorical loudness scaling. In speech intelligibility tests, the instantaneous dynamic compression scheme was compared against a linear amplification scheme, which used the same filterbank for frequency analysis, but employed constant gain factors that restored the sound level for medium perceived loudness in each frequency band. In subjective comparisons, five of nine subjects preferred the linear amplification scheme and would not accept the instantaneous dynamic compression in hearing aids. Four of nine subjects did not perceive any quality differences. A sentence intelligibility test in noise (Oldenburg sentence test) showed little to no negative effects of the instantaneous dynamic compression, compared to linear amplification. A word intelligibility test in quiet (one-syllable rhyme test) showed that the subjects benefit from the larger amplification at low levels provided by instantaneous dynamic compression. Further analysis showed that the increase in intelligibility
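
    A toy rendering of the instantaneous per-band envelope compression described above; a Butterworth bank stands in for the study's gammatone filterbank, and a fixed power-law exponent replaces the gain characteristic derived from categorical loudness scaling, both simplifying assumptions:

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfilt

def instantaneous_compression(x, fs, edges=(100, 400, 1000, 2500, 6000),
                              exponent=0.6):
    """Split the signal into bands and compress each band's instantaneous
    Hilbert envelope with a power law, preserving the carrier fine structure."""
    y = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(2, [lo, hi], btype="band", fs=fs, output="sos")
        analytic = hilbert(sosfilt(sos, x))
        env = np.abs(analytic) + 1e-12
        carrier = analytic / env                 # unit-magnitude fine structure
        y += np.real(carrier * env ** exponent)  # instantaneous compression
    return y
```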

  11. Functional connectivity between face-movement and speech-intelligibility areas during auditory-only speech perception.

    PubMed

    Schall, Sonja; von Kriegstein, Katharina

    2014-01-01

    It has been proposed that internal simulation of the talking face of visually-known speakers facilitates auditory speech recognition. One prediction of this view is that brain areas involved in auditory-only speech comprehension interact with visual face-movement sensitive areas, even under auditory-only listening conditions. Here, we test this hypothesis using connectivity analyses of functional magnetic resonance imaging (fMRI) data. Participants (17 normal participants, 17 developmental prosopagnosics) first learned six speakers via brief voice-face or voice-occupation training (<2 min/speaker). This was followed by an auditory-only speech recognition task and a control task (voice recognition) involving the learned speakers' voices in the MRI scanner. As hypothesized, we found that, during speech recognition, familiarity with the speaker's face increased the functional connectivity between the face-movement sensitive posterior superior temporal sulcus (STS) and an anterior STS region that supports auditory speech intelligibility. There was no difference between normal participants and prosopagnosics. This was expected because previous findings have shown that both groups use the face-movement sensitive STS to optimize auditory-only speech comprehension. Overall, the present findings indicate that learned visual information is integrated into the analysis of auditory-only speech and that this integration results from the interaction of task-relevant face-movement and auditory speech-sensitive areas.

  12. Bidirectional clear speech perception benefit for native and high-proficiency non-native talkers and listeners: Intelligibility and accentedness

    PubMed Central

    Smiljanić, Rajka; Bradlow, Ann R.

    2011-01-01

    This study investigated how native language background interacts with speaking style adaptations in determining levels of speech intelligibility. The aim was to explore whether native and high proficiency non-native listeners benefit similarly from native and non-native clear speech adjustments. The sentence-in-noise perception results revealed that fluent non-native listeners gained a large clear speech benefit from native clear speech modifications. Furthermore, proficient non-native talkers in this study implemented conversational-to-clear speaking style modifications in their second language (L2) that resulted in significant intelligibility gain for both native and non-native listeners. The results of the accentedness ratings obtained for native and non-native conversational and clear speech sentences showed that while intelligibility was improved, the presence of foreign accent remained constant in both speaking styles. This suggests that objective intelligibility and subjective accentedness are two independent dimensions of non-native speech. Overall, these results provide strong evidence that greater experience in L2 processing leads to improved intelligibility in both production and perception domains. These results also demonstrated that speaking style adaptations along with less signal distortion can contribute significantly towards successful native and non-native interactions. PMID:22225056

  13. Audiomotor Perceptual Training Enhances Speech Intelligibility in Background Noise.

    PubMed

    Whitton, Jonathon P; Hancock, Kenneth E; Shannon, Jeffrey M; Polley, Daniel B

    2017-11-06

    Sensory and motor skills can be improved with training, but learning is often restricted to practice stimuli. As an exception, training on closed-loop (CL) sensorimotor interfaces, such as action video games and musical instruments, can impart a broad spectrum of perceptual benefits. Here we ask whether computerized CL auditory training can enhance speech understanding in levels of background noise that approximate a crowded restaurant. Elderly hearing-impaired subjects trained for 8 weeks on a CL game that, like a musical instrument, challenged them to monitor subtle deviations between predicted and actual auditory feedback as they moved their fingertip through a virtual soundscape. We performed our study as a randomized, double-blind, placebo-controlled trial by training other subjects in an auditory working-memory (WM) task. Subjects in both groups improved at their respective auditory tasks and reported comparable expectations for improved speech processing, thereby controlling for placebo effects. Whereas speech intelligibility was unchanged after WM training, subjects in the CL training group could correctly identify 25% more words in spoken sentences or digit sequences presented in high levels of background noise. Numerically, CL audiomotor training provided more than three times the benefit of our subjects' hearing aids for speech processing in noisy listening conditions. Gains in speech intelligibility could be predicted from gameplay accuracy and baseline inhibitory control. However, benefits did not persist in the absence of continuing practice. These studies employ stringent clinical standards to demonstrate that perceptual learning on a computerized audio game can transfer to "real-world" communication challenges. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Automatic Speech Recognition Predicts Speech Intelligibility and Comprehension for Listeners With Simulated Age-Related Hearing Loss.

    PubMed

    Fontan, Lionel; Ferrané, Isabelle; Farinas, Jérôme; Pinquier, Julien; Tardieu, Julien; Magnen, Cynthia; Gaillard, Pascal; Aumont, Xavier; Füllgrabe, Christian

    2017-09-18

    The purpose of this article is to assess speech processing for listeners with simulated age-related hearing loss (ARHL) and to investigate whether the observed performance can be replicated using an automatic speech recognition (ASR) system. The long-term goal of this research is to develop a system that will assist audiologists/hearing-aid dispensers in the fine-tuning of hearing aids. Sixty young participants with normal hearing listened to speech materials mimicking the perceptual consequences of ARHL at different levels of severity. Two intelligibility tests (repetition of words and sentences) and 1 comprehension test (responding to oral commands by moving virtual objects) were administered. Several language models were developed and used by the ASR system in order to fit human performances. Strong significant positive correlations were observed between human and ASR scores, with coefficients up to .99. However, the spectral smearing used to simulate losses in frequency selectivity caused larger declines in ASR performance than in human performance. Both intelligibility and comprehension scores for listeners with simulated ARHL are highly correlated with the performances of an ASR-based system. In the future, it needs to be determined if the ASR system is similarly successful in predicting speech processing in noise and by older people with ARHL.
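
    The reported human-ASR agreement is a straightforward correlation analysis; a sketch with made-up scores (the numbers below are hypothetical stand-ins, not the study's data):

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical % correct scores across ARHL-severity conditions.
human = np.array([92.0, 78.5, 61.0, 44.5, 23.0])  # human listeners
asr = np.array([90.5, 74.0, 55.5, 37.0, 15.5])    # ASR system
r, p = pearsonr(human, asr)
print(f"r = {r:.2f}, p = {p:.4f}")
```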

  15. Minimal Pair Distinctions and Intelligibility in Preschool Children with and without Speech Sound Disorders

    ERIC Educational Resources Information Center

    Hodge, Megan M.; Gotzke, Carrie L.

    2011-01-01

    Listeners' identification of young children's productions of minimally contrastive words and predictive relationships between accurately identified words and intelligibility scores obtained from a 100-word spontaneous speech sample were determined for 36 children with typically developing speech (TDS) and 36 children with speech sound disorders…

  16. Auditory “bubbles”: Efficient classification of the spectrotemporal modulations essential for speech intelligibility

    PubMed Central

    Venezia, Jonathan H.; Hickok, Gregory; Richards, Virginia M.

    2016-01-01

    Speech intelligibility depends on the integrity of spectrotemporal patterns in the signal. The current study is concerned with the speech modulation power spectrum (MPS), which is a two-dimensional representation of energy at different combinations of temporal and spectral (i.e., spectrotemporal) modulation rates. A psychophysical procedure was developed to identify the regions of the MPS that contribute to successful reception of auditory sentences. The procedure, based on the two-dimensional image classification technique known as “bubbles” (Gosselin and Schyns (2001). Vision Res. 41, 2261–2271), involves filtering (i.e., degrading) the speech signal by removing parts of the MPS at random, and relating filter patterns to observer performance (keywords identified) over a number of trials. The result is a classification image (CImg) or “perceptual map” that emphasizes regions of the MPS essential for speech intelligibility. This procedure was tested using normal-rate and 2×-time-compressed sentences. The results indicated: (a) CImgs could be reliably estimated in individual listeners in relatively few trials, (b) CImgs tracked changes in spectrotemporal modulation energy induced by time compression, though not completely, indicating that “perceptual maps” deviated from physical stimulus energy, and (c) the bubbles method captured variance in intelligibility not reflected in a common modulation-based intelligibility metric (spectrotemporal modulation index or STMI). PMID:27586738
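
    The classification-image logic of the bubbles procedure reduces to contrasting the random filter masks from correct and incorrect trials; a minimal sketch under that reading (the array shapes are assumptions):

```python
import numpy as np

def classification_image(masks, correct):
    """'Bubbles' classification image over the modulation power spectrum.

    masks:   (n_trials, n_spec_mod, n_temp_mod) 0/1 arrays of the MPS
             regions retained on each trial.
    correct: (n_trials,) booleans, True when keywords were identified.
    Regions with large positive values were disproportionately present on
    correct trials, i.e., they matter for intelligibility."""
    masks = np.asarray(masks, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    return masks[correct].mean(axis=0) - masks[~correct].mean(axis=0)
```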

  17. The effect of compression and attention allocation on speech intelligibility

    NASA Astrophysics Data System (ADS)

    Choi, Sangsook; Carrell, Thomas

    2003-10-01

    Research investigating the effects of amplitude compression on speech intelligibility for individuals with sensorineural hearing loss has demonstrated contradictory results [Souza and Turner (1999)]. Because percent-correct measures may not be the best indicator of compression effectiveness, a combined speech intelligibility and motor coordination task was developed to provide data that may more thoroughly explain the perception of compressed speech signals. In the present study, a pursuit rotor task [Dlhopolsky (2000)] was employed along with a word identification task to measure the amount of attention required to perceive compressed and non-compressed words in noise. Monosyllabic words were mixed with speech-shaped noise at a fixed signal-to-noise ratio and compressed using a wide dynamic range compression scheme. Participants with normal hearing identified each word with or without a simultaneous pursuit-rotor task. Participants also completed the pursuit-rotor task without simultaneous word presentation. It was expected that performance on the additional motor task would reflect the effect of compression better than simple word-accuracy measures. Results were complex. For example, in some conditions an irrelevant task actually improved performance on a simultaneous listening task. This suggests there might be an optimal level of attention required for recognition of monosyllabic words.

  18. In-flight speech intelligibility evaluation of a service member with sensorineural hearing loss: case report.

    PubMed

    Casto, Kristen L; Cho, Timothy H

    2012-09-01

    This case report describes the in-flight speech intelligibility evaluation of an aircraft crewmember with pure tone audiometric thresholds that exceed the U.S. Army's flight standards. Results of in-flight speech intelligibility testing highlight the inability to predict functional auditory abilities from pure tone audiometry and underscore the importance of conducting validated functional hearing evaluations to determine aviation fitness-for-duty.

  19. A Deep Denoising Autoencoder Approach to Improving the Intelligibility of Vocoded Speech in Cochlear Implant Simulation.

    PubMed

    Lai, Ying-Hui; Chen, Fei; Wang, Syu-Siang; Lu, Xugang; Tsao, Yu; Lee, Chin-Hui

    2017-07-01

    In a cochlear implant (CI) speech processor, noise reduction (NR) is a critical component for enabling CI users to attain improved speech perception under noisy conditions. Identifying an effective NR approach has long been a key topic in CI research. Recently, a deep denoising autoencoder (DDAE) based NR approach was proposed and shown to be effective in restoring clean speech from noisy observations. It was also shown that DDAE could provide better performance than several existing NR methods in standardized objective evaluations. Following this success with normal speech, this paper further investigated the performance of DDAE-based NR to improve the intelligibility of envelope-based vocoded speech, which simulates speech signal processing in existing CI devices. We compared the performance of speech intelligibility between DDAE-based NR and conventional single-microphone NR approaches using the noise vocoder simulation. The results of both objective evaluations and listening tests showed that, under the conditions of nonstationary noise distortion, DDAE-based NR yielded higher intelligibility scores than conventional NR approaches. This study confirmed that DDAE-based NR could potentially be integrated into a CI processor to provide more benefits to CI users under noisy conditions.
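
    As a rough illustration of the generic DDAE idea (not the architecture or features used in the paper), the following PyTorch sketch trains a small feedforward denoising autoencoder to map noisy log-magnitude spectral frames to clean ones; the layer sizes, 257-bin frames, and random placeholder batch are assumptions.

    ```python
    import torch
    import torch.nn as nn

    class DDAE(nn.Module):
        """Minimal deep denoising autoencoder over spectral frames."""
        def __init__(self, n_bins=257, hidden=512):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(n_bins, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, n_bins),
            )

        def forward(self, x):
            return self.net(x)

    model = DDAE()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    def train_step(noisy, clean):  # tensors of shape (batch, n_bins)
        opt.zero_grad()
        loss = loss_fn(model(noisy), clean)
        loss.backward()
        opt.step()
        return loss.item()

    # Placeholder batch; in practice: paired noisy/clean speech spectra.
    print(train_step(torch.randn(8, 257), torch.randn(8, 257)))
    ```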

  20. Comparison of speech intelligibility in cockpit noise using SPH-4 flight helmet with and without active noise reduction

    NASA Technical Reports Server (NTRS)

    Chan, Jeffrey W.; Simpson, Carol A.

    1990-01-01

    Active Noise Reduction (ANR) is a new technology which can reduce the level of aircraft cockpit noise that reaches the pilot's ear while simultaneously improving the signal to noise ratio for voice communications and other information bearing sound signals in the cockpit. A miniature, ear-cup mounted ANR system was tested to determine whether speech intelligibility is better for helicopter pilots using ANR compared to a control condition of ANR turned off. Two signal to noise ratios (S/N), representative of actual cockpit conditions, were used for the ratio of the speech to cockpit noise sound pressure levels. Speech intelligibility was significantly better with ANR compared to no ANR for both S/N conditions. Variability of speech intelligibility among pilots was also significantly less with ANR. When the stock helmet was used with ANR turned off, the average PB Word speech intelligibility score was below the Normally Acceptable level. In comparison, it was above that level with ANR on in both S/N levels.

  1. Production Variability and Single Word Intelligibility in Aphasia and Apraxia of Speech

    ERIC Educational Resources Information Center

    Haley, Katarina L.; Martin, Gwenyth

    2011-01-01

    This study was designed to estimate test-retest reliability of orthographic speech intelligibility testing in speakers with aphasia and AOS and to examine its relationship to the consistency of speaker and listener responses. Monosyllabic single word speech samples were recorded from 13 speakers with coexisting aphasia and AOS. These words were…

  2. Functional Connectivity between Face-Movement and Speech-Intelligibility Areas during Auditory-Only Speech Perception

    PubMed Central

    Schall, Sonja; von Kriegstein, Katharina

    2014-01-01

    It has been proposed that internal simulation of the talking face of visually-known speakers facilitates auditory speech recognition. One prediction of this view is that brain areas involved in auditory-only speech comprehension interact with visual face-movement sensitive areas, even under auditory-only listening conditions. Here, we test this hypothesis using connectivity analyses of functional magnetic resonance imaging (fMRI) data. Participants (17 normal participants, 17 developmental prosopagnosics) first learned six speakers via brief voice-face or voice-occupation training (<2 min/speaker). This was followed by an auditory-only speech recognition task and a control task (voice recognition) involving the learned speakers’ voices in the MRI scanner. As hypothesized, we found that, during speech recognition, familiarity with the speaker’s face increased the functional connectivity between the face-movement sensitive posterior superior temporal sulcus (STS) and an anterior STS region that supports auditory speech intelligibility. There was no difference between normal participants and prosopagnosics. This was expected because previous findings have shown that both groups use the face-movement sensitive STS to optimize auditory-only speech comprehension. Overall, the present findings indicate that learned visual information is integrated into the analysis of auditory-only speech and that this integration results from the interaction of task-relevant face-movement and auditory speech-sensitive areas. PMID:24466026

  3. Cortical characterization of the perception of intelligible and unintelligible speech measured via high-density electroencephalography.

    PubMed

    Utianski, Rene L; Caviness, John N; Liss, Julie M

    2015-01-01

    High-density electroencephalography was used to evaluate cortical activity during speech comprehension via a sentence verification task. Twenty-four participants assigned true or false to sentences produced with 3 noise-vocoded channel levels (1--unintelligible, 6--decipherable, 16--intelligible), during simultaneous EEG recording. Participant data were sorted into higher- (HP) and lower-performing (LP) groups. The identification of a late event-related potential for LP listeners in the intelligible condition and in all listeners when challenged with a 6-Ch signal supports the notion that this induced potential may be related to either processing degraded speech, or degraded processing of intelligible speech. Different cortical locations are identified as neural generators responsible for this activity; HP listeners are engaging motor aspects of their language system, utilizing an acoustic-phonetic based strategy to help resolve the sentence, while LP listeners do not. This study presents evidence for neurophysiological indices associated with more or less successful speech comprehension performance across listening conditions. Copyright © 2014 Elsevier Inc. All rights reserved.

  4. The role of accent imitation in sensorimotor integration during processing of intelligible speech

    PubMed Central

    Adank, Patti; Rueschemeyer, Shirley-Ann; Bekkering, Harold

    2013-01-01

    Recent theories on how listeners maintain perceptual invariance despite variation in the speech signal allocate a prominent role to imitation mechanisms. Notably, these simulation accounts propose that motor mechanisms support perception of ambiguous or noisy signals. Indeed, imitation of ambiguous signals, e.g., accented speech, has been found to aid effective speech comprehension. Here, we explored the possibility that imitation in speech benefits perception by increasing activation in speech perception and production areas. Participants rated the intelligibility of sentences spoken in an unfamiliar accent of Dutch in a functional Magnetic Resonance Imaging experiment. Next, participants in one group repeated the sentences in their own accent, while a second group vocally imitated the accent. Finally, both groups rated the intelligibility of accented sentences in a post-test. The neuroimaging results showed an interaction between type of training and pre- and post-test sessions in left Inferior Frontal Gyrus, Supplementary Motor Area, and left Superior Temporal Sulcus. Although alternative explanations such as task engagement and fatigue need to be considered as well, the results suggest that imitation may aid effective speech comprehension by supporting sensorimotor integration. PMID:24109447

  5. The Impact of Dysphonic Voices on Healthy Listeners: Listener Reaction Times, Speech Intelligibility, and Listener Comprehension.

    PubMed

    Evitts, Paul M; Starmer, Heather; Teets, Kristine; Montgomery, Christen; Calhoun, Lauren; Schulze, Allison; MacKenzie, Jenna; Adams, Lauren

    2016-11-01

    There is currently minimal information on the impact of dysphonia secondary to phonotrauma on listeners. Considering the high incidence of voice disorders with professional voice users, it is important to understand the impact of a dysphonic voice on their audiences. Ninety-one healthy listeners (39 men, 52 women; mean age = 23.62 years) were presented with speech stimuli from 5 healthy speakers and 5 speakers diagnosed with dysphonia secondary to phonotrauma. Dependent variables included processing speed (reaction time [RT] ratio), speech intelligibility, and listener comprehension. Voice quality ratings were also obtained for all speakers by 3 expert listeners. Statistical results showed significant differences in RT ratio and number of speech intelligibility errors between healthy and dysphonic voices. There was not a significant difference in listener comprehension errors. Multiple regression analyses showed that voice quality ratings from the Consensus Auditory-Perceptual Evaluation of Voice (Kempster, Gerratt, Verdolini Abbott, Barkmeier-Kraemer, & Hillman, 2009) were able to predict RT ratio and speech intelligibility but not listener comprehension. Results of the study suggest that although listeners require more time to process and have more intelligibility errors when presented with speech stimuli from speakers with dysphonia secondary to phonotrauma, listener comprehension may not be affected.

  6. Improving Speech Intelligibility in Children with Childhood Apraxia of Speech: Employing Evidence-Based Practice. EBP Briefs. Volume 9, Issue 5

    ERIC Educational Resources Information Center

    Koehlinger, Keegan M.

    2015-01-01

    Clinical Question: Would a preschool-aged child with childhood apraxia of speech (CAS) benefit from a singular approach--such as motor planning, sensory cueing, linguistic and rhythmic--or a combined approach in order to increase intelligibility of spoken language? Method: Systematic Review. Study Sources: ASHA Wire, Google Scholar, Speech Bite.…

  7. Lexical effects on speech production and intelligibility in Parkinson's disease

    NASA Astrophysics Data System (ADS)

    Chiu, Yi-Fang

    Individuals with Parkinson's disease (PD) often have speech deficits that lead to reduced speech intelligibility. Previous research provides a rich database regarding the articulatory deficits associated with PD including restricted vowel space (Skodda, Visser, & Schlegel, 2011) and flatter formant transitions (Tjaden & Wilding, 2004; Walsh & Smith, 2012). However, few studies consider the effect of higher level structural variables of word usage frequency and the number of similar sounding words (i.e. neighborhood density) on lower level articulation or on listeners' perception of dysarthric speech. The purpose of the study is to examine the interaction of lexical properties and speech articulation as measured acoustically in speakers with PD and healthy controls (HC) and the effect of lexical properties on the perception of their speech. Individuals diagnosed with PD and age-matched healthy controls read sentences with words that varied in word frequency and neighborhood density. Acoustic analysis was performed to compare second formant transitions in diphthongs, an indicator of the dynamics of tongue movement during speech production, across different lexical characteristics. Young listeners transcribed the spoken sentences and the transcription accuracy was compared across lexical conditions. The acoustic results indicate that both PD and HC speakers adjusted their articulation based on lexical properties but the PD group had significant reductions in second formant transitions compared to HC. Both groups of speakers increased second formant transitions for words with low frequency and low density, but the lexical effect is diphthong dependent. The change in second formant slope was limited in the PD group when the required formant movement for the diphthong is small. The data from listeners' perception of the speech by PD and HC show that listeners identified high frequency words with greater accuracy, suggesting the use of lexical knowledge during the perception of dysarthric speech.

  8. Speech Intelligibility and Psychosocial Functioning in Deaf Children and Teens with Cochlear Implants

    ERIC Educational Resources Information Center

    Freeman, Valerie; Pisoni, David B.; Kronenberger, William G.; Castellanos, Irina

    2017-01-01

    Deaf children with cochlear implants (CIs) are at risk for psychosocial adjustment problems, possibly due to delayed speech-language skills. This study investigated associations between a core component of spoken-language ability--speech intelligibility--and the psychosocial development of prelingually deaf CI users. Audio-transcription measures…

  9. Computer-Mediated Assessment of Intelligibility in Aphasia and Apraxia of Speech

    PubMed Central

    Haley, Katarina L.; Roth, Heidi; Grindstaff, Enetta; Jacks, Adam

    2011-01-01

    Background Previous work indicates that single word intelligibility tests developed for dysarthria are sensitive to segmental production errors in aphasic individuals with and without apraxia of speech. However, potential listener learning effects and difficulties adapting elicitation procedures to coexisting language impairments limit their applicability to left hemisphere stroke survivors. Aims The main purpose of this study was to examine basic psychometric properties for a new monosyllabic intelligibility test developed for individuals with aphasia and/or AOS. A related purpose was to examine clinical feasibility and potential to standardize a computer-mediated administration approach. Methods & Procedures A 600-item monosyllabic single word intelligibility test was constructed by assembling sets of phonetically similar words. Custom software was used to select 50 target words from this test in a pseudo-random fashion and to elicit and record production of these words by 23 speakers with aphasia and 20 neurologically healthy participants. To evaluate test-retest reliability, two identical sets of 50-word lists were elicited by requesting repetition after a live speaker model. To examine the effect of a different word set and auditory model, an additional set of 50 different words was elicited with a pre-recorded model. The recorded words were presented to normal-hearing listeners for identification via orthographic and multiple-choice response formats. To examine construct validity, production accuracy for each speaker was estimated via phonetic transcription and rating of overall articulation. Outcomes & Results Recording and listening tasks were completed in less than six minutes for all speakers and listeners. Aphasic speakers were significantly less intelligible than neurologically healthy speakers and displayed a wide range of intelligibility scores. Test-retest and inter-listener reliability estimates were strong. No significant difference was found in

  10. Direct magnitude estimates of speech intelligibility in dysarthria: effects of a chosen standard.

    PubMed

    Weismer, Gary; Laures, Jacqueline S

    2002-06-01

    Direct magnitude estimation (DME) has been used frequently as a perceptual scaling technique in studies of the speech intelligibility of persons with speech disorders. The technique is typically used with a standard, or reference stimulus, chosen as a good exemplar of "midrange" intelligibility. In several published studies, the standard has been chosen subjectively, usually on the basis of the expertise of the investigators. The current experiment demonstrates that a fixed set of sentence-level utterances, obtained from 4 individuals with dysarthria (2 with Parkinson disease, 2 with traumatic brain injury) as well as 3 neurologically normal speakers, is scaled differently depending on the identity of the standard. Four different standards were used in the main experiment, three of which were judged qualitatively in two independent evaluations to be good exemplars of midrange intelligibility. Acoustic analyses did not reveal obvious differences between these four standards but suggested that the standard with the worst-scaled intelligibility had much poorer voice source characteristics compared to the other three standards. Results are discussed in terms of possible standardization of midrange intelligibility exemplars for DME experiments.

  11. Contributions of cochlea-scaled entropy and consonant-vowel boundaries to prediction of speech intelligibility in noise

    PubMed Central

    Chen, Fei; Loizou, Philipos C.

    2012-01-01

    Recent evidence suggests that spectral change, as measured by cochlea-scaled entropy (CSE), predicts speech intelligibility better than the information carried by vowels or consonants in sentences. Motivated by this finding, the present study investigates whether intelligibility indices implemented to include segments marked with significant spectral change better predict speech intelligibility in noise than measures that include all phonetic segments paying no attention to vowels/consonants or spectral change. The prediction of two intelligibility measures [normalized covariance measure (NCM), coherence-based speech intelligibility index (CSII)] is investigated using three sentence-segmentation methods: relative root-mean-square (RMS) levels, CSE, and traditional phonetic segmentation of obstruents and sonorants. While the CSE method makes no distinction between spectral changes occurring within vowels/consonants, the RMS-level segmentation method places more emphasis on the vowel-consonant boundaries wherein the spectral change is often most prominent, and perhaps most robust, in the presence of noise. Higher correlation with intelligibility scores was obtained when including sentence segments containing a large number of consonant-vowel boundaries than when including segments with highest entropy or segments based on obstruent/sonorant classification. These data suggest that in the context of intelligibility measures the type of spectral change captured by the measure is important. PMID:22559382
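
    A minimal sketch of the relative-RMS segmentation step, under assumed frame length and level thresholds (the paper's exact values may differ): frames are flagged when their RMS level, relative to the sentence's peak frame, falls inside a chosen band.

    ```python
    import numpy as np

    def relative_rms_segments(x, fs, frame_ms=16, lo_db=-10.0, hi_db=0.0):
        """Flag frames whose RMS level relative to the sentence peak lies
        in [lo_db, hi_db); thresholds here are illustrative only."""
        n = int(fs * frame_ms / 1000)
        n_frames = len(x) // n
        frames = x[: n_frames * n].reshape(n_frames, n)
        rms = np.sqrt(np.mean(frames ** 2, axis=1) + 1e-12)
        rel_db = 20 * np.log10(rms / rms.max())
        return (rel_db >= lo_db) & (rel_db < hi_db)  # boolean frame mask

    # Demo on noise; real use: a speech-in-noise sentence waveform.
    x = np.random.default_rng(0).standard_normal(16000)
    print(relative_rms_segments(x, 16000).sum(), "frames selected")
    ```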

  12. A Cross-Language Study of Acoustic Predictors of Speech Intelligibility in Individuals With Parkinson's Disease

    PubMed Central

    Choi, Yaelin

    2017-01-01

    Purpose The present study aimed to compare acoustic models of speech intelligibility in individuals with the same disease (Parkinson's disease [PD]) and presumably similar underlying neuropathologies but with different native languages (American English [AE] and Korean). Method A total of 48 speakers from the 4 speaker groups (AE speakers with PD, Korean speakers with PD, healthy English speakers, and healthy Korean speakers) were asked to read a paragraph in their native languages. Four acoustic variables were analyzed: acoustic vowel space, voice onset time contrast scores, normalized pairwise variability index, and articulation rate. Speech intelligibility scores were obtained from scaled estimates of sentences extracted from the paragraph. Results The findings indicated that the multiple regression models of speech intelligibility were different in Korean and AE, even with the same set of predictor variables and with speakers matched on speech intelligibility across languages. Analysis of the descriptive data for the acoustic variables showed the expected compression of the vowel space in speakers with PD in both languages, lower normalized pairwise variability index scores in Korean compared with AE, and no differences within or across language in articulation rate. Conclusions The results indicate that the basis of an intelligibility deficit in dysarthria is likely to depend on the native language of the speaker and listener. Additional research is required to explore other potential predictor variables, as well as additional language comparisons to pursue cross-linguistic considerations in classification and diagnosis of dysarthria types. PMID:28821018

  13. Examining explanations for fundamental frequency's contribution to speech intelligibility in noise

    NASA Astrophysics Data System (ADS)

    Schlauch, Robert S.; Miller, Sharon E.; Watson, Peter J.

    2005-09-01

    Laures and Weismer [JSLHR, 42, 1148 (1999)] reported that speech with natural variation in fundamental frequency (F0) is more intelligible in noise than speech with a flattened F0 contour. Cognitive-linguistic based explanations have been offered to account for this drop in intelligibility for the flattened condition, but a lower-level mechanism related to auditory streaming may be responsible. Numerous psychoacoustic studies have demonstrated that modulating a tone enables a listener to segregate it from background sounds. To test these rival hypotheses, speech recognition in noise was measured for sentences with six different F0 contours: unmodified, flattened at the mean, natural but exaggerated, reversed, and frequency modulated (rates of 2.5 and 5.0 Hz). The 180 stimulus sentences were produced by five talkers (30 sentences per condition). Speech recognition data for fifteen listeners replicated earlier findings, showing that flattening the F0 contour results in a roughly 10% reduction in recognition of key words compared with the natural condition. Although the exaggerated condition produced results comparable to those of the flattened condition, the other conditions with unnatural F0 contours all yielded significantly poorer performance than the flattened condition. These results support the cognitive, linguistic-based explanations for the reduction in performance.
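
    One way to construct a "flattened at the mean" condition is vocoder analysis-resynthesis. The sketch below uses the WORLD vocoder through the pyworld package (assumed installed, with its documented harvest/cheaptrick/d4c/synthesize calls); this is a plausible reconstruction of the manipulation, not necessarily how the study's stimuli were made.

    ```python
    import numpy as np
    import pyworld as pw  # WORLD vocoder bindings (assumed available)

    def flatten_f0(x, fs):
        """Resynthesize speech with F0 fixed at the mean of voiced frames,
        leaving the spectral envelope and aperiodicity untouched."""
        x = np.ascontiguousarray(x, dtype=np.float64)
        f0, t = pw.harvest(x, fs)          # F0 contour
        sp = pw.cheaptrick(x, f0, t, fs)   # spectral envelope
        ap = pw.d4c(x, f0, t, fs)          # aperiodicity
        voiced = f0 > 0
        f0_flat = np.where(voiced, f0[voiced].mean(), 0.0)
        return pw.synthesize(f0_flat, sp, ap, fs)
    ```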

  14. [The Freiburg speech intelligibility test : A pillar of speech audiometry in German-speaking countries].

    PubMed

    Hoth, S

    2016-08-01

    The Freiburg speech intelligibility test according to DIN 45621 was introduced around 60 years ago. For decades, and still today, the Freiburg test has been a standard whose relevance extends far beyond pure audiometry. It is used primarily to determine the speech perception threshold (based on two-digit numbers) and the ability to discriminate speech at suprathreshold presentation levels (based on monosyllabic nouns). Moreover, it is a measure of the degree of disability, the requirement for and success of technical hearing aids (auxiliaries directives), and the compensation for disability and handicap (Königstein recommendation). In differential audiological diagnostics, the Freiburg test contributes to the distinction between low- and high-frequency hearing loss, as well as to identification of conductive, sensory, neural, and central disorders. Currently, the phonemic and perceptual balance of the monosyllabic test lists is subject to critical discussions. Obvious deficiencies exist for testing speech recognition in noise. In this respect, alternatives such as sentence or rhyme tests in closed-answer inventories are discussed.

  15. Factors Affecting Acoustics and Speech Intelligibility in the Operating Room: Size Matters.

    PubMed

    McNeer, Richard R; Bennett, Christopher L; Horn, Danielle Bodzin; Dudaryk, Roman

    2017-06-01

    Noise in health care settings has increased since 1960 and represents a significant source of dissatisfaction among staff and patients and risk to patient safety. Operating rooms (ORs) in which effective communication is crucial are particularly noisy. Speech intelligibility is impacted by noise, room architecture, and acoustics. For example, sound reverberation time (RT60) increases with room size, which can negatively impact intelligibility, while room objects are hypothesized to have the opposite effect. We explored these relationships by investigating room construction and acoustics of the surgical suites at our institution. We studied our ORs during times of nonuse. Room dimensions were measured to calculate room volumes (VR). Room content was assessed by estimating size and assigning items into 5 volume categories to arrive at an adjusted room content volume (VC) metric. Psychoacoustic analyses were performed by playing sweep tones from a speaker and recording the impulse responses (ie, resulting sound fields) from 3 locations in each room. The recordings were used to calculate 6 psychoacoustic indices of intelligibility. Multiple linear regression was performed using VR and VC as predictor variables and each intelligibility index as an outcome variable. A total of 40 ORs were studied. The surgical suites were characterized by a large degree of construction and surface finish heterogeneity and varied in size from 71.2 to 196.4 m³ (average VR = 131.1 [34.2] m³). An insignificant correlation was observed between VR and VC (Pearson correlation = 0.223, P = .166). Multiple linear regression model fits and β coefficients for VR were highly significant for each of the intelligibility indices and were best for RT60 (R² = 0.666, F(2, 37) = 39.9, P < .0001). For Dmax (maximum distance where there is <15% loss of consonant articulation), both VR and VC β coefficients were significant. For RT60 and Dmax, after controlling for VC, partial correlations were 0.825 (P

  16. Factors Affecting Acoustics and Speech Intelligibility in the Operating Room: Size Matters

    PubMed Central

    Bennett, Christopher L.; Horn, Danielle Bodzin; Dudaryk, Roman

    2017-01-01

    INTRODUCTION: Noise in health care settings has increased since 1960 and represents a significant source of dissatisfaction among staff and patients and risk to patient safety. Operating rooms (ORs) in which effective communication is crucial are particularly noisy. Speech intelligibility is impacted by noise, room architecture, and acoustics. For example, sound reverberation time (RT60) increases with room size, which can negatively impact intelligibility, while room objects are hypothesized to have the opposite effect. We explored these relationships by investigating room construction and acoustics of the surgical suites at our institution. METHODS: We studied our ORs during times of nonuse. Room dimensions were measured to calculate room volumes (VR). Room content was assessed by estimating size and assigning items into 5 volume categories to arrive at an adjusted room content volume (VC) metric. Psychoacoustic analyses were performed by playing sweep tones from a speaker and recording the impulse responses (ie, resulting sound fields) from 3 locations in each room. The recordings were used to calculate 6 psychoacoustic indices of intelligibility. Multiple linear regression was performed using VR and VC as predictor variables and each intelligibility index as an outcome variable. RESULTS: A total of 40 ORs were studied. The surgical suites were characterized by a large degree of construction and surface finish heterogeneity and varied in size from 71.2 to 196.4 m³ (average VR = 131.1 [34.2] m³). An insignificant correlation was observed between VR and VC (Pearson correlation = 0.223, P = .166). Multiple linear regression model fits and β coefficients for VR were highly significant for each of the intelligibility indices and were best for RT60 (R² = 0.666, F(2, 37) = 39.9, P < .0001). For Dmax (maximum distance where there is <15% loss of consonant articulation), both VR and VC β coefficients were significant. For RT60 and Dmax, after controlling for VC
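
    For context, RT60 is conventionally estimated from a measured impulse response by Schroeder backward integration of the squared response, a line fit over part of the decay, and extrapolation to -60 dB. The sketch below follows that textbook procedure; the fit range is an assumption, and the authors' exact pipeline may differ.

    ```python
    import numpy as np

    def rt60_schroeder(ir, fs, lo_db=-5.0, hi_db=-25.0):
        """Estimate RT60 from a room impulse response: backward-integrate
        the squared IR (Schroeder curve), fit a line between lo_db and
        hi_db, and extrapolate the decay slope to -60 dB."""
        energy = np.asarray(ir, dtype=float) ** 2
        edc = np.cumsum(energy[::-1])[::-1]            # energy decay curve
        edc_db = 10 * np.log10(edc / edc[0] + 1e-12)
        t = np.arange(len(edc)) / fs
        sel = (edc_db <= lo_db) & (edc_db >= hi_db)
        slope, _ = np.polyfit(t[sel], edc_db[sel], 1)  # dB per second
        return -60.0 / slope

    # Demo with a synthetic exponentially decaying noise "impulse response".
    fs = 16000
    n = np.arange(fs)
    ir = np.random.default_rng(0).standard_normal(fs) * np.exp(-3.0 * n / fs)
    print(round(rt60_schroeder(ir, fs), 2), "s")
    ```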

  17. Listening with a foreign-accent: The interlanguage speech intelligibility benefit in Mandarin speakers of English

    PubMed Central

    Xie, Xin; Fowler, Carol A.

    2013-01-01

    This study examined the intelligibility of native and Mandarin-accented English speech for native English and native Mandarin listeners. In the latter group, it also examined the role of the language environment and English proficiency. Three groups of listeners were tested: native English listeners (NE), Mandarin-speaking Chinese listeners in the US (M-US) and Mandarin listeners in Beijing, China (M-BJ). As a group, M-US and M-BJ listeners were matched on English proficiency and age of acquisition. A nonword transcription task was used. Identification accuracy for word-final stops in the nonwords established two independent interlanguage intelligibility effects. An interlanguage speech intelligibility benefit for listeners (ISIB-L) was manifest by both groups of Mandarin listeners outperforming native English listeners in identification of Mandarin-accented speech. In the benefit for talkers (ISIB-T), only M-BJ listeners were more accurate identifying Mandarin-accented speech than native English speech. Thus, both Mandarin groups demonstrated an ISIB-L while only the M-BJ group overall demonstrated an ISIB-T. The English proficiency of listeners was found to modulate the magnitude of the ISIB-T in both groups. Regression analyses also suggested that the listener groups differ in their use of acoustic information to identify voicing in stop consonants. PMID:24293741

  18. Comparing Binaural Pre-processing Strategies II: Speech Intelligibility of Bilateral Cochlear Implant Users.

    PubMed

    Baumgärtel, Regina M; Hu, Hongmei; Krawczyk-Becker, Martin; Marquardt, Daniel; Herzke, Tobias; Coleman, Graham; Adiloğlu, Kamil; Bomke, Katrin; Plotz, Karsten; Gerkmann, Timo; Doclo, Simon; Kollmeier, Birger; Hohmann, Volker; Dietz, Mathias

    2015-12-30

    Several binaural audio signal enhancement algorithms were evaluated with respect to their potential to improve speech intelligibility in noise for users of bilateral cochlear implants (CIs). 50% speech reception thresholds (SRT50) were assessed using an adaptive procedure in three distinct, realistic noise scenarios. All scenarios were highly nonstationary, complex, and included a significant amount of reverberation. Other aspects, such as the perfectly frontal target position, were idealized laboratory settings, allowing the algorithms to perform better than in corresponding real-world conditions. Eight bilaterally implanted CI users, wearing devices from three manufacturers, participated in the study. In all noise conditions, a substantial improvement in SRT50 compared to the unprocessed signal was observed for most of the algorithms tested, with the largest improvements generally provided by binaural minimum variance distortionless response (MVDR) beamforming algorithms. The largest overall improvement in speech intelligibility was achieved by an adaptive binaural MVDR in a spatially separated, single competing talker noise scenario. A no-pre-processing condition and adaptive differential microphones without a binaural link served as the two baseline conditions. SRT50 improvements provided by the binaural MVDR beamformers surpassed the performance of the adaptive differential microphones in most cases. Speech intelligibility improvements predicted by instrumental measures were shown to account for some but not all aspects of the perceptually obtained SRT50 improvements measured in bilaterally implanted CI users. © The Author(s) 2015.
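
    For reference, the core of an MVDR beamformer at a single frequency bin is the closed-form weight vector w = R⁻¹d / (dᴴR⁻¹d), which minimizes output noise power while passing the look direction undistorted. A numpy sketch, assuming a noise covariance estimate R and steering vector d are available (both placeholders here):

    ```python
    import numpy as np

    def mvdr_weights(R_noise, d):
        """MVDR weights for one frequency bin: minimize noise power
        subject to a distortionless (unity) response toward d."""
        Ri = np.linalg.pinv(R_noise)       # (robustly) inverted covariance
        num = Ri @ d
        return num / (d.conj() @ num)

    def apply_beamformer(X, w):
        """X: (n_mics, n_frames) complex STFT bin; returns enhanced bin."""
        return w.conj() @ X

    # Demo with random placeholders for R and d (2 microphones).
    rng = np.random.default_rng(0)
    A = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
    R = A @ A.conj().T + np.eye(2)         # Hermitian positive definite
    d = np.array([1.0 + 0j, 1.0 + 0j])     # frontal steering (placeholder)
    print(apply_beamformer(rng.standard_normal((2, 5)) + 0j, mvdr_weights(R, d)))
    ```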

  19. On the relationship between auditory cognition and speech intelligibility in cochlear implant users: An ERP study.

    PubMed

    Finke, Mareike; Büchner, Andreas; Ruigendijk, Esther; Meyer, Martin; Sandmann, Pascale

    2016-07-01

    There is a high degree of variability in speech intelligibility outcomes across cochlear-implant (CI) users. To better understand how auditory cognition affects speech intelligibility with the CI, we performed an electroencephalography study in which we examined the relationship between central auditory processing, cognitive abilities, and speech intelligibility. Postlingually deafened CI users (N=13) and matched normal-hearing (NH) listeners (N=13) performed an oddball task with words presented in different background conditions (quiet, stationary noise, modulated noise). Participants had to categorize words as living (targets) or non-living entities (standards). We also assessed participants' working memory (WM) capacity and verbal abilities. For the oddball task, we found lower hit rates and prolonged response times in CI users when compared with NH listeners. Noise-related prolongation of the N1 amplitude was found for all participants. Further, we observed group-specific modulation effects of event-related potentials (ERPs) as a function of background noise. While NH listeners showed stronger noise-related modulation of the N1 latency, CI users revealed enhanced modulation effects of the N2/N4 latency. In general, higher-order processing (N2/N4, P3) was prolonged in CI users in all background conditions when compared with NH listeners. Longer N2/N4 latency in CI users suggests that these individuals have difficulties mapping acoustic-phonetic features to lexical representations. These difficulties seem to be increased for speech-in-noise conditions when compared with speech in a quiet background. Correlation analyses showed that shorter ERP latencies were related to enhanced speech intelligibility (N1, N2/N4), better lexical fluency (N1), and lower ratings of listening effort (N2/N4) in CI users. In sum, our findings suggest that CI users and NH listeners differ with regard to both the sensory and the higher-order processing of speech in quiet as well as in noise.

  20. Effects of reverberation and noise on speech intelligibility in normal-hearing and aided hearing-impaired listeners.

    PubMed

    Xia, Jing; Xu, Buye; Pentony, Shareka; Xu, Jingjing; Swaminathan, Jayaganesh

    2018-03-01

    Many hearing-aid wearers have difficulties understanding speech in reverberant noisy environments. This study evaluated the effects of reverberation and noise on speech recognition in normal-hearing listeners and hearing-impaired listeners wearing hearing aids. Sixteen typical acoustic scenes with different amounts of reverberation and various types of noise maskers were simulated using a loudspeaker array in an anechoic chamber. Results showed that, across all listening conditions, speech intelligibility of aided hearing-impaired listeners was poorer than normal-hearing counterparts. Once corrected for ceiling effects, the differences in the effects of reverberation on speech intelligibility between the two groups were much smaller. This suggests that, at least, part of the difference in susceptibility to reverberation between normal-hearing and hearing-impaired listeners was due to ceiling effects. Across both groups, a complex interaction between the noise characteristics and reverberation was observed on the speech intelligibility scores. Further fine-grained analyses of the perception of consonants showed that, for both listener groups, final consonants were more susceptible to reverberation than initial consonants. However, differences in the perception of specific consonant features were observed between the groups.

  1. Intelligibility of Digital Speech Masked by Noise: Normal Hearing and Hearing Impaired Listeners

    DTIC Science & Technology

    1990-06-01

    spectrograms of these phrases were generated by a List Processing Language (LISP) program on a Symbolics 3670 artificial intelligence computer (see Figure 10). The…speech and the amount of difference varies with the type of vocoder. [Figure residue: a chart of ADPCM intelligibility by type of masking; values not recoverable.]

  2. An Alternative to the Computational Speech Intelligibility Index Estimates: Direct Measurement of Rectangular Passband Intelligibilities

    ERIC Educational Resources Information Center

    Warren, Richard M.; Bashford, James A., Jr.; Lenz, Peter W.

    2011-01-01

    The need for determining the relative intelligibility of passbands spanning the speech spectrum has been addressed by publications of the American National Standards Institute (ANSI). When the Articulation Index (AI) standard (ANSI, S3.5, 1969, R1986) was developed, available filters confounded passband and slope contributions. The AI procedure…

  3. Speech intelligibility and subjective benefit in single-sided deaf adults after cochlear implantation.

    PubMed

    Finke, Mareike; Strauß-Schier, Angelika; Kludt, Eugen; Büchner, Andreas; Illg, Angelika

    2017-05-01

    Treatment with cochlear implants (CIs) in single-sided deaf individuals started less than a decade ago. CIs can successfully reduce incapacitating tinnitus on the deaf ear and allow, to some extent, the restoration of binaural hearing. Until now, systematic evaluations of subjective CI benefit in post-lingually single-sided deaf individuals and analyses of speech intelligibility outcome for the CI in isolation have been lacking. For the prospective part of this study, the Bern Benefit in Single-Sided Deafness Questionnaire (BBSS) was administered to 48 single-sided deaf CI users to evaluate the subjectively perceived CI benefit across different listening situations. In the retrospective part, speech intelligibility outcome with the CI up to 12 months post-activation was compared between 100 single-sided deaf CI users and 125 bilaterally implanted CI users (2nd implant). The positive median ratings in the BBSS differed significantly from zero for all items, suggesting that most individuals with single-sided deafness rate their CI as beneficial across listening situations. The speech perception scores in quiet and noise improved significantly over time in both groups of CI users. Speech intelligibility with the CI in isolation was significantly better in bilaterally implanted CI users (2nd implant) compared with the scores obtained from single-sided deaf CI users. Our results indicate that CI users with single-sided deafness can reach open-set speech understanding with their CI in isolation, encouraging the extension of the CI indication to individuals with normal hearing on the contralateral ear. Compared with the performance reached with bilateral CI users' second implant, speech reception thresholds are lower, indicating an aural preference for and dominance of the normal-hearing ear. The results from the BBSS suggest good satisfaction with the CI across several listening situations. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. Performance in noise: Impact of reduced speech intelligibility on Sailor performance in a Navy command and control environment.

    PubMed

    Keller, M David; Ziriax, John M; Barns, William; Sheffield, Benjamin; Brungart, Douglas; Thomas, Tony; Jaeger, Bobby; Yankaskas, Kurt

    2017-06-01

    Noise, hearing loss, and electronic signal distortion, which are common problems in military environments, can impair speech intelligibility and thereby jeopardize mission success. The current study investigated the impact that impaired communication has on operational performance in a command and control environment by parametrically degrading speech intelligibility in a simulated shipborne Combat Information Center. Experienced U.S. Navy personnel served as the study participants and were required to monitor information from multiple sources and respond appropriately to communications initiated by investigators playing the roles of other personnel involved in a realistic Naval scenario. In each block of the scenario, an adaptive intelligibility modification system employing automatic gain control was used to adjust the signal-to-noise ratio to achieve one of four speech intelligibility levels on a Modified Rhyme Test: No Loss, 80%, 60%, or 40%. Objective and subjective measures of operational performance suggested that performance systematically degraded with decreasing speech intelligibility, with the largest drop occurring between 80% and 60%. These results confirm the importance of noise reduction, good communication design, and effective hearing conservation programs to maximize the operational effectiveness of military personnel. Published by Elsevier B.V.

  5. Evolution of the speech intelligibility of prelinguistically deaf children who received a cochlear implant

    NASA Astrophysics Data System (ADS)

    Bouchard, Marie-Eve; Cohen, Henri; Lenormand, Marie-Therese

    2005-04-01

    The 2 main objectives of this investigation are (1) to assess the evolution of the speech intelligibility of 12 prelinguistically deaf children implanted between 25 and 78 months of age and (2) to clarify the influence of age at implantation on intelligibility. Speech productions were video-recorded at 6, 18, and 36 months following surgery during a standardized free-play session. Selected syllables were then presented to 40 adult listeners who were asked to identify the vowels or consonants they heard and to judge the quality of the segments. Perceived vowels were then located in the vocalic space, whereas consonants were classified according to voicing, manner, and place of articulation. A 3 (Groups) × 3 (Times) repeated-measures ANOVA revealed a clear influence of time as well as age at implantation on the acquisition patterns. Speech intelligibility of these implanted children tended to improve as their experience with the device increased. Based on these results, it is proposed that sensory restoration following cochlear implantation served as a probe for developing articulatory strategies, allowing the children to reach the intended acoustico-perceptual targets.

  6. Intelligibility of foreign-accented speech: Effects of listening condition, listener age, and listener hearing status

    NASA Astrophysics Data System (ADS)

    Ferguson, Sarah Hargus

    2005-09-01

    It is well known that, for listeners with normal hearing, speech produced by non-native speakers of the listener's first language is less intelligible than speech produced by native speakers. Intelligibility is well correlated with listeners' ratings of talker comprehensibility and accentedness, which have been shown to be related to several talker factors, including age of second language acquisition and level of similarity between the talker's native and second language phoneme inventories. Relatively few studies have focused on factors extrinsic to the talker. The current project explored the effects of listener and environmental factors on the intelligibility of foreign-accented speech. Specifically, monosyllabic English words previously recorded from two talkers, one a native speaker of American English and the other a native speaker of Spanish, were presented to three groups of listeners (young listeners with normal hearing, elderly listeners with normal hearing, and elderly listeners with hearing impairment; n=20 each) in three different listening conditions (undistorted words in quiet, undistorted words in 12-talker babble, and filtered words in quiet). Data analysis will focus on interactions between talker accent, listener age, listener hearing status, and listening condition. [Project supported by American Speech-Language-Hearing Association AARC Award.]

  7. Construct-related validity of the TOCS measures: comparison of intelligibility and speaking rate scores in children with and without speech disorders.

    PubMed

    Hodge, Megan M; Gotzke, Carrie L

    2014-01-01

    This study evaluated construct-related validity of the Test of Children's Speech (TOCS). Intelligibility scores obtained using open-set word identification tasks (orthographic transcription) for the TOCS word and sentence tests and rate scores for the TOCS sentence test (words per minute or WPM and intelligible words per minute or IWPM) were compared for a group of 15 adults (18-30 years of age) with normal speech production and three groups of children: 48 3-6 year-olds with typical speech development and neurological histories (TDS), 48 3-6 year-olds with a speech sound disorder of unknown origin and no identified neurological impairment (SSD-UNK), and 22 3-10 year-olds with dysarthria and cerebral palsy (DYS). As expected, mean intelligibility scores and rates increased with age in the TDS group. However, word test intelligibility, WPM, and IWPM scores for the 6 year-olds in the TDS group were significantly lower than those for the adults. The DYS group had significantly lower word and sentence test intelligibility and WPM and IWPM scores than the TDS and SSD-UNK groups. Compared to the TDS group, the SSD-UNK group also had significantly lower intelligibility scores for the word and sentence tests, and significantly lower IWPM, but not WPM, scores on the sentence test. The results support the construct-related validity of TOCS as a tool for obtaining intelligibility and rate scores that are sensitive to group differences in 3-6 year-old children, with and without speech sound disorders, and to 3+ year-old children with speech disorders, with and without dysarthria. Readers will describe the word and sentence intelligibility and speaking rate performance of children with typically developing speech at age levels of 3, 4, 5 and 6 years, as measured by the Test of Children's Speech, and how these compare with adult speakers and two groups of children with speech disorders. They will also recognize what measures on this test differentiate children with speech sound disorders.

  8. Joint Dictionary Learning-Based Non-Negative Matrix Factorization for Voice Conversion to Improve Speech Intelligibility After Oral Surgery.

    PubMed

    Fu, Szu-Wei; Li, Pei-Chun; Lai, Ying-Hui; Yang, Cheng-Chien; Hsieh, Li-Chun; Tsao, Yu

    2017-11-01

    Objective: This paper focuses on machine learning based voice conversion (VC) techniques for improving the speech intelligibility of surgical patients who have had parts of their articulators removed. Because of the removal of parts of the articulator, a patient's speech may be distorted and difficult to understand. To overcome this problem, VC methods can be applied to convert the distorted speech such that it is clear and more intelligible. To design an effective VC method, two key points must be considered: 1) the amount of training data may be limited (because speaking for a long time is usually difficult for postoperative patients); 2) rapid conversion is desirable (for better communication). Methods: We propose a novel joint dictionary learning based non-negative matrix factorization (JD-NMF) algorithm. Compared to conventional VC techniques, JD-NMF can perform VC efficiently and effectively with only a small amount of training data. Results: The experimental results demonstrate that the proposed JD-NMF method not only achieves notably higher short-time objective intelligibility (STOI) scores (a standardized objective intelligibility evaluation metric) than those obtained using the original unconverted speech but is also significantly more efficient and effective than a conventional exemplar-based NMF VC method. Conclusion: The proposed JD-NMF method may outperform the state-of-the-art exemplar-based NMF VC method in terms of STOI scores under the desired scenario. Significance: We confirmed the advantages of the proposed joint training criterion for the NMF-based VC. Moreover, we verified that the proposed JD-NMF can effectively improve the speech intelligibility scores of oral surgery patients.
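
    A much-simplified sketch of the dictionary-pair NMF idea behind this family of VC methods: stack time-aligned source (patient) and target (reference) magnitude spectrograms so that one set of activations H is shared, factorize, then convert new source speech by estimating activations against the source dictionary and reconstructing with the target dictionary. The multiplicative updates, rank, and random placeholder spectrograms are assumptions; JD-NMF's joint training criterion is more elaborate than this.

    ```python
    import numpy as np

    def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
        """Plain multiplicative-update NMF: V ~ W @ H, all non-negative."""
        rng = np.random.default_rng(seed)
        W = rng.random((V.shape[0], rank)) + eps
        H = rng.random((rank, V.shape[1])) + eps
        for _ in range(n_iter):
            H *= (W.T @ V) / (W.T @ W @ H + eps)
            W *= (V @ H.T) / (W @ H @ H.T + eps)
        return W, H

    def activations(V, W, n_iter=200, eps=1e-9, seed=1):
        """Estimate H for a fixed dictionary W (same update rule, W frozen)."""
        rng = np.random.default_rng(seed)
        H = rng.random((W.shape[1], V.shape[1])) + eps
        for _ in range(n_iter):
            H *= (W.T @ V) / (W.T @ W @ H + eps)
        return H

    # Placeholder spectrograms; real use: time-aligned training utterances.
    rng = np.random.default_rng(2)
    V_src, V_tgt = rng.random((129, 300)), rng.random((129, 300))
    V_new = rng.random((129, 80))

    V_joint = np.vstack([V_src, V_tgt])      # shared activations across pair
    W_joint, _ = nmf(V_joint, rank=32)
    W_src, W_tgt = np.split(W_joint, [V_src.shape[0]])

    V_converted = W_tgt @ activations(V_new, W_src)  # converted magnitudes
    ```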

  9. The Benefits of Bimodal Aiding on Extended Dimensions of Speech Perception: Intelligibility, Listening Effort, and Sound Quality

    PubMed Central

    Chalupper, Josef

    2017-01-01

    The benefits of combining a cochlear implant (CI) and a hearing aid (HA) in opposite ears on speech perception were examined in 15 adult unilateral CI recipients who regularly use a contralateral HA. A within-subjects design was carried out to assess speech intelligibility testing, listening effort ratings, and a sound quality questionnaire for the conditions CI alone, CIHA together, and HA alone when applicable. The primary outcome of bimodal benefit, defined as the difference between CIHA and CI, was statistically significant for speech intelligibility in quiet as well as for intelligibility in noise across tested spatial conditions. A reduction in effort on top of intelligibility at the highest tested signal-to-noise ratio was found. Moreover, the bimodal listening situation was rated to sound more voluminous, less tinny, and less unpleasant than CI alone. Listening effort and sound quality emerged as feasible and relevant measures to demonstrate bimodal benefit across a clinically representative range of bimodal users. These extended dimensions of speech perception can shed more light on the array of benefits provided by complementing a CI with a contralateral HA. PMID:28874096

  10. APEX/SPIN: a free test platform to measure speech intelligibility.

    PubMed

    Francart, Tom; Hofmann, Michael; Vanthornhout, Jonas; Van Deun, Lieselot; van Wieringen, Astrid; Wouters, Jan

    2017-02-01

    Measuring speech intelligibility in quiet and noise is important in clinical practice and research. An easy-to-use free software platform for conducting speech tests is presented, called APEX/SPIN. The APEX/SPIN platform allows the use of any speech material in combination with any noise. A graphical user interface provides control over a large range of parameters, such as number of loudspeakers, signal-to-noise ratio and parameters of the procedure. An easy-to-use graphical interface is provided for calibration and storage of calibration values. To validate the platform, perception of words in quiet and sentences in noise were measured both with APEX/SPIN and with an audiometer and CD player, which is a conventional setup in current clinical practice. Five normal-hearing listeners participated in the experimental evaluation. Speech perception results were similar for the APEX/SPIN platform and conventional procedures. APEX/SPIN is a freely available and open source platform that allows the administration of all kinds of custom speech perception tests and procedures.

  11. Predictions of Speech Chimaera Intelligibility Using Auditory Nerve Mean-Rate and Spike-Timing Neural Cues.

    PubMed

    Wirtzfeld, Michael R; Ibrahim, Rasha A; Bruce, Ian C

    2017-10-01

    Perceptual studies of speech intelligibility have shown that slow variations of acoustic envelope (ENV) in a small set of frequency bands provides adequate information for good perceptual performance in quiet, whereas acoustic temporal fine-structure (TFS) cues play a supporting role in background noise. However, the implications for neural coding are prone to misinterpretation because the mean-rate neural representation can contain recovered ENV cues from cochlear filtering of TFS. We investigated ENV recovery and spike-time TFS coding using objective measures of simulated mean-rate and spike-timing neural representations of chimaeric speech, in which either the ENV or the TFS is replaced by another signal. We (a) evaluated the levels of mean-rate and spike-timing neural information for two categories of chimaeric speech, one retaining ENV cues and the other TFS; (b) examined the level of recovered ENV from cochlear filtering of TFS speech; (c) examined and quantified the contribution to recovered ENV from spike-timing cues using a lateral inhibition network (LIN); and (d) constructed linear regression models with objective measures of mean-rate and spike-timing neural cues and subjective phoneme perception scores from normal-hearing listeners. The mean-rate neural cues from the original ENV and recovered ENV partially accounted for perceptual score variability, with additional variability explained by the recovered ENV from the LIN-processed TFS speech. The best model predictions of chimaeric speech intelligibility were found when both the mean-rate and spike-timing neural cues were included, providing further evidence that spike-time coding of TFS cues is important for intelligibility when the speech envelope is degraded.
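
    For readers unfamiliar with the construction, a single analysis band of an ENV/TFS chimaera can be sketched with a Hilbert decomposition: the envelope of one band-limited signal is imposed on the fine structure (the cosine of the instantaneous phase) of another. A full chimaera repeats this over a filterbank and sums the bands; the band edges below are arbitrary.

    ```python
    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def bandpass(x, fs, lo, hi, order=4):
        b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        return filtfilt(b, a, x)

    def chimaera_band(x_env, x_tfs, fs, lo, hi):
        """ENV of one signal on the TFS of another, within one band."""
        a = hilbert(bandpass(x_env, fs, lo, hi))
        b = hilbert(bandpass(x_tfs, fs, lo, hi))
        return np.abs(a) * np.cos(np.angle(b))

    # Demo with noise; real use: two speech signals (or speech and noise).
    fs = 16000
    rng = np.random.default_rng(0)
    y = chimaera_band(rng.standard_normal(fs), rng.standard_normal(fs),
                      fs, 500.0, 1000.0)
    ```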

  12. Effects of Loud and Amplified Speech on Sentence and Word Intelligibility in Parkinson Disease

    ERIC Educational Resources Information Center

    Neel, Amy T.

    2009-01-01

    Purpose: In the two experiments in this study, the author examined the effects of increased vocal effort (loud speech) and amplification on sentence and word intelligibility in speakers with Parkinson disease (PD). Methods: Five talkers with PD produced sentences and words at habitual levels of effort and using loud speech techniques. Amplified…

  13. Effect of the Number of Presentations on Listener Transcriptions and Reliability in the Assessment of Speech Intelligibility in Children

    ERIC Educational Resources Information Center

    Lagerberg, Tove B.; Johnels, Jakob Åsberg; Hartelius, Lena; Persson, Christina

    2015-01-01

    Background: The assessment of intelligibility is an essential part of establishing the severity of a speech disorder. The intelligibility of a speaker is affected by a number of different variables relating, "inter alia," to the speech material, the listener and the listener task. Aims: To explore the impact of the number of…

  14. Is the Speech Transmission Index (STI) a robust measure of sound system speech intelligibility performance?

    NASA Astrophysics Data System (ADS)

    Mapp, Peter

    2002-11-01

    Although RaSTI is a good indicator of the speech intelligibility capability of auditoria and similar spaces, during the past 2-3 years it has been shown that RaSTI is not a robust predictor of sound system intelligibility performance. Instead, it is now recommended, within both national and international codes and standards, that full STI measurement and analysis be employed. However, new research is reported that indicates that STI is not as flawless or robust as many believe. The paper highlights a number of potential error mechanisms. It is shown that the measurement technique and signal excitation stimulus can have a significant effect on the overall result and accuracy, particularly where DSP-based equipment is employed. It is also shown that in its current state of development, STI is not capable of appropriately accounting for a number of fundamental speech and system attributes, including typical sound system frequency response variations and anomalies. This is particularly shown to be the case when a system is operating under reverberant conditions. Comparisons between actual system measurements and corresponding word score data are reported in which errors of up to 50% are found; the implications for VA and PA system performance verification will be discussed.

  15. Prosodic Stress, Information, and Intelligibility of Speech in Noise

    DTIC Science & Technology

    2009-02-28

    across periods during which acoustic information has been suppressed. Subject terms: robust speech intelligibility; computational model of…Research Fellow at the Department of Computer Science at the University of Southern California). This research involved superimposing acoustic and…presented at an invitation-only session of the Acoustical Society of America's and European Acoustics Association's joint meeting in 2008. In summary, the

  16. Effects of Lexical and Somatosensory Feedback on Long-Term Improvements in Intelligibility of Dysarthric Speech

    ERIC Educational Resources Information Center

    Borrie, Stephanie A.; Schäfer, Martina C. M.

    2017-01-01

    Purpose: Intelligibility improvements immediately following perceptual training with dysarthric speech using lexical feedback are comparable to those observed when training uses somatosensory feedback (Borrie & Schäfer, 2015). In this study, we investigated if these lexical and somatosensory guided improvements in listener intelligibility of…

  17. Speech communications in noise

    NASA Technical Reports Server (NTRS)

    1984-01-01

    The physical characteristics of speech, the methods of speech masking measurement, and the effects of noise on speech communication are investigated. Topics include the speech signal and intelligibility, the effects of noise on intelligibility, the articulation index, and various devices for evaluating speech systems.
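
    At its core, the articulation index reduces to a weighted sum of per-band audible signal-to-noise ratios, each clipped to a 30-dB range. A simplified sketch follows; the band count and importance weights are illustrative, not the standardized ANSI tables.

    ```python
    import numpy as np

    def articulation_index(snr_db, band_weights):
        """Simplified AI: clip each band's SNR to [0, 30] dB, scale to
        [0, 1], and combine with normalized importance weights."""
        snr = np.clip(np.asarray(snr_db, dtype=float), 0.0, 30.0) / 30.0
        w = np.asarray(band_weights, dtype=float)
        return float(np.sum((w / w.sum()) * snr))

    # Five octave bands with illustrative weights.
    print(articulation_index([12, 18, 9, 3, 0], [0.10, 0.25, 0.30, 0.25, 0.10]))
    ```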

  18. The effect of compression and attention allocation on speech intelligibility. II

    NASA Astrophysics Data System (ADS)

    Choi, Sangsook; Carrell, Thomas

    2004-05-01

    Previous investigations of the effects of amplitude compression on measures of speech intelligibility have shown inconsistent results. Recently, a novel paradigm was used to investigate the possibility of more consistent findings with a measure of speech perception that is not based entirely on intelligibility (Choi and Carrell, 2003). That study exploited a dual-task paradigm using a pursuit rotor online visual-motor tracking task (Dlhopolsky, 2000) along with a word repetition task. Intensity-compressed words caused reduced performance on the tracking task as compared to uncompressed words when subjects engaged in a simultaneous word repetition task. This suggested an increased cognitive load when listeners processed compressed words. A stronger result might be obtained if a single resource (linguistic) is required rather than two (linguistic and visual-motor) resources. In the present experiment a visual lexical decision task and an auditory word repetition task were used. The visual stimuli for the lexical decision task were blurred and presented in a noise background. The compressed and uncompressed words for repetition were placed in speech-shaped noise. Participants with normal hearing and vision conducted word repetition and lexical decision tasks both independently and simultaneously. The pattern of results is discussed and compared to the previous study.

  19. The Role of Music in Speech Intelligibility of Learners with Post Lingual Hearing Impairment in Selected Units in Lusaka District

    ERIC Educational Resources Information Center

    Katongo, Emily Mwamba; Ndhlovu, Daniel

    2015-01-01

    This study sought to establish the role of music in speech intelligibility of learners with Post Lingual Hearing Impairment (PLHI) and strategies teachers used to enhance speech intelligibility in learners with PLHI in selected special units for the deaf in Lusaka district. The study used a descriptive research design. Qualitative and quantitative…

  20. Perceptual Measures of Speech from Individuals with Parkinson's Disease and Multiple Sclerosis: Intelligibility and beyond

    ERIC Educational Resources Information Center

    Sussman, Joan E.; Tjaden, Kris

    2012-01-01

    Purpose: The primary purpose of this study was to compare percent correct word and sentence intelligibility scores for individuals with multiple sclerosis (MS) and Parkinson's disease (PD) with scaled estimates of speech severity obtained for a reading passage. Method: Speech samples for 78 talkers were judged, including 30 speakers with MS, 16…

  1. Talker Differences in Clear and Conversational Speech: Vowel Intelligibility for Older Adults with Hearing Loss

    ERIC Educational Resources Information Center

    Ferguson, Sarah Hargus

    2012-01-01

    Purpose: To establish the range of talker variability for vowel intelligibility in clear versus conversational speech for older adults with hearing loss and to determine whether talkers who produced a clear speech benefit for young listeners with normal hearing also did so for older adults with hearing loss. Method: Clear and conversational vowels…

  2. Analysis of masking effects on speech intelligibility with respect to moving sound stimulus

    NASA Astrophysics Data System (ADS)

    Chen, Chiung Yao

    2004-05-01

    The purpose of this study was to compare the degree to which speech is disturbed by a stationary noise source and by an apparently moving noise source (AMN). In studies of sound localization, we found that source-directional sensitivity (SDS) is closely associated with the magnitude of the interaural cross-correlation (IACC). Ando et al. [Y. Ando, S. H. Kang, and H. Nagamatsu, J. Acoust. Soc. Jpn. (E) 8, 183-190 (1987)] reported that the correlation between the left and right inferior colliculus along the auditory pathway in the brain is in harmony with the correlation function of the amplitudes input to the two ear-canal entrances. We assumed that the degree of disturbance under an apparently moving noise source probably differs from that under a source fixed in front of the listener at a constant distance in a free field (no reflections). We then found a difference in the influence on speech intelligibility between a moving and a fixed source generated by 1/3-octave narrow-band noise with a center frequency of 2 kHz. However, the effects of the moving speed on the masking of speech intelligibility remained uncertain.
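
    Because the abstract hinges on the IACC, a minimal Python sketch of the usual definition may help: the maximum of the normalized interaural cross-correlation over lags within ±1 ms. This is a generic textbook computation, not the authors' code.

```python
import numpy as np

def iacc(left, right, fs, max_lag_ms=1.0):
    """Interaural cross-correlation coefficient: the maximum of the
    normalized cross-correlation of the two ear signals over lags of
    +/-1 ms, as used in Ando-style room-acoustics work."""
    max_lag = int(fs * max_lag_ms / 1000.0)
    norm = np.sqrt(np.sum(left**2) * np.sum(right**2))
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        # Overlap the two signals at the given lag and correlate.
        if lag >= 0:
            c = np.sum(left[: len(left) - lag] * right[lag:])
        else:
            c = np.sum(left[-lag:] * right[: len(right) + lag])
        best = max(best, abs(c) / norm)
    return best

# Example: identical signals at both ears give IACC = 1.
fs = 48000
sig = np.random.randn(fs)
print(iacc(sig, sig, fs))
```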

  3. Ongoing slow oscillatory phase modulates speech intelligibility in cooperation with motor cortical activity.

    PubMed

    Onojima, Takayuki; Kitajo, Keiichi; Mizuhara, Hiroaki

    2017-01-01

    Neural oscillation is attracting attention as an underlying mechanism for speech recognition. Speech intelligibility is enhanced by the synchronization of speech rhythms and slow neural oscillation, as typically observed in human scalp electroencephalography (EEG). In addition to the effect of neural oscillation, it has been proposed that speech recognition is enhanced by the identification of a speaker's motor signals, which are used for speech production. To verify the relationship between the effect of neural oscillation and motor cortical activity, we measured scalp EEG, and simultaneous EEG and functional magnetic resonance imaging (fMRI), during a speech recognition task in which participants were required to recognize spoken words embedded in noise. We proposed an index to quantitatively evaluate the EEG phase effect on behavioral performance. The results showed that the delta and theta EEG phase before speech inputs modulated participants' response times in the speech recognition task. The simultaneous EEG-fMRI experiment showed that slow EEG activity was correlated with motor cortical activity. These results suggested that the effect of the slow oscillatory phase was associated with the activity of the motor cortex during speech recognition.
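
    The pre-stimulus phase measure described here is conventionally obtained by band-limiting the EEG and taking the analytic-signal phase. The sketch below is a generic scipy-based version of that step, assuming a theta band of 4-8 Hz and a zero-phase filter; it is not the authors' pipeline.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def prestimulus_phase(eeg, fs, band=(4.0, 8.0), onset_sample=None):
    """Instantaneous phase of band-limited EEG at a stimulus onset,
    the kind of quantity the study relates to response times."""
    # Zero-phase band-pass filter so the phase estimate is not delayed.
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    narrow = filtfilt(b, a, eeg)
    # Analytic signal -> instantaneous phase in radians.
    phase = np.angle(hilbert(narrow))
    return phase if onset_sample is None else phase[onset_sample]

# Example on synthetic data: a 6 Hz rhythm buried in noise.
fs = 250
t = np.arange(0, 10, 1 / fs)
eeg = np.sin(2 * np.pi * 6 * t) + 0.5 * np.random.randn(t.size)
print(prestimulus_phase(eeg, fs, onset_sample=fs * 5))  # phase at t = 5 s
```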

  4. Speech Intelligibility and Marital Communication in Amyotrophic Lateral Sclerosis: An Exploratory Study

    ERIC Educational Resources Information Center

    Joubert, Karin; Bornman, Juan; Alant, Erna

    2011-01-01

    Amyotrophic lateral sclerosis (ALS), a rapidly progressive neuromuscular disease, has a devastating impact not only on individuals diagnosed with ALS but also their spouses. Speech intelligibility, often compromised as a result of dysarthria, affects the couple's ability to maintain effective, intimate communication. The purpose of this…

  5. A physiologically-inspired model reproducing the speech intelligibility benefit in cochlear implant listeners with residual acoustic hearing.

    PubMed

    Zamaninezhad, Ladan; Hohmann, Volker; Büchner, Andreas; Schädler, Marc René; Jürgens, Tim

    2017-02-01

    This study introduces a speech intelligibility model for cochlear implant users with ipsilateral preserved acoustic hearing that aims at simulating the observed speech-in-noise intelligibility benefit when receiving simultaneous electric and acoustic stimulation (EA-benefit). The model simulates the auditory nerve spiking in response to electric and/or acoustic stimulation. The temporally and spatially integrated spiking patterns were used as the final internal representation of noisy speech. Speech reception thresholds (SRTs) in stationary noise were predicted for a sentence test using an automatic speech recognition framework. The model was employed to systematically investigate the effect of three physiologically relevant model factors on simulated SRTs: (1) the spatial spread of the electric field, which co-varies with the number of electrically stimulated auditory nerves; (2) the "internal" noise, simulating deprivation of the auditory system; and (3) the upper-bound frequency limit of acoustic hearing. The model results show that the simulated SRTs increase monotonically with increasing spatial spread for fixed internal noise, and also increase with increasing internal noise strength for a fixed spatial spread. The predicted EA-benefit does not follow such a systematic trend and depends on the specific combination of the model parameters. Beyond 300 Hz, the upper-bound frequency limit of preserved acoustic hearing is less influential on the speech intelligibility of EA-listeners in stationary noise. The model-predicted EA-benefits are within the range of EA-benefits shown by 18 out of 21 actual cochlear implant listeners with preserved acoustic hearing. Copyright © 2016 Elsevier B.V. All rights reserved.

  6. Intelligibility of emotional speech in younger and older adults.

    PubMed

    Dupuis, Kate; Pichora-Fuller, M Kathleen

    2014-01-01

    Little is known about the influence of vocal emotions on speech understanding. Word recognition accuracy for stimuli spoken to portray seven emotions (anger, disgust, fear, sadness, neutral, happiness, and pleasant surprise) was tested in younger and older listeners. Emotions were presented in either mixed (heterogeneous emotions mixed in a list) or blocked (homogeneous emotion blocked in a list) conditions. Three main hypotheses were tested. First, vocal emotion affects word recognition accuracy; specifically, portrayals of fear enhance word recognition accuracy because listeners orient to threatening information and/or distinctive acoustical cues such as high pitch mean and variation. Second, older listeners recognize words less accurately than younger listeners, but the effects of different emotions on intelligibility are similar across age groups. Third, blocking emotions in a list results in better word recognition accuracy, especially for older listeners, and reduces the effect of emotion on intelligibility because, as listeners develop expectations about vocal emotion, the allocation of processing resources can shift from emotional to lexical processing. Emotion was the within-subjects variable: all participants heard speech stimuli consisting of a carrier phrase followed by a target word spoken by either a younger or an older talker, with an equal number of stimuli portraying each of seven vocal emotions. The speech was presented in multi-talker babble at signal-to-noise ratios adjusted for each talker and each listener age group. Listener age (younger, older), condition (mixed, blocked), and talker (younger, older) were the main between-subjects variables. Fifty-six students (Mage = 18.3 years) were recruited from an undergraduate psychology course; 56 older adults (Mage = 72.3 years) were recruited from a volunteer pool. All participants had clinically normal pure-tone audiometric thresholds at frequencies ≤3000 Hz. There were significant main effects of

  7. Sentence intelligibility during segmental interruption and masking by speech-modulated noise: Effects of age and hearing loss

    PubMed Central

    Fogerty, Daniel; Ahlstrom, Jayne B.; Bologna, William J.; Dubno, Judy R.

    2015-01-01

    This study investigated how single-talker modulated noise impacts consonant and vowel cues to sentence intelligibility. Younger normal-hearing, older normal-hearing, and older hearing-impaired listeners completed speech recognition tests. All listeners received spectrally shaped speech matched to their individual audiometric thresholds to ensure sufficient audibility, with the exception of a second younger listener group who received spectral shaping that matched the mean audiogram of the hearing-impaired listeners. Results demonstrated minimal declines in intelligibility for older listeners with normal hearing and more evident declines for older hearing-impaired listeners, possibly related to impaired temporal processing. A correlational analysis suggests a common underlying ability to process information during vowels that is predictive of speech-in-modulated-noise abilities, whereas the ability to use consonant cues appears specific to the particular characteristics of the noise and interruption. Performance declines for older listeners were mostly confined to consonant conditions. Spectral shaping accounted for the primary contributions of audibility. However, comparison with the young spectral controls who received identical spectral shaping suggests that this procedure may reduce wideband temporal modulation cues due to frequency-specific amplification that affected high-frequency consonants more than low-frequency vowels. These spectral changes may impact speech intelligibility in certain modulation masking conditions. PMID:26093436

  8. Environment-specific noise suppression for improved speech intelligibility by cochlear implant users.

    PubMed

    Hu, Yi; Loizou, Philipos C

    2010-06-01

    Attempts to develop noise-suppression algorithms that can significantly improve speech intelligibility in noise by cochlear implant (CI) users have met with limited success. This is partly because algorithms were sought that would work equally well in all listening situations. Accomplishing this has been quite challenging given the variability in the temporal/spectral characteristics of real-world maskers. A different approach, focused on the development of environment-specific noise-suppression algorithms, is taken in the present study. The proposed algorithm selects a subset of the envelope amplitudes for stimulation based on the signal-to-noise ratio (SNR) of each channel. Binary classifiers, trained using data collected from a particular noisy environment, are first used to classify the mixture envelopes of each channel as either target-dominated (SNR ≥ 0 dB) or masker-dominated (SNR < 0 dB). Only target-dominated channels are subsequently selected for stimulation. Results with CI listeners indicated substantial improvements (by nearly 44 percentage points at 5 dB SNR) in intelligibility with the proposed algorithm when tested with sentences embedded in three real-world maskers. The present study demonstrated that the environment-specific approach to noise reduction has the potential to restore speech intelligibility in noise to a level near to that attained in quiet.
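
    The selection rule the abstract describes reduces, in its ideal form, to keeping only channels whose SNR meets the criterion. The Python sketch below substitutes the true per-channel SNR for the study's trained binary classifiers, so it implements the oracle version of the strategy; the 22-channel map and the simple additive mixture model are assumptions.

```python
import numpy as np

def select_channels(target_env, masker_env, snr_criterion_db=0.0):
    """Oracle channel selection: keep only envelope channels whose SNR
    meets the criterion; in the study this decision is made by trained
    classifiers rather than the true SNR used here."""
    # target_env, masker_env: (n_channels, n_frames) envelope magnitudes.
    eps = 1e-12
    snr_db = 10.0 * np.log10((target_env**2 + eps) / (masker_env**2 + eps))
    keep = snr_db >= snr_criterion_db           # target-dominated channels
    mixture_env = target_env + masker_env       # simplistic mixture model
    return np.where(keep, mixture_env, 0.0)     # zero masker-dominated ones

rng = np.random.default_rng(0)
target = rng.random((22, 100))   # 22 channels, a common CI map size
masker = rng.random((22, 100))
stim = select_channels(target, masker)
print(f"{(stim > 0).mean():.0%} of channel/frame slots selected")
```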

  9. Cued Speech Transliteration: Effects of Accuracy and Lag Time on Message Intelligibility

    ERIC Educational Resources Information Center

    Krause, Jean C.; Lopez, Katherine A.

    2017-01-01

    This paper is the second in a series concerned with the level of access afforded to students who use educational interpreters. The first paper (Krause & Tessler, 2016) focused on factors affecting accuracy of messages produced by Cued Speech (CS) transliterators (expression). In this study, factors affecting intelligibility (reception by deaf…

  10. The Effect of Uni- and Bilateral Thalamic Deep Brain Stimulation on Speech in Patients With Essential Tremor: Acoustics and Intelligibility.

    PubMed

    Becker, Johannes; Barbe, Michael T; Hartinger, Mariam; Dembek, Till A; Pochmann, Jil; Wirths, Jochen; Allert, Niels; Mücke, Doris; Hermes, Anne; Meister, Ingo G; Visser-Vandewalle, Veerle; Grice, Martine; Timmermann, Lars

    2017-04-01

    Deep brain stimulation (DBS) of the ventral intermediate nucleus (VIM) is performed to suppress medically resistant essential tremor (ET). However, stimulation-induced dysarthria (SID) is a common side effect, limiting the extent to which tremor can be suppressed. To date, the exact pathogenesis of SID in VIM-DBS treated ET patients is unknown. We investigate the effect of inactivated, uni-, and bilateral VIM-DBS on speech production in patients with ET. We employ acoustic measures, tempo and intelligibility ratings, and patients' self-estimates of their speech to quantify SID, with a focus on comparing bilateral to unilateral stimulation effects and the effect of electrode position on speech. Sixteen German ET patients participated in this study. Each patient was acoustically recorded with DBS-off, unilateral-right-hemispheric-DBS-on, unilateral-left-hemispheric-DBS-on, and bilateral-DBS-on during an oral diadochokinesis (DDK) task and the reading of a standard German text. To capture the extent of speech impairment, we measured syllable duration and intensity ratio during the DDK task. Naïve listeners rated speech tempo and speech intelligibility of the read text on a 5-point scale. Patients rated their own "ability to speak". We found an effect of bilateral compared to unilateral and inactivated stimulation on syllable durations and intensity ratio, as well as on external intelligibility ratings and patients' VAS scores. Additionally, VAS scores are associated with more laterally located active contacts. For speech ratings, we found an effect of syllable duration such that tempo and intelligibility were rated worse for speakers exhibiting greater syllable durations. Our data confirm that SID is more pronounced under bilateral compared to unilateral stimulation. Laterally located electrodes are associated with more severe SID according to patients' self-ratings. We can confirm the relation between diadochokinetic rate and SID in that listeners' tempo and intelligibility ratings can be

  11. Impact of tongue reduction on overall speech intelligibility, articulation and oromyofunctional behavior in 4 children with Beckwith-Wiedemann syndrome.

    PubMed

    Van Lierde, K; Galiwango, G; Hodges, A; Bettens, K; Luyten, A; Vermeersch, H

    2012-01-01

    The purpose of this study was to determine the impact of partial glossectomy (using the keyhole technique) on speech intelligibility, articulation, resonance, and oromyofunctional behavior. A partial glossectomy was performed in 4 children with Beckwith-Wiedemann syndrome between the ages of 0.5 and 3.1 years. An ENT assessment, a phonetic inventory, a phonemic and phonological analysis, and a consensus perceptual evaluation of speech intelligibility, resonance, and oromyofunctional behavior were performed. It was not possible in this study to separate the effects of the surgery from the typical developmental progress of speech sound mastery. Improved speech intelligibility, a more complete phonetic inventory, an increase in phonological skills, normal resonance, and increased motor-oriented oral behavior were found in the postsurgical condition. Phonetic distortions, lip incompetence, and an interdental tongue position nevertheless persisted in the postsurgical condition. Speech therapy should be focused on correct phonetic placement and a motor-oriented approach to increase lip competence, and on functional tongue exercises and tongue lifting during the production of alveolars. Detailed analyses in a larger number of subjects with and without Beckwith-Wiedemann syndrome may help further illustrate the long-term impact of partial glossectomy. Copyright © 2011 S. Karger AG, Basel.

  12. Stimulation of the pedunculopontine nucleus area in Parkinson's disease: effects on speech and intelligibility.

    PubMed

    Pinto, Serge; Ferraye, Murielle; Espesser, Robert; Fraix, Valérie; Maillet, Audrey; Guirchoum, Jennifer; Layani-Zemour, Deborah; Ghio, Alain; Chabardès, Stéphan; Pollak, Pierre; Debû, Bettina

    2014-10-01

    Improvement of gait disorders following pedunculopontine nucleus area stimulation in patients with Parkinson's disease has previously been reported and led us to propose this surgical treatment to patients who progressively developed severe gait disorders and freezing despite optimal dopaminergic drug treatment and subthalamic nucleus stimulation. The outcome of our prospective study on the first six patients was mixed, as freezing of gait and falls related to freezing were improved by low-frequency electrical stimulation of the pedunculopontine nucleus area in some, but not all, patients. Here, we report the speech data prospectively collected in these patients with Parkinson's disease. Indeed, because subthalamic nucleus surgery may lead to speech impairment and a worsening of dysarthria in some patients with Parkinson's disease, we felt it was important to examine precisely any possible modulation of speech by a novel target for deep brain stimulation. Our results suggested a trend towards speech degradation related to the pedunculopontine nucleus area surgery (off stimulation) for aero-phonatory control (maximum phonation time), phono-articulatory coordination (oral diadochokinesis), and speech intelligibility. Possibly, the observed speech degradation may also be linked to the clinical characteristics of the group of patients. The influence of pedunculopontine nucleus area stimulation per se was more complex, depending on the nature of the task: it had a deleterious effect on maximum phonation time and oral diadochokinesis, and mixed effects on speech intelligibility. Whereas levodopa intake alone had no effect on speech dimensions and subthalamic nucleus stimulation alone had a positive effect, a negative interaction between the two treatments was observed both before and after pedunculopontine nucleus area surgery. This combination effect did not seem to be modulated by pedunculopontine nucleus area stimulation. Although limited in our group of

  13. Digitised evaluation of speech intelligibility using vowels in maxillectomy patients.

    PubMed

    Sumita, Y I; Hattori, M; Murase, M; Elbashti, M E; Taniguchi, H

    2018-03-01

    Among the functional disabilities that patients face following maxillectomy, speech impairment is a major factor influencing quality of life. Proper rehabilitation of speech, which may include prosthodontic and surgical treatments and speech therapy, requires accurate evaluation of speech intelligibility (SI). A simple, less time-consuming yet accurate evaluation is desirable both for maxillectomy patients and the various clinicians providing maxillofacial treatment. This study sought to determine the utility of digital acoustic analysis of vowels for the prediction of SI in maxillectomy patients, based on a comprehensive understanding of speech production in the vocal tract of maxillectomy patients and its perception. Speech samples were collected from 33 male maxillectomy patients (mean age 57.4 years) in two conditions, without and with a maxillofacial prosthesis, and formant data for the vowels /a/, /e/, /i/, /o/, and /u/ were calculated based on linear predictive coding. The frequency range of formant 2 (F2) was determined as the difference between the minimum and maximum F2 frequencies. An SI test was also conducted to reveal the relationship between SI score and F2 range. Statistical analyses were applied. F2 range and SI score were significantly different between the two conditions without and with a prosthesis (both P < .0001). F2 range was significantly correlated with SI score in both conditions (Spearman's r = .843, P < .0001; r = .832, P < .0001, respectively). These findings indicate that calculating the F2 range from the five vowels has clinical utility for the prediction of SI after maxillectomy. © 2017 John Wiley & Sons Ltd.
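
    The analysis outlined here, LPC-based formant extraction followed by the F2 range across the five vowels, can be sketched in a few lines of Python. This is a textbook autocorrelation-LPC implementation with placeholder settings (order 12, simple pre-emphasis), not the authors' software, and it assumes each vowel is supplied as a short stationary frame.

```python
import numpy as np

def formants_lpc(frame, fs, order=12):
    """Rough formant estimates from LPC roots (textbook sketch)."""
    frame = frame * np.hamming(len(frame))
    frame = np.append(frame[0], frame[1:] - 0.97 * frame[:-1])  # pre-emphasis
    # Autocorrelation method: solve the normal equations for LPC coeffs.
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:][: order + 1]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.concatenate(([1.0], -np.linalg.solve(R, r[1: order + 1])))
    # Formant candidates are the angles of the upper-half-plane roots.
    roots = [z for z in np.roots(a) if z.imag > 0]
    freqs = sorted(np.angle(z) * fs / (2 * np.pi) for z in roots)
    return [f for f in freqs if f > 90.0]  # drop near-DC roots

def f2_range(vowel_frames, fs):
    """F2 range across a vowel set, the quantity correlated with SI.
    Assumes at least two formants are found for every vowel frame."""
    f2s = [formants_lpc(v, fs)[1] for v in vowel_frames]
    return max(f2s) - min(f2s)

# Smoke test on noise: meaningless peaks, but the pipeline runs.
print(formants_lpc(np.random.randn(480), 16000))
```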

  14. Comparisons of Auditory Performance and Speech Intelligibility after Cochlear Implant Reimplantation in Mandarin-Speaking Users

    PubMed Central

    Hwang, Chung-Feng; Ko, Hui-Chen; Tsou, Yung-Ting; Chan, Kai-Chieh; Fang, Hsuan-Yeh; Wu, Che-Ming

    2016-01-01

    Objectives. We evaluated the causes, hearing, and speech performance before and after cochlear implant reimplantation in Mandarin-speaking users. Methods. In total, 589 patients who underwent cochlear implantation in our medical center between 1999 and 2014 were reviewed retrospectively. Data related to demographics, etiologies, implant-related information, complications, and hearing and speech performance were collected. Results. In total, 22 (3.74%) cases were found to have major complications. Infection (n = 12) and hard failure of the device (n = 8) were the most common major complications. Among them, 13 were reimplanted in our hospital. The mean scores of the Categorical Auditory Performance (CAP) and the Speech Intelligibility Rating (SIR) obtained before and after reimplantation were 5.5 versus 5.8 and 3.7 versus 4.3, respectively. The SIR score after reimplantation was significantly better than before the operation. Conclusions. Cochlear implantation is a safe procedure with low rates of postsurgical revisions and device failures. The Mandarin-speaking patients in this study who received reimplantation had restored auditory performance and speech intelligibility after surgery. Device soft failure was rare in our series; attention should nevertheless be paid to Mandarin-speaking CI users who require revision of their implants due to undesirable symptoms or decreasing performance of uncertain cause. PMID:27413753

  15. Inferior frontal sensitivity to common speech sounds is amplified by increasing word intelligibility.

    PubMed

    Vaden, Kenneth I; Kuchinsky, Stefanie E; Keren, Noam I; Harris, Kelly C; Ahlstrom, Jayne B; Dubno, Judy R; Eckert, Mark A

    2011-11-01

    The left inferior frontal gyrus (LIFG) exhibits increased responsiveness when people listen to words composed of speech sounds that frequently co-occur in the English language (Vaden, Piquado, & Hickok, 2011), termed high phonotactic frequency (Vitevitch & Luce, 1998). The current experiment aimed to further characterize the relation of phonotactic frequency to LIFG activity by manipulating word intelligibility in participants of varying age. Thirty-six native English speakers, 19-79 years old (mean = 50.5, SD = 21.0), indicated with a button press whether they recognized 120 binaurally presented consonant-vowel-consonant words during a sparse sampling fMRI experiment (TR = 8 s). Word intelligibility was manipulated by low-pass filtering (cutoff frequencies of 400 Hz, 1000 Hz, 1600 Hz, and 3150 Hz). Group analyses revealed a significant positive correlation between phonotactic frequency and LIFG activity, which was unaffected by age and hearing thresholds. A region of interest analysis revealed that the relation between phonotactic frequency and LIFG activity was significantly strengthened for the most intelligible words (low-pass cutoff at 3150 Hz). These results suggest that the responsiveness of the left inferior frontal cortex to phonotactic frequency reflects the downstream impact of word recognition rather than support of word recognition, at least when there are no speech production demands. Published by Elsevier Ltd.

  16. Integrating the acoustics of running speech into the pure tone audiogram: a step from audibility to intelligibility and disability.

    PubMed

    Corthals, Paul

    2008-01-01

    The aim of the present study is to construct a simple method for visualizing and quantifying the audibility of speech on the audiogram and to predict speech intelligibility. The proposed method involves a series of indices on the audiogram form reflecting the sound pressure level distribution of running speech. The indices that coincide with a patient's pure tone thresholds reflect speech audibility and give evidence of residual functional hearing capacity. Two validation studies were conducted among sensorineurally hearing-impaired participants (n = 56 and n = 37, respectively) to investigate the relation with speech recognition ability and hearing disability. The potential of the new audibility indices as predictors for speech reception thresholds is comparable to the predictive potential of the ANSI 1968 articulation index and the ANSI 1997 speech intelligibility index. The sum of indices or a weighted combination can explain considerable proportions of variance in speech reception results for sentences in quiet free-field conditions. The proportions of variance that can be explained in questionnaire results on hearing disability are less, presumably because the threshold indices almost exclusively reflect message audibility and much less the psychosocial consequences of hearing deficits. The outcomes underpin the validity of the new audibility indexing system, even though the proposed method may be better suited for predicting relative performance across a set of conditions than for predicting absolute speech recognition performance. © 2007 S. Karger AG, Basel

  17. Contribution of Binaural Masking Release to Improved Speech Intelligibility for different Masker types.

    PubMed

    Sutojo, Sarinah; van de Par, Steven; Schoenmaker, Esther

    2018-06-01

    In situations with competing talkers or in the presence of masking noise, speech intelligibility can be improved by spatially separating the target speaker from the interferers. This advantage is generally referred to as spatial release from masking (SRM), and different mechanisms have been suggested to explain it. One proposed mechanism to benefit from spatial cues is the binaural masking release, which is purely stimulus driven. According to this mechanism, the spatial benefit results from differences in the binaural cues of target and masker, which need to appear simultaneously in time and frequency to improve signal detection. In an alternative proposed mechanism, the differences in the interaural cues improve the segregation of auditory streams, a process that involves top-down processing rather than being purely stimulus driven. Other than the cues that produce binaural masking release, the interaural cue differences between target and interferer required to improve stream segregation do not have to appear simultaneously in time and frequency. This study is concerned with the contribution of binaural masking release to SRM for three masker types that differ with respect to the amount of energetic masking they exert. Speech intelligibility was measured, employing a stimulus manipulation that inhibits binaural masking release, and analyzed with a metric to account for the number of better-ear glimpses. Results indicate that the contribution of the stimulus-driven binaural masking release plays a minor role, while binaural stream segregation and the availability of glimpses at the better ear had a stronger influence on improving speech intelligibility. This article is protected by copyright. All rights reserved.
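
    The better-ear glimpsing metric mentioned in the abstract can be illustrated as follows: compute the time-frequency SNR at each ear, take the ear-wise maximum, and count the units exceeding a local criterion. The -8 dB criterion and the spectrogram shapes below are assumptions borrowed from the glimpsing literature, not the study's exact metric.

```python
import numpy as np

def better_ear_glimpses(tl, ml, tr, mr, criterion_db=-8.0):
    """Proportion of time-frequency units 'glimpsed' at the better ear.

    tl/ml and tr/mr: target and masker spectrogram magnitudes for the
    left and right ears, each with shape (n_freq, n_time)."""
    eps = 1e-12
    snr_left = 20.0 * np.log10((tl + eps) / (ml + eps))
    snr_right = 20.0 * np.log10((tr + eps) / (mr + eps))
    # The 'better ear' is chosen independently in every TF unit.
    better_ear = np.maximum(snr_left, snr_right)
    return (better_ear > criterion_db).mean()

rng = np.random.default_rng(1)
shape = (64, 200)  # placeholder spectrogram dimensions
print(better_ear_glimpses(rng.random(shape), rng.random(shape),
                          rng.random(shape), rng.random(shape)))
```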

  18. Effects of Compression on Speech Acoustics, Intelligibility, and Sound Quality

    PubMed Central

    Souza, Pamela E.

    2002-01-01

    The topic of compression has been discussed quite extensively in the last 20 years (e.g., Braida et al., 1982; Dillon, 1996, 2000; Dreschler, 1992; Hickson, 1994; Kuk, 2000, 2002; Kuk and Ludvigsen, 1999; Moore, 1990; Van Tasell, 1993; Venema, 2000; Verschuure et al., 1996; Walker and Dillon, 1982). However, the latest comprehensive update by this journal was published in 1996 (Kuk, 1996). Since that time, use of compression hearing aids has increased dramatically, from half of hearing aids dispensed only 5 years ago to four out of five hearing aids dispensed today (Strom, 2002b). Most of today's digital and digitally programmable hearing aids are compression devices (Strom, 2002a). It is probable that within a few years, very few patients will be fit with linear hearing aids. Furthermore, compression has increased in complexity, with greater numbers of parameters under the clinician's control. Ideally, these changes will translate to greater flexibility and precision in fitting and selection. However, they also increase the need for information about the effects of compression amplification on speech perception and speech quality. As evidenced by the large number of sessions at professional conferences on fitting compression hearing aids, clinicians continue to have questions about compression technology and when and how it should be used. How does compression work? Who are the best candidates for this technology? How should adjustable parameters be set to provide optimal speech recognition? What effect will compression have on speech quality? These and other questions continue to drive our interest in this technology. This article reviews the effects of compression on the speech signal and the implications for speech intelligibility, quality, and design of clinical procedures. PMID:25425919
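
    For readers new to the topic, the core of a compression hearing aid is a level estimator with separate attack and release time constants driving a static gain rule. The Python sketch below shows that generic structure with illustrative parameter values; it is not any particular hearing aid's algorithm.

```python
import numpy as np

def compress(x, fs, threshold_db=-30.0, ratio=3.0,
             attack_ms=5.0, release_ms=50.0):
    """Minimal wide-dynamic-range compressor: an envelope follower with
    attack/release smoothing feeding a static input/output gain rule.
    Parameter values are illustrative, not fitting recommendations."""
    a_att = np.exp(-1.0 / (fs * attack_ms / 1000.0))
    a_rel = np.exp(-1.0 / (fs * release_ms / 1000.0))
    env = np.zeros_like(x)
    level = 1e-6
    for n, sample in enumerate(np.abs(x)):
        # Fast smoothing when the level rises, slow when it falls.
        coeff = a_att if sample > level else a_rel
        level = coeff * level + (1.0 - coeff) * sample
        env[n] = level
    env_db = 20.0 * np.log10(np.maximum(env, 1e-6))
    # Above threshold, the output level grows at 1/ratio the input rate.
    over = np.maximum(env_db - threshold_db, 0.0)
    gain_db = -over * (1.0 - 1.0 / ratio)
    return x * 10.0 ** (gain_db / 20.0)

fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t) * np.linspace(0.01, 1.0, fs)  # rising ramp
y = compress(x, fs)
print(x.max(), y.max())  # the loud end of the ramp is attenuated
```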

  19. Effects of cross-language voice training on speech perception: Whose familiar voices are more intelligible?

    PubMed Central

    Levi, Susannah V.; Winters, Stephen J.; Pisoni, David B.

    2011-01-01

    Previous research has shown that familiarity with a talker’s voice can improve linguistic processing (herein, “Familiar Talker Advantage”), but this benefit is constrained by the context in which the talker’s voice is familiar. The current study examined how familiarity affects intelligibility by manipulating the type of talker information available to listeners. One group of listeners learned to identify bilingual talkers’ voices from English words, where they learned language-specific talker information. A second group of listeners learned the same talkers from German words, and thus only learned language-independent talker information. After voice training, both groups of listeners completed a word recognition task with English words produced by both familiar and unfamiliar talkers. Results revealed that English-trained listeners perceived more phonemes correct for familiar than unfamiliar talkers, while German-trained listeners did not show improved intelligibility for familiar talkers. The absence of a processing advantage in speech intelligibility for the German-trained listeners demonstrates limitations on the Familiar Talker Advantage, which crucially depends on the language context in which the talkers’ voices were learned; knowledge of how a talker produces linguistically relevant contrasts in a particular language is necessary to increase speech intelligibility for words produced by familiar talkers. PMID:22225059

  20. Speech-on-speech masking with variable access to the linguistic content of the masker speech for native and nonnative english speakers.

    PubMed

    Calandruccio, Lauren; Bradlow, Ann R; Dhar, Sumitrajit

    2014-04-01

    Masking release for an English sentence-recognition task in the presence of foreign-accented English speech compared with native-accented English speech was reported in Calandruccio et al. (2010a). The masking release appeared to increase as the masker intelligibility decreased. However, it could not be ruled out that spectral differences between the speech maskers were influencing the significant differences observed. The purpose of the current experiment was to minimize spectral differences between speech maskers to determine how various amounts of linguistic information within competing speech affect masking release. A mixed-model design with within-subject (four two-talker speech maskers) and between-subject (listener group) factors was employed. Speech maskers included native-accented English speech and high-intelligibility, moderate-intelligibility, and low-intelligibility Mandarin-accented English. Normalizing the long-term average speech spectra of the maskers to each other minimized spectral differences between the masker conditions. Three listener groups were tested, including monolingual English speakers with normal hearing, nonnative English speakers with normal hearing, and monolingual English speakers with hearing loss. The nonnative English speakers were from various native language backgrounds, not including Mandarin (or any other Chinese dialect). Listeners with hearing loss had symmetric mild sloping to moderate sensorineural hearing loss. Listeners were asked to repeat back sentences that were presented in the presence of four different two-talker speech maskers. Responses were scored based on the key words within the sentences (100 key words per masker condition). A mixed-model regression analysis was used to analyze the difference in performance scores between the masker conditions and listener groups. Monolingual English speakers with normal hearing benefited when the competing speech signal was foreign accented compared with native
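
    The spectral normalization step described here, matching the long-term average speech spectrum (LTASS) of each masker to a common reference, can be sketched as a single frequency-domain correction filter. Real pipelines typically smooth the spectra in auditory bands; the frame length and the crude full-band correction below are assumptions.

```python
import numpy as np

def match_ltass(signal, reference, n_fft=1024):
    """Impose the long-term average spectrum of `reference` onto
    `signal` via one zero-phase frequency-domain correction filter
    (a crude sketch of LTASS normalization)."""
    def ltass(x):
        # Average the magnitude spectrum over windowed frames.
        frames = x[: len(x) // n_fft * n_fft].reshape(-1, n_fft)
        spec = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1))
        return np.sqrt((spec ** 2).mean(axis=0))
    eps = 1e-12
    correction = ltass(reference) / (ltass(signal) + eps)
    # Interpolate the coarse correction onto the full-length FFT grid.
    spec = np.fft.rfft(signal)
    bins = np.interp(np.linspace(0, 1, len(spec)),
                     np.linspace(0, 1, len(correction)), correction)
    return np.fft.irfft(spec * bins, n=len(signal))

fs = 16000
masker = np.random.randn(fs * 5)                 # stand-in masker
reference = np.convolve(np.random.randn(fs * 5),
                        np.ones(8) / 8, "same")  # stand-in reference
matched = match_ltass(masker, reference)
print(matched.shape)
```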

  1. Synthesized speech rate and pitch effects on intelligibility of warning messages for pilots

    NASA Technical Reports Server (NTRS)

    Simpson, C. A.; Marchionda-Frost, K.

    1984-01-01

    In civilian and military operations, a future threat-warning system with a voice display could warn pilots of other traffic, obstacles in the flight path, and/or terrain during low-altitude helicopter flights. The present study was conducted to learn whether the speech rate and voice pitch of phoneme-synthesized speech affect pilot accuracy and response time to typical threat-warning messages. Helicopter pilots engaged in an attention-demanding flying task and listened for voice threat warnings presented in a background of simulated helicopter cockpit noise. Performance was measured by flying-task performance, threat-warning intelligibility, and response time. Pilot ratings were elicited for the different voice pitches and speech rates. Significant effects were obtained only for response time and for pilot ratings, both as a function of speech rate. For the few cases when pilots forgot to respond to a voice message, they remembered 90 percent of the messages accurately when queried for their response 8 to 10 sec later.

  2. The effect of reduced vowel working space on speech intelligibility in Mandarin-speaking young adults with cerebral palsy

    NASA Astrophysics Data System (ADS)

    Liu, Huei-Mei; Tsao, Feng-Ming; Kuhl, Patricia K.

    2005-06-01

    The purpose of this study was to examine the effect of reduced vowel working space on dysarthric talkers' speech intelligibility using both acoustic and perceptual approaches. In experiment 1, the acoustic-perceptual relationship between vowel working space area and speech intelligibility was examined in Mandarin-speaking young adults with cerebral palsy. Subjects read aloud 18 bisyllabic words containing the vowels /i/, /a/, and /u/ using their normal speaking rate. Each talker's words were identified by three normal listeners. The percentages of correct vowel and word identification were calculated as vowel intelligibility and word intelligibility, respectively. Results revealed that talkers with cerebral palsy exhibited smaller vowel working space areas compared to ten age-matched controls. The vowel working space area was significantly correlated with vowel intelligibility (r=0.632, p<0.005) and with word intelligibility (r=0.684, p<0.005). Experiment 2 examined whether tokens of expanded vowel working spaces were perceived as better vowel exemplars and represented with greater perceptual spaces than tokens of reduced vowel working spaces. The results of the perceptual experiment support this prediction. The distorted vowels of talkers with cerebral palsy compose a smaller acoustic space that results in shrunken intervowel perceptual distances for listeners.
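
    The vowel working space area used in this study is the area of the triangle spanned by the corner vowels in the F1-F2 plane, which the shoelace formula gives directly. The formant values in the example below are rough textbook averages for adult male speakers, not data from the study.

```python
def vowel_space_area(formants):
    """Vowel working space area from (F1, F2) corner-vowel coordinates
    in Hz, via the shoelace formula over the vowel polygon."""
    pts = list(formants.values())
    n = len(pts)
    area = 0.0
    for i in range(n):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % n]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

# Rough textbook (F1, F2) values for the corner vowels /i/, /a/, /u/.
corner_vowels = {"i": (270, 2290), "a": (730, 1090), "u": (300, 870)}
print(f"{vowel_space_area(corner_vowels):.0f} Hz^2")  # about 3.1e5 Hz^2
```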

  3. The impact of reverberant self-masking and overlap-masking effects on speech intelligibility by cochlear implant listeners (L).

    PubMed

    Kokkinakis, Kostas; Loizou, Philipos C

    2011-09-01

    The purpose of this study is to determine the relative impact of reverberant self-masking and overlap-masking effects on speech intelligibility by cochlear implant listeners. Sentences were presented in one condition wherein reverberant consonant segments were replaced with clean consonants, and in another condition wherein reverberant vowel segments were replaced with clean vowels. The underlying assumption is that self-masking effects would dominate in the first condition, whereas overlap-masking effects would dominate in the second condition. Results indicated that the degradation of speech intelligibility in reverberant conditions is caused primarily by self-masking effects that give rise to flattened formant transitions. © 2011 Acoustical Society of America

  4. Alternative Speech Communication System for Persons with Severe Speech Disorders

    NASA Astrophysics Data System (ADS)

    Selouani, Sid-Ahmed; Sidi Yakoub, Mohammed; O'Shaughnessy, Douglas

    2009-12-01

    Assistive speech-enabled systems are proposed to help both French- and English-speaking persons with various speech disorders. The proposed assistive systems use automatic speech recognition (ASR) and speech synthesis in order to enhance the quality of communication. These systems aim at improving the intelligibility of pathologic speech, making it as natural as possible and close to the original voice of the speaker. The resynthesized utterances use new basic units, a new concatenating algorithm, and a grafting technique to correct the poorly pronounced phonemes. The ASR responses are uttered by the new speech synthesis system in order to convey an intelligible message to listeners. Experiments involving four American speakers with severe dysarthria and two Acadian French speakers with sound substitution disorders (SSDs) are carried out to demonstrate the efficiency of the proposed methods. Improvements in the Perceptual Evaluation of Speech Quality (PESQ) value of 5% and of more than 20% are achieved by the speech synthesis systems that deal with SSDs and dysarthria, respectively.

  5. The effect of noise-induced hearing loss on the intelligibility of speech in noise

    NASA Astrophysics Data System (ADS)

    Smoorenburg, G. F.; Delaat, J. A. P. M.; Plomp, R.

    1981-06-01

    Speech reception thresholds, both in quiet and in noise, and tone audiograms were measured for 14 normal ears (7 subjects) and 44 ears (22 subjects) with noise-induced hearing loss. Maximum hearing loss in the 4-6 kHz region equalled 40 to 90 dB (losses exceeded by 90% and 10%, respectively). Hearing loss for speech in quiet, measured with respect to the median speech reception threshold for normal ears, ranged from 1.8 dB to 13.4 dB. For speech in noise, the corresponding figures are 1.2 dB to 7.0 dB, which means that the subjects with noise-induced hearing loss need a 1.2 to 7.0 dB higher signal-to-noise ratio than normal to understand sentences equally well. A hearing loss for speech of 1 dB corresponds to a decrease in sentence intelligibility of 15 to 20%. The relation between hearing handicap, conceived as a reduced ability to understand speech, and the tone audiogram is discussed. The higher signal-to-noise ratio needed by people with noise-induced hearing loss to understand speech in noisy environments is shown to be due partly to the decreased bandwidth of their hearing caused by the noise dip.

  6. Deep Learning-Based Noise Reduction Approach to Improve Speech Intelligibility for Cochlear Implant Recipients.

    PubMed

    Lai, Ying-Hui; Tsao, Yu; Lu, Xugang; Chen, Fei; Su, Yu-Ting; Chen, Kuang-Chao; Chen, Yu-Hsuan; Chen, Li-Ching; Po-Hung Li, Lieber; Lee, Chin-Hui

    2018-01-20

    We investigate the clinical effectiveness of a novel deep learning-based noise reduction (NR) approach under noisy conditions with challenging noise types at low signal-to-noise ratio (SNR) levels for Mandarin-speaking cochlear implant (CI) recipients. The deep learning-based NR approach used in this study consists of two modules, a noise classifier (NC) and a deep denoising autoencoder (DDAE), and is thus termed NC + DDAE. In a series of comprehensive experiments, we conduct qualitative and quantitative analyses of the NC module and the overall NC + DDAE approach. Moreover, we evaluate the speech recognition performance of the NC + DDAE NR and classical single-microphone NR approaches for Mandarin-speaking CI recipients under different noisy conditions. The testing set contains Mandarin sentences corrupted by two types of maskers, two-talker babble noise and construction jackhammer noise, at 0 and 5 dB SNR levels. Two conventional NR techniques and the proposed deep learning-based approach are used to process the noisy utterances. We qualitatively compare the NR approaches by the amplitude envelope and spectrogram plots of the processed utterances. Quantitative objective measures include (1) a normalized covariance measure to test the intelligibility of the utterances processed by each of the NR approaches; and (2) speech recognition tests conducted by nine Mandarin-speaking CI recipients. These nine CI recipients used their own clinical speech processors during testing. The experimental results of the objective evaluation and listening tests indicate that under challenging listening conditions, the proposed NC + DDAE NR approach yields higher intelligibility scores than the two compared classical NR techniques, under both matched and mismatched training-testing conditions. When compared to the two well-known conventional NR techniques under challenging listening conditions, the proposed NC + DDAE NR approach has superior noise suppression capabilities and gives less distortion
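
    The DDAE module at the heart of the NC + DDAE system is, in outline, a feed-forward network trained to map noisy spectral frames onto the corresponding clean frames, with the noise classifier choosing which noise-specific network to apply. The PyTorch sketch below shows that training loop with placeholder layer sizes and random stand-in data; it is not the published architecture.

```python
import torch
from torch import nn

class DDAE(nn.Module):
    """Minimal deep denoising autoencoder: noisy log-magnitude spectra
    in, enhanced spectra out. Layer sizes are placeholders."""
    def __init__(self, n_freq=257, n_hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_freq, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_freq),
        )

    def forward(self, noisy):
        return self.net(noisy)

# Train to map noisy frames onto the matching clean frames (MSE loss).
model = DDAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
noisy = torch.randn(32, 257)   # stand-in batch of noisy spectra
clean = torch.randn(32, 257)   # matching clean targets
for _ in range(5):
    opt.zero_grad()
    loss = loss_fn(model(noisy), clean)
    loss.backward()
    opt.step()
print(loss.item())
```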

  7. Processing Techniques for Intelligibility Improvement to Speech with Co-Channel Interference.

    DTIC Science & Technology

    1983-09-01

    …processing was found to be always less than in the original unprocessed co-channel signal; also, as the length of the comb filter increased, the… [remainder of the scanned DTIC record is unrecoverable OCR residue; the report is by B. A. Hanson et al., Signal Technology Inc., Goleta, CA, September 1983]

  8. Speech-on-speech masking with variable access to the linguistic content of the masker speech for native and non-native speakers of English

    PubMed Central

    Calandruccio, Lauren; Bradlow, Ann R.; Dhar, Sumitrajit

    2013-01-01

    Background Masking release for an English sentence-recognition task in the presence of foreign-accented English speech compared to native-accented English speech was reported in Calandruccio, Dhar and Bradlow (2010). The masking release appeared to increase as the masker intelligibility decreased. However, it could not be ruled out that spectral differences between the speech maskers were influencing the significant differences observed. Purpose The purpose of the current experiment was to minimize spectral differences between speech maskers to determine how various amounts of linguistic information within competing speech affect masking release. Research Design A mixed model design with within- (four two-talker speech maskers) and between-subject (listener group) factors was conducted. Speech maskers included native-accented English speech, and high-intelligibility, moderate-intelligibility and low-intelligibility Mandarin-accented English. Normalizing the long-term average speech spectra of the maskers to each other minimized spectral differences between the masker conditions. Study Sample Three listener groups were tested including monolingual English speakers with normal hearing, non-native speakers of English with normal hearing, and monolingual speakers of English with hearing loss. The non-native speakers of English were from various native-language backgrounds, not including Mandarin (or any other Chinese dialect). Listeners with hearing loss had symmetrical, mild sloping to moderate sensorineural hearing loss. Data Collection and Analysis Listeners were asked to repeat back sentences that were presented in the presence of four different two-talker speech maskers. Responses were scored based on the keywords within the sentences (100 keywords/masker condition). A mixed-model regression analysis was used to analyze the difference in performance scores between the masker conditions and the listener groups. Results Monolingual speakers of English with normal

  9. Audio-visual speech intelligibility benefits with bilateral cochlear implants when talker location varies.

    PubMed

    van Hoesel, Richard J M

    2015-04-01

    One of the key benefits of using cochlear implants (CIs) in both ears rather than just one is improved localization. It is likely that in complex listening scenes, improved localization allows bilateral CI users to orient toward talkers to improve signal-to-noise ratios and gain access to visual cues, but to date, that conjecture has not been tested. To obtain an objective measure of that benefit, seven bilateral CI users were assessed for both auditory-only and audio-visual speech intelligibility in noise using a novel dynamic spatial audio-visual test paradigm. For each trial conducted in spatially distributed noise, first, an auditory-only cueing phrase that was spoken by one of four talkers was selected and presented from one of four locations. Shortly afterward, a target sentence was presented that was either audio-visual or, in another test configuration, audio-only, and was spoken by the same talker and from the same location as the cueing phrase. During the target presentation, visual distractors were added at other spatial locations. Results showed that in terms of speech reception thresholds (SRTs), the average improvement for bilateral listening over the better performing ear alone was 9 dB for the audio-visual mode and 3 dB for audition alone. Comparison of bilateral performance for audio-visual and audition-alone conditions showed that inclusion of visual cues led to an average SRT improvement of 5 dB. For unilateral device use, no such benefit arose, presumably due to the greatly reduced ability to localize the target talker to acquire visual information. The bilateral CI speech intelligibility advantage over the better ear in the present study is much larger than that previously reported for static talker locations, and it indicates greater everyday speech benefits, and a better cost-benefit ratio, than estimated to date.

  10. Speech outcomes in Cantonese patients after glossectomy.

    PubMed

    Wong, Ripley Kit; Poon, Esther Sok-Man; Woo, Cynthia Yuen-Man; Chan, Sabina Ching-Shun; Wong, Elsa Siu-Ping; Chu, Ada Wai-Sze

    2007-08-01

    We sought to determine the major factors affecting speech production in Cantonese-speaking glossectomized patients; the error pattern was analyzed. Forty-one Cantonese-speaking subjects who had undergone glossectomy ≥6 months previously were recruited. Speech production evaluation included (1) phonetic error analysis in nonsense syllables; (2) speech intelligibility in sentences evaluated by naïve listeners; and (3) overall speech intelligibility in conversation evaluated by experienced speech therapists. Patients receiving adjuvant radiotherapy had significantly poorer segmental and connected speech production. Total or subtotal glossectomy also resulted in poor speech outcomes. Patients having free flap reconstruction showed the best speech outcomes. Patients without lymph node metastasis had significantly better speech scores when compared with patients with lymph node metastasis. Initial consonant production had the worst scores, while vowel production was the least affected. Speech outcomes of Cantonese-speaking glossectomized patients depended on the severity of the disease. Initial consonants had the greatest effect on speech intelligibility.

  11. Assessment of the Speech Intelligibility Performance of Post Lingual Cochlear Implant Users at Different Signal-to-Noise Ratios Using the Turkish Matrix Test.

    PubMed

    Polat, Zahra; Bulut, Erdoğan; Ataş, Ahmet

    2016-09-01

    Spoken word recognition and speech perception tests in quiet are used routinely to assess the benefit that child and adult cochlear implant users receive from their devices. Cochlear implant users generally demonstrate high performance on these test materials, as they are able to achieve high speech perception ability in quiet situations. Although these test materials provide valuable information regarding Cochlear Implant (CI) users' performance in optimal listening conditions, they do not give realistic information about performance in the adverse listening conditions of everyday environments. The aim of this study was to assess the speech intelligibility performance of postlingual CI users in the presence of noise at different signal-to-noise ratios with the Matrix Test developed for the Turkish language. Cross-sectional study. Thirty postlingual adult CI users, who had been using their implants for a minimum of one year, were evaluated with the Turkish Matrix Test. Subjects' speech intelligibility was measured using the adaptive and non-adaptive Matrix Test in quiet and noisy environments. The results of the study show a correlation between Pure Tone Average (PTA) values of the subjects and Matrix Test Speech Reception Threshold (SRT) values in quiet. Hence, it is also possible to assess the PTA values of CI users using the Matrix Test. However, no correlations were found between Matrix SRT values in quiet and Matrix SRT values in noise. Similarly, the correlation between PTA values and intelligibility scores in noise was also not significant. Therefore, it may not be possible to assess the intelligibility performance of CI users using test batteries performed in quiet conditions. The Matrix Test can be used to assess the benefit CI users receive from their systems in everyday life, since it is possible to perform intelligibility testing with the Matrix Test using material that CI users experience in

  12. Expanding the phenotypic profile of Kleefstra syndrome: A female with low-average intelligence and childhood apraxia of speech.

    PubMed

    Samango-Sprouse, Carole; Lawson, Patrick; Sprouse, Courtney; Stapleton, Emily; Sadeghin, Teresa; Gropman, Andrea

    2016-05-01

    Kleefstra syndrome (KS) is a rare neurogenetic disorder most commonly caused by deletion in the 9q34.3 chromosomal region and is associated with intellectual disabilities, severe speech delay, and motor planning deficits. To our knowledge, this is the first patient (PQ, a 6-year-old female) with a 9q34.3 deletion who has near-normal intelligence and developmental dyspraxia with childhood apraxia of speech (CAS). At age 6, Wechsler Preschool and Primary Scale of Intelligence (WPPSI-III) testing revealed a Verbal IQ of 81 and a Performance IQ of 79. The Beery-Buktenica Test of Visual-Motor Integration, 5th Edition (VMI) indicated severe visual-motor deficits: VMI = 51; Visual Perception = 48; Motor Coordination < 45. On the Receptive One-Word Picture Vocabulary Test-R (ROWPVT-R), she had standard scores of 96 and 99, in contrast to Expressive One-Word Picture Vocabulary Test-R (EOWPVT-R) standard scores of 73 and 82, revealing a discrepancy between vocabulary domains on both evaluations. The Preschool Language Scale-4 (PLS-4) at PQ's first evaluation revealed a significant difference between auditory comprehension and expressive communication, with standard scores of 78 and 57, respectively, further supporting the presence of CAS. This patient's near-normal intelligence expands the phenotypic profile as well as the prognosis associated with KS. The identification of CAS in this patient provides a novel explanation for the previously reported speech delay and expressive language disorder. Further research is warranted on the impact of CAS on intelligence and behavioral outcome in KS. Therapeutic and prognostic implications are discussed. © 2016 Wiley Periodicals, Inc.

  13. An Analysis of Individual Differences in Recognizing Monosyllabic Words Under the Speech Intelligibility Index Framework

    PubMed Central

    Shen, Yi; Kern, Allison B.

    2018-01-01

    Individual differences in the recognition of monosyllabic words, either in isolation (NU6 test) or in sentence context (SPIN test), were investigated under the theoretical framework of the speech intelligibility index (SII). An adaptive psychophysical procedure, namely the quick-band-importance-function procedure, was developed to enable the fitting of the SII model to individual listeners. Using this procedure, the band importance function (i.e., the relative weights of speech information across the spectrum) and the link function relating the SII to recognition scores can be simultaneously estimated while requiring only 200 to 300 trials of testing. Octave-frequency band importance functions and link functions were estimated separately for NU6 and SPIN materials from 30 normal-hearing listeners who were naïve to speech recognition experiments. For each type of speech material, considerable individual differences in the spectral weights were observed in some but not all frequency regions. At frequencies where the greatest intersubject variability was found, the spectral weights were correlated between the two speech materials, suggesting that the variability in spectral weights reflected listener-originated factors. PMID:29532711
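
    Under the SII framework used here, intelligibility prediction reduces to a band-importance-weighted sum of band audibilities passed through a listener-specific link function. The sketch below shows that structure; the weights, audibilities, and logistic-link parameters are arbitrary placeholders, whereas the study estimated them per listener from 200 to 300 adaptive trials.

```python
import numpy as np

def sii(importance, audibility):
    """Speech intelligibility index: a band-importance-weighted sum of
    band audibilities (each audibility in [0, 1], weights sum to 1)."""
    return float(np.dot(importance, audibility))

def link(s, slope=8.0, midpoint=0.35):
    """Logistic link from SII to proportion-correct recognition; the
    slope and midpoint here are arbitrary placeholders."""
    return 1.0 / (1.0 + np.exp(-slope * (s - midpoint)))

# Hypothetical octave-band weights and audibilities for one listener.
w = np.array([0.10, 0.15, 0.25, 0.30, 0.20])
a = np.array([1.00, 0.90, 0.60, 0.30, 0.10])
print(link(sii(w, a)))  # predicted proportion of words correct
```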

  14. Memory performance on the Auditory Inference Span Test is independent of background noise type for young adults with normal hearing at high speech intelligibility

    PubMed Central

    Rönnberg, Niklas; Rudner, Mary; Lunner, Thomas; Stenfelt, Stefan

    2014-01-01

    Listening in noise is often perceived to be effortful. This is partly because cognitive resources are engaged in separating the target signal from background noise, leaving fewer resources for storage and processing of the content of the message in working memory. The Auditory Inference Span Test (AIST) is designed to assess listening effort by measuring the ability to maintain and process heard information. The aim of this study was to use AIST to investigate the effect of background noise types and signal-to-noise ratio (SNR) on listening effort, as a function of working memory capacity (WMC) and updating ability (UA). The AIST was administered in three types of background noise: steady-state speech-shaped noise, amplitude modulated speech-shaped noise, and unintelligible speech. Three SNRs targeting 90% speech intelligibility or better were used in each of the three noise types, giving nine different conditions. The reading span test assessed WMC, while UA was assessed with the letter memory test. Twenty young adults with normal hearing participated in the study. Results showed that AIST performance was not influenced by noise type at the same intelligibility level, but became worse with worse SNR when background noise was speech-like. Performance on AIST also decreased with increasing memory load level. Correlations between AIST performance and the cognitive measurements suggested that WMC is of more importance for listening when SNRs are worse, while UA is of more importance for listening in easier SNRs. The results indicated that in young adults with normal hearing, the effort involved in listening in noise at high intelligibility levels is independent of the noise type. However, when noise is speech-like and intelligibility decreases, listening effort increases, probably due to extra demands on cognitive resources added by the informational masking created by the speech fragments and vocal sounds in the background noise. PMID:25566159

  15. Memory performance on the Auditory Inference Span Test is independent of background noise type for young adults with normal hearing at high speech intelligibility.

    PubMed

    Rönnberg, Niklas; Rudner, Mary; Lunner, Thomas; Stenfelt, Stefan

    2014-01-01

    Listening in noise is often perceived to be effortful. This is partly because cognitive resources are engaged in separating the target signal from background noise, leaving fewer resources for storage and processing of the content of the message in working memory. The Auditory Inference Span Test (AIST) is designed to assess listening effort by measuring the ability to maintain and process heard information. The aim of this study was to use AIST to investigate the effect of background noise types and signal-to-noise ratio (SNR) on listening effort, as a function of working memory capacity (WMC) and updating ability (UA). The AIST was administered in three types of background noise: steady-state speech-shaped noise, amplitude modulated speech-shaped noise, and unintelligible speech. Three SNRs targeting 90% speech intelligibility or better were used in each of the three noise types, giving nine different conditions. The reading span test assessed WMC, while UA was assessed with the letter memory test. Twenty young adults with normal hearing participated in the study. Results showed that AIST performance was not influenced by noise type at the same intelligibility level, but became worse with worse SNR when background noise was speech-like. Performance on AIST also decreased with increasing memory load level. Correlations between AIST performance and the cognitive measurements suggested that WMC is of more importance for listening when SNRs are worse, while UA is of more importance for listening in easier SNRs. The results indicated that in young adults with normal hearing, the effort involved in listening in noise at high intelligibility levels is independent of the noise type. However, when noise is speech-like and intelligibility decreases, listening effort increases, probably due to extra demands on cognitive resources added by the informational masking created by the speech fragments and vocal sounds in the background noise.

  16. Predicting speech intelligibility in noise for hearing-critical jobs

    NASA Astrophysics Data System (ADS)

    Soli, Sigfrid D.; Laroche, Chantal; Giguere, Christian

    2003-10-01

    Many jobs require auditory abilities such as speech communication, sound localization, and sound detection. An employee for whom these abilities are impaired may constitute a safety risk for himself or herself, for fellow workers, and possibly for the general public. A number of methods have been used to predict these abilities from diagnostic measures of hearing (e.g., the pure-tone audiogram); however, these methods have not proved sufficiently accurate for predicting performance in the noise environments where hearing-critical jobs are performed. We have taken an alternative and potentially more accurate approach. A direct measure of speech intelligibility in noise, the Hearing in Noise Test (HINT), is instead used to screen individuals. The screening criteria are validated by establishing the empirical relationship between the HINT score and the auditory abilities of the individual, as measured in laboratory recreations of real-world workplace noise environments. The psychometric properties of the HINT enable screening of individuals with an acceptable amount of error. In this presentation, we will describe the predictive model and report the results of field measurements and laboratory studies used to provide empirical validation of the model. [Work supported by Fisheries and Oceans Canada.]

  17. Listeners Experience Linguistic Masking Release in Noise-Vocoded Speech-in-Speech Recognition

    ERIC Educational Resources Information Center

    Viswanathan, Navin; Kokkinakis, Kostas; Williams, Brittany T.

    2018-01-01

    Purpose: The purpose of this study was to evaluate whether listeners with normal hearing perceiving noise-vocoded speech-in-speech demonstrate better intelligibility of target speech when the background speech was mismatched in language (linguistic release from masking [LRM]) and/or location (spatial release from masking [SRM]) relative to the…

  18. Intelligibility in speech maskers with a binaural cochlear implant sound coding strategy inspired by the contralateral medial olivocochlear reflex.

    PubMed

    Lopez-Poveda, Enrique A; Eustaquio-Martín, Almudena; Stohl, Joshua S; Wolford, Robert D; Schatzer, Reinhold; Gorospe, José M; Ruiz, Santiago Santa Cruz; Benito, Fernando; Wilson, Blake S

    2017-05-01

    We have recently proposed a binaural cochlear implant (CI) sound processing strategy inspired by the contralateral medial olivocochlear reflex (the MOC strategy) and shown that it improves intelligibility in steady-state noise (Lopez-Poveda et al., 2016, Ear Hear 37:e138-e148). The aim here was to evaluate possible speech-reception benefits of the MOC strategy for speech maskers, a more natural type of interferer. Speech reception thresholds (SRTs) were measured in six bilateral and two single-sided deaf CI users with the MOC strategy and with a standard (STD) strategy. SRTs were measured in unilateral and bilateral listening conditions, and for target and masker stimuli located at azimuthal angles of (0°, 0°), (-15°, +15°), and (-90°, +90°). Mean SRTs were 2-5 dB better with the MOC than with the STD strategy for spatially separated target and masker sources. For bilateral CI users, the MOC strategy (1) facilitated the intelligibility of speech in competition with spatially separated speech maskers in both unilateral and bilateral listening conditions; and (2) led to an overall improvement in spatial release from masking in the two listening conditions. Insofar as speech is a more natural type of interferer than steady-state noise, the present results suggest that the MOC strategy holds potential for promising outcomes for CI users. Copyright © 2017. Published by Elsevier B.V.
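
    The abstract does not spell out the signal processing, but the cited earlier work describes the gain of each frequency channel on one side being inhibited according to the output of the matching channel on the contralateral side. A rough, hypothetical sketch of that idea follows; all parameter values are invented for illustration and this is not the authors' implementation.

    ```python
    import numpy as np

    # Illustrative MOC-like contralateral inhibition: the attenuation applied
    # to a channel grows with the output level of the corresponding channel
    # in the opposite-ear processor, mimicking the medial olivocochlear reflex.

    def moc_attenuation_db(contra_level_db, threshold_db=40.0,
                           strength=0.3, max_atten_db=12.0):
        """Attenuation (dB, <= 0) driven by the contralateral channel level.
        All parameters are hypothetical illustration values."""
        excess = np.maximum(np.asarray(contra_level_db) - threshold_db, 0.0)
        return -np.minimum(strength * excess, max_atten_db)

    # Per-channel output levels (dB) of the left and right processors.
    left = np.array([70.0, 55.0, 45.0])
    right = np.array([50.0, 68.0, 66.0])

    print("left-ear gain change: ", moc_attenuation_db(right))   # right inhibits left
    print("right-ear gain change:", moc_attenuation_db(left))    # left inhibits right
    ```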

  19. Speech Prosody Across Stimulus Types for Individuals with Parkinson's Disease.

    PubMed

    K-Y Ma, Joan; Schneider, Christine B; Hoffmann, Rüdiger; Storch, Alexander

    2015-01-01

    Up to 89% of individuals with Parkinson's disease (PD) experience speech problems over the course of the disease. Speech prosody and intelligibility are two of the most affected areas in hypokinetic dysarthria. However, assessment of these areas can be problematic, as speech prosody and intelligibility may be affected by the type of speech materials employed. The aim was to comparatively explore the effects of different types of speech stimulus on speech prosody and intelligibility in PD speakers. Speech prosody and intelligibility of two groups of individuals with varying degrees of dysarthria resulting from PD were compared with those of a group of control speakers using sentence reading, passage reading and monologue. Acoustic analysis, including measures of fundamental frequency (F0), intensity and speech rate, was used to form a prosodic profile for each individual. Speech intelligibility was measured for the speakers with dysarthria using direct magnitude estimation. A difference in F0 variability between the speakers with dysarthria and the control speakers was observed only in the sentence reading task. A difference in average intensity level relative to the control speakers was observed for speakers with mild dysarthria. Additionally, there were stimulus effects on both intelligibility and the prosodic profile. The prosodic profile of PD speakers differed from that of the control speakers in the more structured tasks, and lower intelligibility was found in the less structured task. This highlights the value of both structured and natural stimuli for evaluating speech production in PD speakers.

  20. Computer-assisted CI fitting: Is the learning capacity of the intelligent agent FOX beneficial for speech understanding?

    PubMed

    Meeuws, Matthias; Pascoal, David; Bermejo, Iñigo; Artaso, Miguel; De Ceulaer, Geert; Govaerts, Paul J

    2017-07-01

    The software application FOX ('Fitting to Outcome eXpert') is an intelligent agent to assist in the programming of cochlear implant (CI) processors. The current version utilizes a mixture of deterministic and probabilistic logic which is able to improve over time through a learning effect. This study aimed at assessing whether this learning capacity yields measurable improvements in speech understanding. A retrospective study was performed on 25 consecutive CI recipients with a median CI use experience of 10 years who came for their annual CI follow-up fitting session. All subjects were assessed by means of speech audiometry with open-set monosyllables at 40, 55, 70, and 85 dB SPL in quiet with their home MAP. Other psychoacoustic tests were executed depending on the audiologist's clinical judgment. The home MAP and the corresponding test results were entered into FOX. If FOX suggested MAP changes, they were implemented and another speech audiometry was performed with the new MAP. FOX suggested MAP changes in 21 subjects (84%). The within-subject comparison showed a significant median improvement of 10, 3, 1, and 7% at 40, 55, 70, and 85 dB SPL, respectively. All but two subjects showed an instantaneous improvement in their mean speech audiometric score. Persons with long-term CI use who received a FOX-assisted CI fitting at least 6 months earlier displayed improved speech understanding after MAP modifications recommended by the current version of FOX. This can be explained only by intrinsic improvements in FOX's algorithms, as these have resulted from learning. Such learning is an inherent feature of artificial intelligence and may yield measurable benefit in speech understanding even in long-term CI recipients.
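
    Purely as a hypothetical illustration of the deterministic part of such an agent (FOX's actual rules, targets, and parameters are not given in the abstract), a rule might compare speech scores at each presentation level against targets and propose a MAP change when the deficit is large:

    ```python
    # Hypothetical rule-based sketch; names, targets, and thresholds invented.

    def recommend_map_changes(scores, targets, min_deficit=10):
        """scores/targets: % correct at presentation levels (dB SPL)."""
        changes = []
        for level, score in scores.items():
            deficit = targets[level] - score
            if deficit >= min_deficit and level <= 55:
                changes.append(f"raise gain for soft sounds "
                               f"(deficit {deficit}% at {level} dB SPL)")
            elif deficit >= min_deficit:
                changes.append(f"adjust compression at {level} dB SPL")
        return changes or ["no change recommended"]

    print(recommend_map_changes({40: 55, 55: 78, 70: 90, 85: 88},
                                {40: 70, 55: 85, 70: 90, 85: 90}))
    ```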

  1. The Use of Artificial Neural Networks to Estimate Speech Intelligibility from Acoustic Variables: A Preliminary Analysis.

    ERIC Educational Resources Information Center

    Metz, Dale Evan; And Others

    1992-01-01

    A preliminary scheme for estimating the speech intelligibility of hearing-impaired speakers from acoustic parameters, using a computerized artificial neural network to process mathematically the acoustic input variables, is outlined. Tests with 60 hearing-impaired speakers found the scheme to be highly accurate in identifying speakers separated by…
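
    The record is truncated, but the approach it outlines, a neural network mapping acoustic variables to intelligibility, is easy to sketch with modern tools. Below is a hedged, synthetic-data illustration using scikit-learn; the three features and all data are invented stand-ins for the study's acoustic input variables.

    ```python
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    # Toy illustration: map standardized acoustic variables (e.g., vowel
    # space area, F0 variability, speaking rate) to percent intelligibility
    # with a small feed-forward network. Data are synthetic.

    rng = np.random.default_rng(0)
    n = 60                                   # e.g., 60 speakers
    X = rng.normal(size=(n, 3))              # 3 standardized acoustic variables
    # Synthetic ground truth: intelligibility rises with all three features.
    y = np.clip(70 + 10 * X.sum(axis=1) + rng.normal(0, 3, n), 0, 100)

    net = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
    net.fit(X[:50], y[:50])                  # train on 50 speakers
    print("held-out predictions:", net.predict(X[50:]).round(1))
    ```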

  2. Long-term impact of tongue reduction on speech intelligibility, articulation and oromyofunctional behaviour in a child with Beckwith-Wiedemann syndrome.

    PubMed

    Van Lierde, K M; Mortier, G; Huysman, E; Vermeersch, H

    2010-03-01

    The purpose of the present case study was to determine the long-term impact of partial glossectomy (using the keyhole technique) on overall speech intelligibility and articulation in a Dutch-speaking child with Beckwith-Wiedemann syndrome (BWS). Furthermore, the present study is meant as a contribution to the further delineation of the phonation, resonance, articulation and language characteristics and oral behaviour in a child with BWS. Detailed information on the speech and language characteristics of children with BWS may lead to better guidance of pediatric management programs. The child's speech was assessed 9 years after partial glossectomy with regard to ENT characteristics, overall intelligibility (perceptual consensus evaluation), articulation (phonetic and phonological errors), voice (videostroboscopy, vocal quality), resonance (perceptual, nasometric assessment), language (expressive and receptive) and oral behaviour. A class III malocclusion, an anterior open bite, diastema, overangulation of the lower incisors and an enlarged but symmetrically shaped tongue were present. Overall speech intelligibility improved from severely impaired (presurgical) to slightly impaired (5 months post-glossectomy) to normal (9 years postoperative). A comparative phonetic inventory showed a remarkable improvement in articulation. Nine years post-glossectomy, three types of distortion seemed to predominate: a rhotacism, a sigmatism and the substitution of the alveolar /z/. Oral behaviour, vocal characteristics and resonance were normal, but problems with expressive syntactic abilities were present. The long-term impact of partial glossectomy, using the keyhole technique (preserving the vascularity and the nervous input of the remaining intrinsic tongue muscles), on speech intelligibility, articulation, and oral behaviour in this Dutch-speaking child with congenital macroglossia can be regarded as successful. It is not clear how these expressive syntactical problems…

  3. The Influence of Cochlear Mechanical Dysfunction, Temporal Processing Deficits, and Age on the Intelligibility of Audible Speech in Noise for Hearing-Impaired Listeners

    PubMed Central

    Johannesen, Peter T.; Pérez-González, Patricia; Kalluri, Sridhar; Blanco, José L.

    2016-01-01

    The aim of this study was to assess the relative importance of cochlear mechanical dysfunction, temporal processing deficits, and age on the ability of hearing-impaired listeners to understand speech in noisy backgrounds. Sixty-eight listeners took part in the study. They were provided with linear, frequency-specific amplification to compensate for their audiometric losses, and intelligibility was assessed for speech-shaped noise (SSN) and a time-reversed two-talker masker (R2TM). Behavioral estimates of cochlear gain loss and residual compression were available from a previous study and were used as indicators of cochlear mechanical dysfunction. Temporal processing abilities were assessed using frequency modulation detection thresholds. Age, audiometric thresholds, and the difference between audiometric threshold and cochlear gain loss were also included in the analyses. Stepwise multiple linear regression models were used to assess the relative importance of the various factors for intelligibility. Results showed that (a) cochlear gain loss was unrelated to intelligibility, (b) residual cochlear compression was related to intelligibility in SSN but not in an R2TM, (c) temporal processing was strongly related to intelligibility in an R2TM and much less so in SSN, and (d) age per se impaired intelligibility. In summary, all factors affected intelligibility, but their relative importance varied across maskers. PMID:27604779
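
    A minimal sketch of the stepwise-regression style of analysis reported here, using forward selection on synthetic data; the predictor names mirror the abstract's factors, but the data, stopping rule, and threshold are invented for illustration.

    ```python
    import numpy as np

    # Forward stepwise selection: at each step, add the predictor that most
    # improves R^2; stop when the gain falls below a threshold.

    def r_squared(X, y):
        X1 = np.column_stack([np.ones(len(y)), X])
        beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
        resid = y - X1 @ beta
        return 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

    def forward_select(X, y, names, min_gain=0.01):
        chosen, best = [], 0.0
        while len(chosen) < X.shape[1]:
            gains = {j: r_squared(X[:, chosen + [j]], y) - best
                     for j in range(X.shape[1]) if j not in chosen}
            j, g = max(gains.items(), key=lambda kv: kv[1])
            if g < min_gain:
                break
            chosen.append(j)
            best += g
        return [names[k] for k in chosen], best

    rng = np.random.default_rng(1)
    names = ["gain_loss", "compression", "FM_threshold", "age"]
    X = rng.normal(size=(68, 4))             # 68 listeners, 4 predictors
    y = 0.8 * X[:, 2] + 0.4 * X[:, 3] + rng.normal(0, 0.5, 68)  # synthetic SRTs
    print(forward_select(X, y, names))
    ```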

  4. Dysfluencies in the speech of adults with intellectual disabilities and reported speech difficulties.

    PubMed

    Coppens-Hofman, Marjolein C; Terband, Hayo R; Maassen, Ben A M; van Schrojenstein Lantman-De Valk, Henny M J; van Zaalen-op't Hof, Yvonne; Snik, Ad F M

    2013-01-01

    In individuals with an intellectual disability, speech dysfluencies are more common than in the general population. In clinical practice, these fluency disorders are generally diagnosed and treated as stuttering rather than cluttering. The aim was to characterise the type of dysfluencies in adults with intellectual disabilities and reported speech difficulties, with an emphasis on manifestations of stuttering and cluttering, a distinction intended to help optimise treatment aimed at improving fluency and intelligibility. The dysfluencies in the spontaneous speech of 28 adults (18-40 years; 16 men) with mild and moderate intellectual disabilities (IQs 40-70), who were characterised as poorly intelligible by their caregivers, were analysed using the speech norms for typically developing adults and children. The speakers were subsequently assigned to different diagnostic categories by relating their resulting dysfluency profiles to mean articulatory rate and articulatory rate variability. Twenty-two (75%) of the participants showed clinically significant dysfluencies, of which 21% were classified as cluttering, 29% as cluttering-stuttering and 25% as clear cluttering at normal articulatory rate. The characteristic pattern of stuttering did not occur. The dysfluencies in the speech of adults with intellectual disabilities and poor intelligibility show patterns that are specific for this population. Together, the results suggest that in this specific group of dysfluent speakers interventions should be aimed at cluttering rather than stuttering. The reader will be able to (1) describe patterns of dysfluencies in the speech of adults with intellectual disabilities that are specific for this group of people, (2) explain that a high rate of dysfluencies in speech is potentially a major determiner of poor intelligibility in adults with ID and (3) describe suggestions for intervention focusing on cluttering rather than stuttering in dysfluent speakers with ID. Copyright © 2013 Elsevier Inc.

  5. An Experimental Determination of the Intelligibility of Two Different Speech Synthesizers in Noise.

    DTIC Science & Technology

    1987-12-01

    …to control the signal-to-noise ratios and to maintain an overall SPL of 80 dBC. The calibration of the speech signal was not easy due to its non…

  6. Effects of speaking task on intelligibility in Parkinson’s disease

    PubMed Central

    TJADEN, KRIS; WILDING, GREG

    2017-01-01

    Intelligibility tests for dysarthria typically provide an estimate of overall severity for speech materials elicited through imitation or read from a printed script. The extent to which these types of tasks and procedures reflect intelligibility for extemporaneous speech is not well understood. The purpose of this study was to compare intelligibility estimates obtained for a reading passage and an extemporaneous monologue produced by 12 speakers with Parkinson's disease (PD). The relationship between structural characteristics of utterances and scaled intelligibility was explored within speakers. Speakers were audio-recorded while reading a paragraph and producing a monologue. Speech samples were separated into individual utterances for presentation to 70 listeners who judged intelligibility using orthographic transcription and direct magnitude estimation (DME). Results suggest that scaled estimates of intelligibility for reading show potential for indexing intelligibility of an extemporaneous monologue. Within-speaker variation in scaled intelligibility also was related to the number of words per speech run for extemporaneous speech. PMID:20887216

  7. Intelligibility and Clarity of Reverberant Speech: Effects of Wide Dynamic Range Compression Release Time and Working Memory

    ERIC Educational Resources Information Center

    Reinhart, Paul N.; Souza, Pamela E.

    2016-01-01

    Purpose: The purpose of this study was to examine the effects of varying wide dynamic range compression (WDRC) release time on intelligibility and clarity of reverberant speech. The study also considered the role of individual working memory. Method: Thirty older listeners with mild to moderately-severe sloping sensorineural hearing loss…

  8. A Cross-Language Study of Acoustic Predictors of Speech Intelligibility in Individuals with Parkinson's Disease

    ERIC Educational Resources Information Center

    Kim, Yunjung; Choi, Yaelin

    2017-01-01

    Purpose: The present study aimed to compare acoustic models of speech intelligibility in individuals with the same disease (Parkinson's disease [PD]) and presumably similar underlying neuropathologies but with different native languages (American English [AE] and Korean). Method: A total of 48 speakers from the 4 speaker groups (AE speakers with…

  9. Effects of Additional Low-Pass-Filtered Speech on Listening Effort for Noise-Band-Vocoded Speech in Quiet and in Noise.

    PubMed

    Pals, Carina; Sarampalis, Anastasios; van Dijk, Mart; Başkent, Deniz

    2018-05-11

    Residual acoustic hearing in electric-acoustic stimulation (EAS) can benefit cochlear implant (CI) users through increased sound quality, speech intelligibility, and improved tolerance to noise. The goal of this study was to investigate whether the low-pass-filtered acoustic speech in simulated EAS can provide the additional benefit of reducing listening effort for the spectrotemporally degraded signal of noise-band-vocoded speech. Listening effort was investigated using a dual-task paradigm as a behavioral measure, and the NASA Task Load indeX as a subjective self-report measure. The primary task of the dual-task paradigm was identification of sentences presented in three experiments at three fixed intelligibility levels: near-ceiling, 50%, and 79% intelligibility, achieved by manipulating the presence and level of speech-shaped noise in the background. Listening effort for the primary intelligibility task was reflected in the performance on the secondary, visual response time task. Experimental speech processing conditions included monaural or binaural vocoder, with added low-pass-filtered speech (to simulate EAS) or without (to simulate CI). In Experiment 1, in quiet with intelligibility near ceiling, additional low-pass-filtered speech reduced listening effort compared with binaural vocoder, in line with our expectations, although not compared with monaural vocoder. In Experiments 2 and 3, for speech in noise, added low-pass-filtered speech allowed the desired intelligibility levels to be reached at less favorable speech-to-noise ratios, as expected. Interestingly, this came without the cost of increased listening effort usually associated with poor speech-to-noise ratios; at 50% intelligibility, a reduction in listening effort on top of the increased tolerance to noise was even observed. The NASA Task Load indeX did not capture these differences. The dual-task results provide partial evidence for a potential decrease in listening effort as a result of…

  10. Developing the Alphabetic Principle to Aid Text-Based Augmentative and Alternative Communication Use by Adults With Low Speech Intelligibility and Intellectual Disabilities.

    PubMed

    Schmidt-Naylor, Anna C; Saunders, Kathryn J; Brady, Nancy C

    2017-05-17

    We explored alphabet supplementation as an augmentative and alternative communication strategy for adults with minimal literacy. Study 1's goal was to teach onset-letter selection with spoken words and assess generalization to untaught words, demonstrating the alphabetic principle. Study 2 incorporated alphabet supplementation within a naming task and then assessed effects on speech intelligibility. Three men with intellectual disabilities (ID) and low speech intelligibility participated. Study 1 used a multiple-probe design, across three 20-word sets, to show that our computer-based training improved onset-letter selection. We also probed generalization to untrained words. Study 2 taught onset-letter selection for 30 new words chosen for functionality. Five listeners transcribed speech samples of the 30 words in 2 conditions: speech only and speech with alphabet supplementation. Across studies 1 and 2, participants demonstrated onset-letter selection for at least 90 words. Study 1 showed evidence of the alphabetic principle for some but not all word sets. In study 2, participants readily used alphabet supplementation, enabling listeners to understand twice as many words. This is the first demonstration of alphabet supplementation in individuals with ID and minimal literacy. The large number of words learned holds promise both for improving communication and providing a foundation for improved literacy.

  11. Developing the Alphabetic Principle to Aid Text-Based Augmentative and Alternative Communication Use by Adults With Low Speech Intelligibility and Intellectual Disabilities

    PubMed Central

    Schmidt-Naylor, Anna C.; Brady, Nancy C.

    2017-01-01

    Purpose We explored alphabet supplementation as an augmentative and alternative communication strategy for adults with minimal literacy. Study 1's goal was to teach onset-letter selection with spoken words and assess generalization to untaught words, demonstrating the alphabetic principle. Study 2 incorporated alphabet supplementation within a naming task and then assessed effects on speech intelligibility. Method Three men with intellectual disabilities (ID) and low speech intelligibility participated. Study 1 used a multiple-probe design, across three 20-word sets, to show that our computer-based training improved onset-letter selection. We also probed generalization to untrained words. Study 2 taught onset-letter selection for 30 new words chosen for functionality. Five listeners transcribed speech samples of the 30 words in 2 conditions: speech only and speech with alphabet supplementation. Results Across studies 1 and 2, participants demonstrated onset-letter selection for at least 90 words. Study 1 showed evidence of the alphabetic principle for some but not all word sets. In study 2, participants readily used alphabet supplementation, enabling listeners to understand twice as many words. Conclusions This is the first demonstration of alphabet supplementation in individuals with ID and minimal literacy. The large number of words learned holds promise both for improving communication and providing a foundation for improved literacy. PMID:28474087

  12. Use of a Deep Recurrent Neural Network to Reduce Wind Noise: Effects on Judged Speech Intelligibility and Sound Quality

    PubMed Central

    Keshavarzi, Mahmoud; Goehring, Tobias; Zakis, Justin; Turner, Richard E.; Moore, Brian C. J.

    2018-01-01

    Despite great advances in hearing-aid technology, users still experience problems with noise in windy environments. The potential benefits of using a deep recurrent neural network (RNN) for reducing wind noise were assessed. The RNN was trained using recordings of the output of the two microphones of a behind-the-ear hearing aid in response to male and female speech at various azimuths in the presence of noise produced by wind from various azimuths with a velocity of 3 m/s, using the “clean” speech as a reference. A paired-comparison procedure was used to compare all possible combinations of three conditions for subjective intelligibility and for sound quality or comfort. The conditions were unprocessed noisy speech, noisy speech processed using the RNN, and noisy speech that was high-pass filtered (which also reduced wind noise). Eighteen native English-speaking participants were tested, nine with normal hearing and nine with mild-to-moderate hearing impairment. Frequency-dependent linear amplification was provided for the latter. Processing using the RNN was significantly preferred over no processing by both subject groups for both subjective intelligibility and sound quality, although the magnitude of the preferences was small. High-pass filtering (HPF) was not significantly preferred over no processing. Although RNN was significantly preferred over HPF only for sound quality for the hearing-impaired participants, for the results as a whole, there was a preference for RNN over HPF. Overall, the results suggest that reduction of wind noise using an RNN is possible and might have beneficial effects when used in hearing aids. PMID:29708061

  13. Use of a Deep Recurrent Neural Network to Reduce Wind Noise: Effects on Judged Speech Intelligibility and Sound Quality.

    PubMed

    Keshavarzi, Mahmoud; Goehring, Tobias; Zakis, Justin; Turner, Richard E; Moore, Brian C J

    2018-01-01

    Despite great advances in hearing-aid technology, users still experience problems with noise in windy environments. The potential benefits of using a deep recurrent neural network (RNN) for reducing wind noise were assessed. The RNN was trained using recordings of the output of the two microphones of a behind-the-ear hearing aid in response to male and female speech at various azimuths in the presence of noise produced by wind from various azimuths with a velocity of 3 m/s, using the "clean" speech as a reference. A paired-comparison procedure was used to compare all possible combinations of three conditions for subjective intelligibility and for sound quality or comfort. The conditions were unprocessed noisy speech, noisy speech processed using the RNN, and noisy speech that was high-pass filtered (which also reduced wind noise). Eighteen native English-speaking participants were tested, nine with normal hearing and nine with mild-to-moderate hearing impairment. Frequency-dependent linear amplification was provided for the latter. Processing using the RNN was significantly preferred over no processing by both subject groups for both subjective intelligibility and sound quality, although the magnitude of the preferences was small. High-pass filtering (HPF) was not significantly preferred over no processing. Although RNN was significantly preferred over HPF only for sound quality for the hearing-impaired participants, for the results as a whole, there was a preference for RNN over HPF. Overall, the results suggest that reduction of wind noise using an RNN is possible and might have beneficial effects when used in hearing aids.
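
    Neither abstract specifies the network architecture, so the following is only a generic sketch of how recurrent mask-based denoisers of this kind are commonly built, not the authors' network: an LSTM consumes spectral features from the two microphones and outputs a per-frequency gain mask, trained against targets derived from the "clean" speech reference. All sizes below are illustrative.

    ```python
    import torch
    import torch.nn as nn

    class MaskRNN(nn.Module):
        """Recurrent mask estimator: two-microphone features in, gain mask out."""
        def __init__(self, n_freq=129, hidden=128):
            super().__init__()
            self.rnn = nn.LSTM(input_size=2 * n_freq, hidden_size=hidden,
                               num_layers=2, batch_first=True)
            self.out = nn.Linear(hidden, n_freq)

        def forward(self, x):                  # x: (batch, frames, 2*n_freq)
            h, _ = self.rnn(x)
            return torch.sigmoid(self.out(h))  # per-frequency gain in [0, 1]

    net = MaskRNN()
    noisy = torch.randn(1, 100, 2 * 129)       # 100 frames of stacked features
    mask = net(noisy)                          # training target would be the
    print(mask.shape)                          # ideal mask from clean speech
    ```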

  14. Fundamental frequency discrimination and speech perception in noise in cochlear implant simulations

    PubMed Central

    Carroll, Jeff; Zeng, Fan-Gang

    2007-01-01

    Increasing the number of channels at low frequencies improves discrimination of fundamental frequency (F0) in cochlear implants [Geurts and Wouters 2004]. We conducted three experiments to test whether improved F0 discrimination can be translated into increased speech intelligibility in noise in a cochlear implant simulation. The first experiment measured F0 discrimination and speech intelligibility in quiet as a function of channel density over different frequency regions. The results from this experiment showed a tradeoff in performance between F0 discrimination and speech intelligibility with a limited number of channels. The second experiment tested whether improved F0 discrimination and optimizing this tradeoff could improve speech performance with a competing talker. However, improved F0 discrimination did not improve speech intelligibility in noise. The third experiment identified the critical number of channels needed at low frequencies to improve speech intelligibility in noise. The result showed that, while 16 channels below 500 Hz were needed to observe any improvement in speech intelligibility in noise, even 32 channels did not achieve normal performance. Theoretically, these results suggest that without accurate spectral coding, F0 discrimination and speech perception in noise are two independent processes. Practically, the present results illustrate the need to increase the number of independent channels in cochlear implants. PMID:17604581
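
    For readers unfamiliar with vocoder simulations of CI listening: the canonical noise-band vocoder splits speech into analysis bands, extracts each band's temporal envelope, and re-imposes the envelopes on band-limited noise carriers. A minimal sketch follows; the band edges, filter orders, and envelope cutoff are conventional illustrative choices, not the study's exact parameters.

    ```python
    import numpy as np
    from scipy.signal import butter, sosfilt

    def bandpass(x, lo, hi, fs, order=4):
        sos = butter(order, [lo, hi], btype="band", fs=fs, output="sos")
        return sosfilt(sos, x)

    def envelope(x, fs, cutoff=160.0):
        """Envelope via rectification followed by low-pass filtering."""
        sos = butter(2, cutoff, btype="low", fs=fs, output="sos")
        return np.maximum(sosfilt(sos, np.abs(x)), 0.0)

    def noise_vocoder(speech, fs, edges):
        """Modulate band-limited noise carriers with per-band envelopes."""
        out = np.zeros_like(speech)
        rng = np.random.default_rng(0)
        for lo, hi in zip(edges[:-1], edges[1:]):
            env = envelope(bandpass(speech, lo, hi, fs), fs)
            carrier = bandpass(rng.standard_normal(len(speech)), lo, hi, fs)
            out += env * carrier
        return out

    fs = 16000
    t = np.arange(fs) / fs
    speech = np.sin(2 * np.pi * 220 * t) * (1 + np.sin(2 * np.pi * 3 * t))  # toy input
    edges = np.geomspace(100, 7000, 9)       # 8 logarithmically spaced channels
    vocoded = noise_vocoder(speech, fs, edges)
    ```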

  15. Neurogenic Orofacial Weakness and Speech in Adults With Dysarthria

    PubMed Central

    Makashay, Matthew J.; Helou, Leah B.; Clark, Heather M.

    2017-01-01

    Purpose This study compared orofacial strength between adults with dysarthria and neurologically normal (NN) matched controls. In addition, orofacial muscle weakness was examined for potential relationships to speech impairments in adults with dysarthria. Method Matched groups of 55 adults with dysarthria and 55 NN adults generated maximum pressure (Pmax) against an air-filled bulb during lingual elevation, protrusion and lateralization, and buccodental and labial compressions. These orofacial strength measures were compared with speech intelligibility, perceptual ratings of speech, articulation rate, and fast syllable-repetition rate. Results The dysarthria group demonstrated significantly lower orofacial strength than the NN group on all tasks. Lingual strength correlated moderately and buccal strength correlated weakly with most ratings of speech deficits. Speech intelligibility was not sensitive to dysarthria severity. Individuals with severely reduced anterior lingual elevation Pmax (< 18 kPa) had normal to profoundly impaired sentence intelligibility (99%–6%) and moderately to severely impaired speech (26%–94% articulatory imprecision; 33%–94% overall severity). Conclusions Results support the presence of orofacial muscle weakness in adults with dysarthrias of varying etiologies but reinforce tenuous links between orofacial strength and speech production disorders. By examining individual data, preliminary evidence emerges to suggest that speech, but not necessarily intelligibility, is likely to be impaired when lingual weakness is severe. PMID:28763804

  16. Reduced efficiency of audiovisual integration for nonnative speech.

    PubMed

    Yi, Han-Gyol; Phelps, Jasmine E B; Smiljanic, Rajka; Chandrasekaran, Bharath

    2013-11-01

    The role of visual cues in native listeners' perception of speech produced by nonnative speakers has not been extensively studied. Native perception of English sentences produced by native English and Korean speakers in audio-only and audiovisual conditions was examined. Korean speakers were rated as more accented in audiovisual than in the audio-only condition. Visual cues enhanced word intelligibility for native English speech but less so for Korean-accented speech. Reduced intelligibility of Korean-accented audiovisual speech was associated with implicit visual biases, suggesting that listener-related factors partially influence the efficiency of audiovisual integration for nonnative speech perception.

  17. Female voice communications in high level aircraft cockpit noises--part II: vocoder and automatic speech recognition systems.

    PubMed

    Nixon, C; Anderson, T; Morris, L; McCavitt, A; McKinley, R; Yeager, D; McDaniel, M

    1998-11-01

    The intelligibility of female and male speech is equivalent under most ordinary living conditions. However, due to small differences between their acoustic speech signals, called speech spectra, one can be more or less intelligible than the other in certain situations, such as high levels of noise. Anecdotal information, supported by some empirical observations, suggests that some of the high-intensity noise spectra of military aircraft cockpits may degrade the intelligibility of female speech more than that of male speech. In an applied research study, the intelligibility of female and male speech was measured in several high-level aircraft cockpit noise conditions experienced in military aviation. In Part I (Nixon CW, et al. Aviat Space Environ Med 1998; 69:675-83), female speech intelligibility measured in the spectra and levels of aircraft cockpit noises and with noise-canceling microphones was lower than that of male speech in all conditions. However, the differences were small and only those at some of the highest noise levels were significant. Although speech intelligibility of both genders was acceptable during normal cruise noises, improvements are required in most of the highest levels of noise created during maximum aircraft operating conditions. These results are discussed in the Part I technical report. This Part II report examines the intelligibility in the same aircraft cockpit noises of vocoded female and male speech and the accuracy with which female and male speech in some of the cockpit noises was understood by automatic speech recognition systems. The intelligibility of vocoded female speech was generally the same as that of vocoded male speech. No significant differences were measured between the recognition accuracy of male and female speech by the automatic speech recognition systems. The intelligibility of female and male speech was equivalent for these conditions.

  18. Suppressed Alpha Oscillations Predict Intelligibility of Speech and its Acoustic Details

    PubMed Central

    Weisz, Nathan

    2012-01-01

    Modulations of human alpha oscillations (8–13 Hz) accompany many cognitive processes, but their functional role in auditory perception has proven elusive: Do oscillatory dynamics of alpha reflect acoustic details of the speech signal and are they indicative of comprehension success? Acoustically presented words were degraded in acoustic envelope and spectrum in an orthogonal design, and electroencephalogram responses in the frequency domain were analyzed in 24 participants, who rated word comprehensibility after each trial. First, the alpha power suppression during and after a degraded word depended monotonically on spectral and, to a lesser extent, envelope detail. The magnitude of this alpha suppression exhibited an additional and independent influence on later comprehension ratings. Second, source localization of alpha suppression yielded superior parietal, prefrontal, as well as anterior temporal brain areas. Third, multivariate classification of the time–frequency pattern across participants showed that patterns of late posterior alpha power allowed best for above-chance classification of word intelligibility. Results suggest that both magnitude and topography of late alpha suppression in response to single words can indicate a listener's sensitivity to acoustic features and the ability to comprehend speech under adverse listening conditions. PMID:22100354

  19. Objective speech quality evaluation of real-time speech coders

    NASA Astrophysics Data System (ADS)

    Viswanathan, V. R.; Russell, W. H.; Huggins, A. W. F.

    1984-02-01

    This report describes the work performed in two areas: subjective testing of a real-time 16 kbit/s adaptive predictive coder (APC) and objective speech quality evaluation of real-time coders. The speech intelligibility of the APC coder was tested using the Diagnostic Rhyme Test (DRT), and the speech quality was tested using the Diagnostic Acceptability Measure (DAM) test, under eight operating conditions involving channel error, acoustic background noise, and a tandem link with two other coders. The test results showed that the DRT and DAM scores of the APC coder equalled or exceeded the corresponding test scores of the 32 kbit/s CVSD coder. In the area of objective speech quality evaluation, the report describes the development, testing, and validation of a procedure for automatically computing several objective speech quality measures, given only the tape recordings of the input speech and the corresponding output speech of a real-time speech coder.
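
    The report's objective measures are not itemized in this abstract; segmental SNR is one classic member of that family and illustrates the kind of input-versus-coded-output comparison being automated. A hedged sketch (the frame length and clipping limits are conventional values, not necessarily the report's):

    ```python
    import numpy as np

    def segmental_snr(clean, coded, fs, frame_ms=20, floor_db=-10, ceil_db=35):
        """Mean per-frame SNR (dB) between input and coded output; assumes
        the two signals are already time-aligned (alignment omitted here)."""
        n = int(fs * frame_ms / 1000)
        snrs = []
        for i in range(0, len(clean) - n, n):
            s = clean[i:i + n]
            e = s - coded[i:i + n]
            num, den = np.sum(s ** 2), np.sum(e ** 2)
            if num > 0 and den > 0:
                snrs.append(np.clip(10 * np.log10(num / den), floor_db, ceil_db))
        return float(np.mean(snrs))

    fs = 8000
    rng = np.random.default_rng(0)
    clean = rng.standard_normal(4 * fs)
    coded = clean + 0.1 * rng.standard_normal(4 * fs)   # toy coding distortion
    print(f"{segmental_snr(clean, coded, fs):.1f} dB")
    ```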

  20. Hybridizing Conversational and Clear Speech to Investigate the Source of Increased Intelligibility in Speakers with Parkinson's Disease

    ERIC Educational Resources Information Center

    Tjaden, Kris; Kain, Alexander; Lam, Jennifer

    2014-01-01

    Purpose: A speech analysis-resynthesis paradigm was used to investigate segmental and suprasegmental acoustic variables explaining intelligibility variation for 2 speakers with Parkinson's disease (PD). Method: Sentences were read in conversational and clear styles. Acoustic characteristics from clear sentences were extracted and applied to…

  1. Joint Service Aircrew Mask (JSAM) - Tactical Aircraft (TA) A/P22P-14A Respirator Assembly (V)5: Speech Intelligibility Performance with Double Hearing Protection, HGU-84/P Flight Helmet

    DTIC Science & Technology

    2017-04-06

    …Sound Pressure Level (SPL) background pink noise. The speech intelligibility tests shall result in a Modified Rhyme Test (MRT) score as listed below… Speech intelligibility testing shall be measured per ANSI S3.2 for each background pink noise level using a minimum of ten talkers and ten listeners. The test shall be conducted wearing the JSAM-TA using appropriate communication…

  2. Infant Perception of Atypical Speech Signals

    ERIC Educational Resources Information Center

    Vouloumanos, Athena; Gelfand, Hanna M.

    2013-01-01

    The ability to decode atypical and degraded speech signals as intelligible is a hallmark of speech perception. Human adults can perceive sounds as speech even when they are generated by a variety of nonhuman sources including computers and parrots. We examined how infants perceive the speech-like vocalizations of a parrot. Further, we examined how…

  3. On the importance of early reflections for speech in rooms.

    PubMed

    Bradley, J S; Sato, H; Picard, M

    2003-06-01

    This paper presents the results of new studies based on speech intelligibility tests in simulated sound fields and analyses of impulse response measurements in rooms used for speech communication. The speech intelligibility test results confirm the importance of early reflections for achieving good conditions for speech in rooms. The addition of early reflections increased the effective signal-to-noise ratio and related speech intelligibility scores for both impaired and nonimpaired listeners. The new results also show that for common conditions where the direct sound is reduced, it is only possible to understand speech because of the presence of early reflections. Analyses of measured impulse responses in rooms intended for speech show that early reflections can increase the effective signal-to-noise ratio by up to 9 dB. A room acoustics computer model is used to demonstrate that the relative importance of early reflections can be influenced by the room acoustics design.
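
    The early-energy benefit described here is commonly quantified from a measured impulse response with an early-to-late energy ratio such as C50, which treats energy arriving within 50 ms of the direct sound as useful for speech. An illustrative computation (the toy exponential-decay impulse response is synthetic):

    ```python
    import numpy as np

    def c50(h, fs, split_ms=50.0):
        """Early-to-late energy ratio (dB) of impulse response h."""
        onset = int(np.argmax(np.abs(h)))            # direct-sound arrival
        k = onset + int(fs * split_ms / 1000.0)
        early = np.sum(h[onset:k] ** 2)
        late = np.sum(h[k:] ** 2)
        return 10.0 * np.log10(early / late)

    fs = 48000
    t = np.arange(int(0.5 * fs)) / fs
    # Toy room response: exponentially decaying noise (~0.6 s decay).
    h = np.exp(-6.9 * t / 0.6) * np.random.default_rng(0).standard_normal(len(t))
    print(f"C50 = {c50(h, fs):.1f} dB")
    ```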

  4. Neural and Behavioral Mechanisms of Clear Speech

    ERIC Educational Resources Information Center

    Luque, Jenna Silver

    2017-01-01

    Clear speech is a speaking style that has been shown to improve intelligibility in adverse listening conditions, for various listener and talker populations. Clear-speech phonetic enhancements include a slowed speech rate, expanded vowel space, and expanded pitch range. Although clear-speech phonetic enhancements have been demonstrated across a…

  5. Preliminary evaluation of synthetic speech

    DOT National Transportation Integrated Search

    1972-08-01

    The report briefly discusses the methods for storing and generating synthetic speech and a preliminary evaluation of the intelligibility of a speech synthesizer having a 75-word vocabulary selected for air traffic control messages. A program is sugge...

  6. Teachers' perceptions of students with speech sound disorders: a quantitative and qualitative analysis.

    PubMed

    Overby, Megan; Carrell, Thomas; Bernthal, John

    2007-10-01

    This study examined 2nd-grade teachers' perceptions of the academic, social, and behavioral competence of students with speech sound disorders (SSDs). Forty-eight 2nd-grade teachers listened to 2 groups of sentences differing by intelligibility and pitch but spoken by a single 2nd grader. For each sentence group, teachers rated the speaker's academic, social, and behavioral competence using an adapted version of the Teacher Rating Scale of the Self-Perception Profile for Children (S. Harter, 1985) and completed 3 open-ended questions. The matched-guise design controlled for confounding speaker and stimuli variables that were inherent in prior studies. Statistically significant differences in teachers' expectations of children's academic, social, and behavioral performances were found between moderately intelligible and normal intelligibility speech. Teachers associated moderately intelligible low-pitched speech with more behavior problems than moderately intelligible high-pitched speech or either pitch with normal intelligibility. One third of the teachers reported that they could not accurately predict a child's school performance based on the child's speech skills, one third of the teachers causally related school difficulty to SSD, and one third of the teachers made no comment. Intelligibility and speaker pitch appear to be speech variables that influence teachers' perceptions of children's school performance.

  7. The prediction of speech intelligibility in classrooms using computer models

    NASA Astrophysics Data System (ADS)

    Dance, Stephen; Dentoni, Roger

    2005-04-01

    Two classrooms were measured and modeled using the industry-standard CATT model and the Web model CISM. Sound levels, reverberation times and speech intelligibility were predicted in these rooms using data for 7 octave bands. It was found that both models predicted overall sound levels to within 2 dB. Overall reverberation time was accurately predicted by CATT (14% prediction error) but not by CISM (41% prediction error); this compares with a 30% prediction error using classical theory. As for STI, CATT predicted within 11%, CISM within 3% and Sabine within 28% of the measured value. It should be noted that CISM took approximately 15 seconds to calculate, while CATT took 15 minutes. CISM is freely available online at www.whyverne.co.uk/acoustics/Pages/cism/cism.html
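
    The "classical theory" baseline referred to here is Sabine's reverberation formula, RT60 = 0.161 V / A, with V the room volume in cubic metres and A the total absorption in metric sabins. A minimal illustration (the room dimensions and absorption coefficients are invented, not the paper's classrooms):

    ```python
    def sabine_rt60(volume_m3, surfaces):
        """surfaces: list of (area_m2, absorption_coefficient) pairs."""
        A = sum(area * alpha for area, alpha in surfaces)  # metric sabins
        return 0.161 * volume_m3 / A

    classroom = [(50.0, 0.05),   # floor
                 (50.0, 0.15),   # acoustic-tile ceiling
                 (90.0, 0.03)]   # walls, windows, door
    print(f"RT60 = {sabine_rt60(150.0, classroom):.2f} s")
    ```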

  8. Audiovisual Asynchrony Detection in Human Speech

    ERIC Educational Resources Information Center

    Maier, Joost X.; Di Luca, Massimiliano; Noppeney, Uta

    2011-01-01

    Combining information from the visual and auditory senses can greatly enhance intelligibility of natural speech. Integration of audiovisual speech signals is robust even when temporal offsets are present between the component signals. In the present study, we characterized the temporal integration window for speech and nonspeech stimuli with…

  9. Evaluation of Speech Intelligibility and Sound Localization Abilities with Hearing Aids Using Binaural Wireless Technology.

    PubMed

    Ibrahim, Iman; Parsa, Vijay; Macpherson, Ewan; Cheesman, Margaret

    2013-01-02

    Wireless synchronization of the digital signal processing (DSP) features between two hearing aids in a bilateral hearing aid fitting is a fairly new technology. This technology is expected to preserve the differences in time and intensity between the two ears by co-ordinating the bilateral DSP features such as multichannel compression, noise reduction, and adaptive directionality. The purpose of this study was to evaluate the benefits of wireless communication as implemented in two commercially available hearing aids. More specifically, this study measured speech intelligibility and sound localization abilities of normal hearing and hearing impaired listeners using bilateral hearing aids with wireless synchronization of multichannel Wide Dynamic Range Compression (WDRC). Twenty subjects participated; 8 had normal hearing and 12 had bilaterally symmetrical sensorineural hearing loss. Each individual completed the Hearing in Noise Test (HINT) and a sound localization test with two types of stimuli. No specific benefit from wireless WDRC synchronization was observed for the HINT; however, hearing impaired listeners had better localization with the wireless synchronization. Binaural wireless technology in hearing aids may improve localization abilities although the possible effect appears to be small at the initial fitting. With adaptation, the hearing aids with synchronized signal processing may lead to an improvement in localization and speech intelligibility. Further research is required to demonstrate the effect of adaptation to the hearing aids with synchronized signal processing on different aspects of auditory performance.

  10. A music perception disorder (congenital amusia) influences speech comprehension.

    PubMed

    Liu, Fang; Jiang, Cunmei; Wang, Bei; Xu, Yi; Patel, Aniruddh D

    2015-01-01

    This study investigated the underlying link between speech and music by examining whether and to what extent congenital amusia, a musical disorder characterized by degraded pitch processing, would impact spoken sentence comprehension for speakers of Mandarin, a tone language. Sixteen Mandarin-speaking amusics and 16 matched controls were tested on the intelligibility of news-like Mandarin sentences with natural and flat fundamental frequency (F0) contours (created via speech resynthesis) under four signal-to-noise ratio (SNR) conditions (no noise, +5, 0, and -5 dB SNR). While speech intelligibility in quiet and extremely noisy conditions (SNR = -5 dB) was not significantly compromised by flattened F0, both amusic and control groups achieved better performance with natural-F0 sentences than flat-F0 sentences under moderately noisy conditions (SNR = +5 and 0 dB). Relative to normal listeners, amusics demonstrated reduced speech intelligibility in both quiet and noise, regardless of whether the F0 contours of the sentences were natural or flattened. This deficit in speech intelligibility was not associated with impaired pitch perception in amusia. These findings provide evidence for impaired speech comprehension in congenital amusia, suggesting that the deficit of amusics extends beyond pitch processing and includes segmental processing. Copyright © 2014 Elsevier Ltd. All rights reserved.

  11. Measures for assessing architectural speech security (privacy) of closed offices and meeting rooms.

    PubMed

    Gover, Bradford N; Bradley, John S

    2004-12-01

    Objective measures were investigated as predictors of the speech security of closed offices and rooms. A new signal-to-noise type measure is shown to be a superior indicator for security than existing measures such as the Articulation Index, the Speech Intelligibility Index, the ratio of the loudness of speech to that of noise, and the A-weighted level difference of speech and noise. This new measure is a weighted sum of clipped one-third-octave-band signal-to-noise ratios; various weightings and clipping levels are explored. Listening tests had 19 subjects rate the audibility and intelligibility of 500 English sentences, filtered to simulate transmission through various wall constructions, and presented along with background noise. The results of the tests indicate that the new measure is highly correlated with sentence intelligibility scores and also with three security thresholds: the threshold of intelligibility (below which speech is unintelligible), the threshold of cadence (below which the cadence of speech is inaudible), and the threshold of audibility (below which speech is inaudible). The ratio of the loudness of speech to that of noise, and simple A-weighted level differences are both shown to be well correlated with these latter two thresholds (cadence and audibility), but not well correlated with intelligibility.
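
    A hedged sketch of the general form the abstract describes, a weighted sum of clipped one-third-octave-band speech-to-noise level differences; the weights and clipping limits below are placeholders, since the paper explores several variants.

    ```python
    import numpy as np

    def security_measure(speech_db, noise_db, weights,
                         clip_lo=-32.0, clip_hi=0.0):
        """Weighted mean of clipped band SNRs; very audible or completely
        inaudible bands saturate at the clipping limits (values illustrative)."""
        snr = np.clip(np.asarray(speech_db) - np.asarray(noise_db),
                      clip_lo, clip_hi)
        w = np.asarray(weights, dtype=float)
        return float(np.sum(w * snr) / np.sum(w))

    # Transmitted-speech vs. background-noise levels in a few bands (dB).
    speech = [30, 28, 25, 20, 15]
    noise = [33, 34, 33, 32, 30]
    print(f"{security_measure(speech, noise, [1, 2, 3, 2, 1]):.1f} dB")
    ```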

  12. Intelligent interfaces for expert systems

    NASA Technical Reports Server (NTRS)

    Villarreal, James A.; Wang, Lui

    1988-01-01

    Vital to the success of an expert system is an interface to the user which performs intelligently. A generic intelligent interface is being developed for expert systems. This intelligent interface was developed around the in-house developed Expert System for the Flight Analysis System (ESFAS). The Flight Analysis System (FAS) is comprised of 84 configuration-controlled FORTRAN subroutines that are used in the preflight analysis of the space shuttle. In order to use FAS proficiently, a person must be knowledgeable in flight mechanics and the procedures involved in deploying a certain payload, and must have an overall understanding of the FAS. ESFAS, still in its developmental stage, is taking much of this knowledge into account. The generic intelligent interface involves the integration of a speech recognizer and synthesizer, a preparser, and a natural language parser with ESFAS. The speech recognizer being used is capable of recognizing 1000 words of connected speech. The natural language parser is a commercial software package which uses caseframe instantiation in processing the streams of words from the speech recognizer or the keyboard. The system's configuration is described along with its capabilities and drawbacks.

  13. Perceptual Learning of Interrupted Speech

    PubMed Central

    Benard, Michel Ruben; Başkent, Deniz

    2013-01-01

    The intelligibility of periodically interrupted speech improves once the silent gaps are filled with noise bursts. This improvement has been attributed to phonemic restoration, a top-down repair mechanism that helps intelligibility of degraded speech in daily life. Two hypotheses were investigated using perceptual learning of interrupted speech. If different cognitive processes played a role in restoring interrupted speech with and without filler noise, the two forms of speech would be learned at different rates and with different perceived mental effort. If the restoration benefit were an artificial outcome of using the ecologically invalid stimulus of speech with silent gaps, this benefit would diminish with training. Two groups of normal-hearing listeners were trained, one with interrupted sentences with the filler noise, and the other without. Feedback was provided with the auditory playback of the unprocessed and processed sentences, as well as the visual display of the sentence text. Training increased the overall performance significantly; however, the restoration benefit did not diminish. The increase in intelligibility and the decrease in perceived mental effort were relatively similar between the groups, implying similar cognitive mechanisms for the restoration of the two types of interruptions. Training effects were generalizable, as both groups also improved their performance with the form of speech they had not been trained on, and the effects were retained. Due to the null results and the relatively small number of participants (10 per group), further research is needed to draw conclusions more confidently. Nevertheless, training with interrupted speech seems to be effective, stimulating participants to use top-down restoration more actively and efficiently. This finding further implies the potential of this training approach as a rehabilitative tool for hearing-impaired/elderly populations. PMID:23469266

  14. Analysis of a simplified normalized covariance measure based on binary weighting functions for predicting the intelligibility of noise-suppressed speech.

    PubMed

    Chen, Fei; Loizou, Philipos C

    2010-12-01

    The normalized covariance measure (NCM) has been shown previously to predict reliably the intelligibility of noise-suppressed speech containing non-linear distortions. This study analyzes a simplified NCM measure that requires only a small number of bands (not necessarily contiguous) and uses simple binary (1 or 0) weighting functions. The rationale behind the use of a small number of bands is to account for the fact that the spectral information contained in contiguous or nearby bands is correlated and redundant. The modified NCM measure was evaluated with speech intelligibility scores obtained by normal-hearing listeners in 72 noisy conditions involving noise-suppressed speech corrupted by four different types of maskers (car, babble, train, and street interferences). High correlation (r = 0.8) was obtained with the modified NCM measure even when only one band was used. Further analysis revealed a masker-specific pattern of correlations when only one band was used; bands with low correlation signified envelopes that had been severely distorted by the noise-suppression algorithm and/or the masker. Correlation improved to r = 0.84 when only two disjoint bands (centered at 325 and 1874 Hz) were used. Even further improvement in correlation (r = 0.85) was obtained when three or four lower-frequency (<700 Hz) bands were selected.
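
    Based on the NCM literature this abstract builds on, the measure correlates band envelopes of clean and processed speech, converts each correlation to an apparent SNR, clips it, and averages the resulting transmission indices with band weights; the simplified variant uses binary weights over a few bands. A sketch under those assumptions (envelope extraction omitted; data synthetic):

    ```python
    import numpy as np

    def ncm(clean_envs, proc_envs, weights, clip_db=(-15.0, 15.0)):
        """clean_envs, proc_envs: (n_bands, n_frames) envelope arrays.
        Binary weights reproduce the simplified variant studied here."""
        tis = []
        for c, p in zip(clean_envs, proc_envs):
            r = np.corrcoef(c, p)[0, 1]
            r2 = min(max(r, 1e-6) ** 2, 0.999)          # guard the log
            snr = np.clip(10 * np.log10(r2 / (1 - r2)), *clip_db)
            tis.append((snr - clip_db[0]) / (clip_db[1] - clip_db[0]))
        w = np.asarray(weights, dtype=float)
        return float(np.sum(w * np.asarray(tis)) / np.sum(w))

    rng = np.random.default_rng(0)
    clean = rng.random((4, 200))             # 4 bands, 200 envelope frames
    proc = clean + 0.3 * rng.random((4, 200))
    print(f"NCM = {ncm(clean, proc, weights=[1, 0, 1, 1]):.2f}")  # binary weights
    ```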

  15. An Ecosystem of Intelligent ICT Tools for Speech-Language Therapy Based on a Formal Knowledge Model.

    PubMed

    Robles-Bykbaev, Vladimir; López-Nores, Martín; Pazos-Arias, José; Quisi-Peralta, Diego; García-Duque, Jorge

    2015-01-01

    Language and communication constitute the mainstays of the development of several intellectual and cognitive skills in humans. However, millions of people around the world suffer from disabilities and disorders related to language and communication, while most countries lack corresponding health-care and rehabilitation services. On these grounds, we are working to develop an ecosystem of intelligent ICT tools to support speech and language pathologists, doctors, students, patients and their relatives. This ecosystem has several layers and components, integrating Electronic Health Records management, standardized vocabularies, a knowledge database, an ontology of concepts from the speech-language domain, and an expert system. We discuss the advantages of such an approach through experiments carried out in several institutions assisting children with a wide spectrum of disabilities.

  16. Acoustics of Clear Speech: Effect of Instruction

    ERIC Educational Resources Information Center

    Lam, Jennifer; Tjaden, Kris; Wilding, Greg

    2012-01-01

    Purpose: This study investigated how different instructions for eliciting clear speech affected selected acoustic measures of speech. Method: Twelve speakers were audio-recorded reading 18 different sentences from the Assessment of Intelligibility of Dysarthric Speech (Yorkston & Beukelman, 1984). Sentences were produced in habitual, clear,…

  17. A comparative intelligibility study of single-microphone noise reduction algorithms.

    PubMed

    Hu, Yi; Loizou, Philipos C

    2007-09-01

    The evaluation of the intelligibility of noise reduction algorithms is reported. IEEE sentences and consonants were corrupted by four types of noise (babble, car, street, and train) at two signal-to-noise ratio levels (0 and 5 dB) and then processed by eight speech enhancement methods encompassing four classes of algorithms: spectral subtractive, subspace, statistical-model-based, and Wiener-type algorithms. The enhanced speech was presented to normal-hearing listeners for identification. With the exception of a single noise condition, no algorithm produced significant improvements in speech intelligibility. Information transmission analysis of the consonant confusion matrices indicated that no algorithm significantly improved the place feature score, which is critically important for speech recognition. The algorithms found in previous studies to perform best in terms of overall quality were not the same algorithms that performed best in terms of speech intelligibility. The subspace algorithm, for instance, was previously found to perform the worst in terms of overall quality, but performed well in the present study in terms of preserving speech intelligibility. Overall, the analysis of consonant confusion matrices suggests that in order for noise reduction algorithms to improve speech intelligibility, they need to improve the place and manner feature scores.

  18. Between-Word Simplification Patterns in the Continuous Speech of Children with Speech Sound Disorders

    ERIC Educational Resources Information Center

    Klein, Harriet B.; Liu-Shea, May

    2009-01-01

    Purpose: This study was designed to identify and describe between-word simplification patterns in the continuous speech of children with speech sound disorders. It was hypothesized that word combinations would reveal phonological changes that were unobserved with single words, possibly accounting for discrepancies between the intelligibility of…

  19. Dual-microphone and binaural noise reduction techniques for improved speech intelligibility by hearing aid users

    NASA Astrophysics Data System (ADS)

    Yousefian Jazi, Nima

    Spatial filtering and directional discrimination have been shown to be an effective pre-processing approach for noise reduction in microphone array systems. In dual-microphone hearing aids, fixed and adaptive beamforming techniques are the most common solutions for enhancing the desired speech and rejecting unwanted signals captured by the microphones. In fact, beamformers are widely utilized in systems where the spatial properties of the target source (usually in front of the listener) are assumed to be known. In this dissertation, dual-microphone coherence-based speech enhancement techniques applicable to hearing aids are proposed. All proposed algorithms operate in the frequency domain and (like traditional beamforming techniques) are based purely on the spatial properties of the desired speech source; they do not require any knowledge of noise statistics for calculating the noise reduction filter. This property gives the algorithms the ability to address adverse noise conditions, such as situations where interfering talkers speak simultaneously with the target speaker. In such cases, adaptive beamformers lose their effectiveness in suppressing interference, since the noise reference channel cannot be built and updated accordingly. This difference is the main advantage of the proposed techniques over traditional adaptive beamformers. Furthermore, since the suggested algorithms are independent of noise estimation, they offer significant improvement in scenarios where the power level of the interfering sources is much higher than that of the target speech. The dissertation also shows that the premise behind the proposed algorithms can be extended and applied to binaural hearing aids. The main purpose of the investigated techniques is to enhance the intelligibility of speech, measured through subjective listening tests with normal-hearing and cochlear implant listeners. However, the improvement in quality of the output speech achieved by the
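
    As a concrete illustration of a coherence-based filter of the kind the dissertation investigates (a sketch, not the author's algorithm), the Python/SciPy fragment below smooths the inter-microphone auto- and cross-spectra over STFT frames and applies the magnitude-squared coherence (MSC) as a per-bin gain. A frontal target arrives coherently at both microphones while diffuse noise and lateral interferers are less coherent, so no noise estimate is needed. The smoothing constant, FFT size, and gain floor are hypothetical.

```python
import numpy as np
from scipy.signal import stft, istft

def coherence_gain_enhance(x_left, x_right, fs, alpha=0.8, floor=0.1):
    """Dual-microphone enhancement using smoothed MSC as a spectral gain."""
    _, _, L = stft(x_left, fs, nperseg=512)
    _, _, R = stft(x_right, fs, nperseg=512)
    n_bins, n_frames = L.shape
    Pll = np.zeros(n_bins)
    Prr = np.zeros(n_bins)
    Plr = np.zeros(n_bins, dtype=complex)
    gain = np.zeros((n_bins, n_frames))
    for i in range(n_frames):
        # recursive (exponential) averaging of auto- and cross-spectra
        Pll = alpha * Pll + (1 - alpha) * np.abs(L[:, i]) ** 2
        Prr = alpha * Prr + (1 - alpha) * np.abs(R[:, i]) ** 2
        Plr = alpha * Plr + (1 - alpha) * L[:, i] * np.conj(R[:, i])
        msc = np.abs(Plr) ** 2 / (Pll * Prr + 1e-12)
        gain[:, i] = np.maximum(msc, floor)
    _, y = istft(gain * (L + R) / 2, fs, nperseg=512)
    return y
```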

  20. The Relationship Between Speech Production and Speech Perception Deficits in Parkinson's Disease.

    PubMed

    De Keyser, Kim; Santens, Patrick; Bockstael, Annelies; Botteldooren, Dick; Talsma, Durk; De Vos, Stefanie; Van Cauwenberghe, Mieke; Verheugen, Femke; Corthals, Paul; De Letter, Miet

    2016-10-01

    This study investigated the possible relationship between hypokinetic speech production and speech intensity perception in patients with Parkinson's disease (PD). Participants included 14 patients with idiopathic PD and 14 matched healthy controls (HCs) with normal hearing and cognition. First, speech production was objectified through a standardized speech intelligibility assessment, acoustic analysis, and speech intensity measurements. Second, an overall estimation task and an intensity estimation task were addressed to evaluate overall speech perception and speech intensity perception, respectively. Finally, correlation analysis was performed between the speech characteristics of the overall estimation task and the corresponding acoustic analysis. The interaction between speech production and speech intensity perception was investigated by an intensity imitation task. Acoustic analysis and speech intensity measurements demonstrated significant differences in speech production between patients with PD and the HCs. A different pattern in the auditory perception of speech and speech intensity was found in the PD group. Auditory perceptual deficits may influence speech production in patients with PD. The present results suggest a disturbed auditory perception related to an automatic monitoring deficit in PD.

  1. Listeners Experience Linguistic Masking Release in Noise-Vocoded Speech-in-Speech Recognition.

    PubMed

    Viswanathan, Navin; Kokkinakis, Kostas; Williams, Brittany T

    2018-02-15

    The purpose of this study was to evaluate whether listeners with normal hearing perceiving noise-vocoded speech-in-speech demonstrate better intelligibility of target speech when the background speech was mismatched in language (linguistic release from masking [LRM]) and/or location (spatial release from masking [SRM]) relative to the target. We also assessed whether the spectral resolution of the noise-vocoded stimuli affected the presence of LRM and SRM under these conditions. In Experiment 1, a mixed factorial design was used to simultaneously manipulate the masker language (within-subject, English vs. Dutch), the simulated masker location (within-subject, right, center, left), and the spectral resolution (between-subjects, 6 vs. 12 channels) of noise-vocoded target-masker combinations presented at +25 dB signal-to-noise ratio (SNR). In Experiment 2, the study was repeated using a spectral resolution of 12 channels at +15 dB SNR. In both experiments, listeners' intelligibility of noise-vocoded targets was better when the background masker was Dutch, demonstrating reliable LRM in all conditions. The pattern of results in Experiment 1 was not reliably different across the 6- and 12-channel noise-vocoded speech. Finally, a reliable spatial benefit (SRM) was detected only in the more challenging SNR condition (Experiment 2). The current study is the first to report a clear LRM benefit in noise-vocoded speech-in-speech recognition. Our results indicate that this benefit is available even under spectrally degraded conditions and that it may augment the benefit due to spatial separation of target speech and competing backgrounds.
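
    Noise vocoding of the kind used in this study can be sketched in a few lines of Python/SciPy: the signal is split into log-spaced bands, each band's envelope is extracted, and the envelope modulates band-limited noise. The channel count corresponds to the 6- vs 12-channel manipulation above; the corner frequencies, filter orders, and envelope cutoff are illustrative assumptions, not the study's exact processing.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(x, fs, n_channels=6, f_lo=80.0, f_hi=6000.0):
    """Minimal noise vocoder: envelope-modulated, band-limited noise."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    lp = butter(4, 30, btype="low", fs=fs, output="sos")   # envelope smoother
    y = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        env = sosfiltfilt(lp, np.abs(hilbert(sosfiltfilt(sos, x))))
        carrier = sosfiltfilt(sos, np.random.randn(len(x)))
        y += np.clip(env, 0, None) * carrier               # re-modulate noise
    return y / (np.max(np.abs(y)) + 1e-12)
```

    Fewer channels (e.g., 6 rather than 12) coarsen the spectral resolution, which is exactly the between-subjects manipulation in Experiment 1.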

  2. Cognitive Functions in Childhood Apraxia of Speech

    ERIC Educational Resources Information Center

    Nijland, Lian; Terband, Hayo; Maassen, Ben

    2015-01-01

    Purpose: Childhood apraxia of speech (CAS) is diagnosed on the basis of specific speech characteristics, in the absence of problems in hearing, intelligence, and language comprehension. This does not preclude the possibility that children with this speech disorder might demonstrate additional problems. Method: Cognitive functions were investigated…

  3. The impact of phonetic dissimilarity on the perception of foreign accented speech

    NASA Astrophysics Data System (ADS)

    Weil, Shawn A.

    2003-10-01

    Non-normative speech (i.e., synthetic speech, pathological speech, foreign accented speech) is more difficult for native listeners to process than is normative speech. Does perceptual dissimilarity affect only intelligibility, or are there other costs to processing? The current series of experiments investigates both the intelligibility and the time course of foreign accented speech (FAS) perception. Native English listeners heard single English words spoken by both native English speakers and non-native speakers (Mandarin or Russian). Words were chosen based on the similarity between the phonetic inventories of the respective languages. Three experimental designs were used: a cross-modal matching task, a word repetition (shadowing) task, and two subjective ratings tasks that measured impressions of accentedness and effortfulness. The results replicate previous investigations that found that FAS significantly lowers word intelligibility. Furthermore, in the word repetition task, correct responses were slower to accented words than to nonaccented words, indicating increased perceptual effort. An analysis indicates that both intelligibility and reaction time are, in part, functions of the similarity between the talker's utterance and the listener's representation of the word.

  4. The Speech Intelligibility Index and the Pure-Tone Average as Predictors of Lexical Ability in Children Fit with Hearing Aids

    ERIC Educational Resources Information Center

    Stiles, Derek J.; Bentler, Ruth A.; McGregor, Karla K.

    2012-01-01

    Purpose: To determine whether a clinically obtainable measure of audibility, the aided Speech Intelligibility Index (SII; American National Standards Institute, 2007), is more sensitive than the pure-tone average (PTA) at predicting the lexical abilities of children who wear hearing aids (CHA). Method: School-age CHA and age-matched children with…

  5. Perception of intelligibility and qualities of non-native accented speakers.

    PubMed

    Fuse, Akiko; Navichkova, Yuliya; Alloggio, Krysteena

    To provide effective treatment to clients, speech-language pathologists must be understood, and be perceived to demonstrate the personal qualities necessary for therapeutic practice (e.g., resourcefulness and empathy). One factor that could interfere with the listener's perception of non-native speech is the speaker's accent. The current study explored the relationship between how accurately listeners could understand non-native speech and their perceptions of personal attributes of the speaker. Additionally, this study investigated how listeners' familiarity and experience with other languages may influence their perceptions of non-native accented speech. Through an online survey, native monolingual and bilingual English listeners rated four non-native accents (i.e., Spanish, Chinese, Russian, and Indian) on perceived intelligibility and perceived personal qualities (i.e., professionalism, intelligence, resourcefulness, empathy, and patience) necessary for speech-language pathologists. The results indicated significant relationships between the perception of intelligibility and the perception of personal qualities (i.e., professionalism, intelligence, and resourcefulness) attributed to non-native speakers. However, these findings were not supported for the Chinese accent. Bilingual listeners judged the non-native speech as more intelligible in comparison to monolingual listeners. No significant differences were found in the ratings between bilingual listeners who share the same language background as the speaker and other bilingual listeners. Based on the current findings, greater perception of intelligibility was the key to promoting a positive perception of personal qualities such as professionalism, intelligence, and resourcefulness, important for speech-language pathologists. The current study found evidence to support the claim that bilinguals have a greater ability in understanding non-native accented speech compared to monolingual listeners. The results

  6. The Effectiveness of Clear Speech as a Masker

    ERIC Educational Resources Information Center

    Calandruccio, Lauren; Van Engen, Kristin; Dhar, Sumitrajit; Bradlow, Ann R.

    2010-01-01

    Purpose: It is established that speaking clearly is an effective means of enhancing intelligibility. Because any signal-processing scheme modeled after known acoustic-phonetic features of clear speech will likely affect both target and competing speech, it is important to understand how speech recognition is affected when a competing speech signal…

  7. Joint Service Aircrew Mask (JSAM) - Tactical Aircraft (TA) A/P22P-14A Respirator Assembly (V)3: Noise Attenuation and Speech Intelligibility Performance with Double Hearing Protection, HGU-68/P Flight Helmet

    DTIC Science & Technology

    2017-03-31

    …dB Sound Pressure Level (SPL) background pink noise. The speech intelligibility tests shall result in a Modified Rhyme Test (MRT) score as listed below. Speech intelligibility testing shall be measured per ANSI S3.2 for each background pink noise level using a minimum of ten talkers and ten listeners. The test shall be conducted wearing the JSAM-TA using appropriate communication amplification. Tests must include the configurations

  8. Speech Perception in Individuals with Auditory Neuropathy

    ERIC Educational Resources Information Center

    Zeng, Fan-Gang; Liu, Sheng

    2006-01-01

    Purpose: Speech perception in participants with auditory neuropathy (AN) was systematically studied to answer the following 2 questions: Does noise present a particular problem for people with AN: Can clear speech and cochlear implants alleviate this problem? Method: The researchers evaluated the advantage in intelligibility of clear speech over…

  9. Evaluation of Speech Intelligibility and Sound Localization Abilities with Hearing Aids Using Binaural Wireless Technology

    PubMed Central

    Ibrahim, Iman; Parsa, Vijay; Macpherson, Ewan; Cheesman, Margaret

    2012-01-01

    Wireless synchronization of the digital signal processing (DSP) features between two hearing aids in a bilateral hearing aid fitting is a fairly new technology. This technology is expected to preserve the differences in time and intensity between the two ears by co-ordinating the bilateral DSP features such as multichannel compression, noise reduction, and adaptive directionality. The purpose of this study was to evaluate the benefits of wireless communication as implemented in two commercially available hearing aids. More specifically, this study measured speech intelligibility and sound localization abilities of normal hearing and hearing impaired listeners using bilateral hearing aids with wireless synchronization of multichannel Wide Dynamic Range Compression (WDRC). Twenty subjects participated; 8 had normal hearing and 12 had bilaterally symmetrical sensorineural hearing loss. Each individual completed the Hearing in Noise Test (HINT) and a sound localization test with two types of stimuli. No specific benefit from wireless WDRC synchronization was observed for the HINT; however, hearing impaired listeners had better localization with the wireless synchronization. Binaural wireless technology in hearing aids may improve localization abilities although the possible effect appears to be small at the initial fitting. With adaptation, the hearing aids with synchronized signal processing may lead to an improvement in localization and speech intelligibility. Further research is required to demonstrate the effect of adaptation to the hearing aids with synchronized signal processing on different aspects of auditory performance. PMID:26557339

  10. Deep Brain Stimulation of the Subthalamic Nucleus Parameter Optimization for Vowel Acoustics and Speech Intelligibility in Parkinson's Disease

    ERIC Educational Resources Information Center

    Knowles, Thea; Adams, Scott; Abeyesekera, Anita; Mancinelli, Cynthia; Gilmore, Greydon; Jog, Mandar

    2018-01-01

    Purpose: The settings of 3 electrical stimulation parameters were adjusted in 12 speakers with Parkinson's disease (PD) with deep brain stimulation of the subthalamic nucleus (STN-DBS) to examine their effects on vowel acoustics and speech intelligibility. Method: Participants were tested under permutations of low, mid, and high STN-DBS frequency,…

  11. Rhythm Perception and Its Role in Perception and Learning of Dysrhythmic Speech.

    PubMed

    Borrie, Stephanie A; Lansford, Kaitlin L; Barrett, Tyson S

    2017-03-01

    The perception of rhythm cues plays an important role in recognizing spoken language, especially in adverse listening conditions. Indeed, this has been shown to hold true even when the rhythm cues themselves are dysrhythmic. This study investigates whether expertise in rhythm perception provides a processing advantage for perception (initial intelligibility) and learning (intelligibility improvement) of naturally dysrhythmic speech, dysarthria. Fifty young adults with typical hearing participated in 3 key tests, including a rhythm perception test, a receptive vocabulary test, and a speech perception and learning test, with standard pretest, familiarization, and posttest phases. Initial intelligibility scores were calculated as the proportion of correct pretest words, while intelligibility improvement scores were calculated by subtracting this proportion from the proportion of correct posttest words. Rhythm perception scores predicted intelligibility improvement scores but not initial intelligibility. On the other hand, receptive vocabulary scores predicted initial intelligibility scores but not intelligibility improvement. Expertise in rhythm perception appears to provide an advantage for processing dysrhythmic speech, but a familiarization experience is required for the advantage to be realized. Findings are discussed in relation to the role of rhythm in speech processing and shed light on processing models that consider the consequence of rhythm abnormalities in dysarthria.

  12. Temporal Resolution Needed for Auditory Communication: Measurement With Mosaic Speech

    PubMed Central

    Nakajima, Yoshitaka; Matsuda, Mizuki; Ueda, Kazuo; Remijn, Gerard B.

    2018-01-01

    Temporal resolution needed for Japanese speech communication was measured. A new experimental paradigm that can reflect the spectro-temporal resolution necessary for healthy listeners to perceive speech is introduced. As a first step, we report listeners' intelligibility scores for Japanese speech with systematically degraded temporal resolution, so-called “mosaic speech”: speech mosaicized in the coordinates of time and frequency. The results of two experiments show that mosaic speech cut into short static segments was almost perfectly intelligible with a temporal resolution of 40 ms or finer. Intelligibility dropped for a temporal resolution of 80 ms, but was still around the 50%-correct level. The data are in line with previous results showing that speech signals separated into short temporal segments of <100 ms can be remarkably robust in terms of linguistic-content perception against drastic manipulations in each segment, such as partial signal omission or temporal reversal. The human perceptual system thus can extract meaning from unexpectedly rough temporal information in speech. The process resembles that of the visual system stringing together static movie frames of ~40 ms into vivid motion. PMID:29740295
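
    The mosaicizing operation can be illustrated with a short Python/SciPy sketch: spectrogram power is averaged within time-by-frequency blocks and the result is resynthesized. The actual stimuli used critical-band-based frequency divisions and more careful resynthesis, so the block sizes and random-phase reconstruction here are simplifying assumptions.

```python
import numpy as np
from scipy.signal import stft, istft

def mosaic_speech(x, fs, t_res_ms=40, n_freq_blocks=20, nperseg=512):
    """Average spectrogram power inside time-frequency blocks, resynthesize."""
    _, _, Z = stft(x, fs, nperseg=nperseg)
    P = np.abs(Z) ** 2
    hop_s = (nperseg // 2) / fs                    # default 50% frame overlap
    t_blk = max(1, int(round((t_res_ms / 1000) / hop_s)))
    f_blk = max(1, P.shape[0] // n_freq_blocks)
    for i in range(0, P.shape[0], f_blk):
        for j in range(0, P.shape[1], t_blk):
            P[i:i + f_blk, j:j + t_blk] = P[i:i + f_blk, j:j + t_blk].mean()
    phase = np.exp(1j * 2 * np.pi * np.random.rand(*P.shape))
    _, y = istft(np.sqrt(P) * phase, fs, nperseg=nperseg)
    return y
```

    Varying t_res_ms (e.g., 40 vs 80 ms) reproduces the temporal-resolution manipulation central to the experiments.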

  13. An Evaluation of Output Signal to Noise Ratio as a Predictor of Cochlear Implant Speech Intelligibility.

    PubMed

    Watkins, Greg D; Swanson, Brett A; Suaning, Gregg J

    2018-02-22

    Cochlear implant (CI) sound processing strategies are usually evaluated in clinical studies involving experienced implant recipients. Metrics which estimate the capacity to perceive speech for a given set of audio and processing conditions provide an alternative means to assess the effectiveness of processing strategies. The aim of this research was to assess the ability of the output signal to noise ratio (OSNR) to accurately predict speech perception. It was hypothesized that compared with the other metrics evaluated in this study (1) OSNR would have equivalent or better accuracy and (2) OSNR would be the most accurate in the presence of variable levels of speech presentation. For the first time, the accuracy of OSNR as a metric which predicts speech intelligibility was compared, in a retrospective study, with that of the input signal to noise ratio (ISNR) and the short-term objective intelligibility (STOI) metric. Because STOI measured audio quality at the input to a CI sound processor, a vocoder was applied to the sound processor output and STOI was also calculated for the reconstructed audio signal (vocoder short-term objective intelligibility [VSTOI] metric). The figures of merit calculated for each metric were Pearson correlation of the metric and a psychometric function fitted to sentence scores at each predictor value (Pearson sigmoidal correlation [PSIG]), epsilon insensitive root mean square error (RMSE*) of the psychometric function and the sentence scores, and the statistical deviance of the fitted curve to the sentence scores (D). Sentence scores were taken from three existing data sets of Australian Sentence Tests in Noise results. The AuSTIN tests were conducted with experienced users of the Nucleus CI system. The score for each sentence was the proportion of morphemes the participant correctly repeated. In data set 1, all sentences were presented at 65 dB sound pressure level (SPL) in the presence of four-talker Babble noise. Each block of
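
    The figures of merit above all start from a psychometric function fitted to sentence scores as a function of the metric value. The Python/SciPy fragment below is a minimal stand-in for that step, using an ordinary logistic fit, Pearson correlation, and plain RMS error; the study's PSIG and RMSE* definitions (the latter with an epsilon-insensitive band) are not reproduced exactly.

```python
import numpy as np
from scipy.optimize import curve_fit

def psychometric(m, m50, slope):
    """Logistic curve mapping a metric value to proportion correct."""
    return 1.0 / (1.0 + np.exp(-slope * (m - m50)))

def evaluate_metric(metric_values, sentence_scores):
    """Fit a sigmoid to (metric, score) pairs; report correlation and RMSE."""
    m = np.asarray(metric_values, float)
    s = np.asarray(sentence_scores, float)
    params, _ = curve_fit(psychometric, m, s,
                          p0=[np.median(m), 1.0], maxfev=10000)
    pred = psychometric(m, *params)
    r = np.corrcoef(pred, s)[0, 1]
    rmse = np.sqrt(np.mean((pred - s) ** 2))
    return params, r, rmse
```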

  14. Correlational Analysis of Speech Intelligibility Tests and Metrics for Speech Transmission

    DTIC Science & Technology

    2017-12-04

    …frequency scale (male voice; normal voice effort). Fig. 2: Diagram of a speech communication system (Letowski…). …languages. Consonants contain mostly high-frequency (above 1500 Hz) speech energy, but this energy is relatively small in comparison to that of the whole… voices (Letowski et al. 1993). Since the mid-frequency spectral region contains mostly vowel energy while consonants are high-frequency sounds, an

  15. Disentangling syntax and intelligibility in auditory language comprehension.

    PubMed

    Friederici, Angela D; Kotz, Sonja A; Scott, Sophie K; Obleser, Jonas

    2010-03-01

    Studies of the neural basis of spoken language comprehension typically focus on aspects of auditory processing by varying signal intelligibility, or on higher-level aspects of language processing such as syntax. Most studies in either of these threads of language research report brain activation including peaks in the superior temporal gyrus (STG) and/or the superior temporal sulcus (STS), but it is not clear why these areas are recruited in functionally different studies. The current fMRI study aims to disentangle the functional neuroanatomy of intelligibility and syntax in an orthogonal design. The data substantiate functional dissociations between STS and STG in the left and right hemispheres: first, manipulations of speech intelligibility yield bilateral mid-anterior STS peak activation, whereas syntactic phrase structure violations elicit strongly left-lateralized mid STG and posterior STS activation. Second, ROI analyses indicate all interactions of speech intelligibility and syntactic correctness to be located in the left frontal and temporal cortex, while the observed right-hemispheric activations reflect less specific responses to intelligibility and syntax. Our data demonstrate that the mid-to-anterior STS activation is associated with increasing speech intelligibility, while the mid-to-posterior STG/STS is more sensitive to syntactic information within the speech. 2009 Wiley-Liss, Inc.

  16. Dramatic Effects of Speech Task on Motor and Linguistic Planning in Severely Dysfluent Parkinsonian Speech

    ERIC Educational Resources Information Center

    Van Lancker Sidtis, Diana; Cameron, Krista; Sidtis, John J.

    2012-01-01

    In motor speech disorders, dysarthric features impacting intelligibility, articulation, fluency and voice emerge more saliently in conversation than in repetition, reading or singing. A role of the basal ganglia in these task discrepancies has been identified. Further, more recent studies of naturalistic speech in basal ganglia dysfunction have…

  17. Parental and Spousal Self-Efficacy of Young Adults Who Are Deaf or Hard of Hearing: Relationship to Speech Intelligibility

    ERIC Educational Resources Information Center

    Adi-Bensaid, Limor; Michael, Rinat; Most, Tova; Gali-Cinamon, Rachel

    2012-01-01

    This study examined the parental and spousal self-efficacy (SE) of adults who are deaf and who are hard of hearing (d/hh) in relation to their speech intelligibility. Forty individuals with hearing loss completed self-report measures: Spousal SE in a relationship with a spouse who was hearing/deaf, parental SE to a child who was hearing/deaf, and…

  18. Associations between speech features and phenotypic severity in Treacher Collins syndrome

    PubMed Central

    2014-01-01

    Background Treacher Collins syndrome (TCS, OMIM 154500) is a rare congenital disorder of craniofacial development. Characteristic hypoplastic malformations of the ears, zygomatic arch, mandible and pharynx have been described in detail. However, reports on the impact of these malformations on speech are few. Exploring speech features and investigating if speech function is related to phenotypic severity are essential for optimizing follow-up and treatment. Methods Articulation, nasal resonance, voice and intelligibility were examined in 19 individuals (5–74 years, median 34 years) divided into three groups comprising children 5–10 years (n = 4), adolescents 11–18 years (n = 4) and adults 29 years and older (n = 11). A speech composite score (0–6) was calculated to reflect the variability of speech deviations. TCS severity scores of phenotypic expression and total scores of Nordic Orofacial Test-Screening (NOT-S) measuring orofacial dysfunction were used in analyses of correlation with speech characteristics (speech composite scores). Results Children and adolescents presented with significantly higher speech composite scores (median 4, range 1–6) than adults (median 1, range 0–5). Nearly all children and adolescents (6/8) displayed speech deviations of articulation, nasal resonance and voice, while only three adults were identified with multiple speech aberrations. The variability of speech dysfunction in TCS was exhibited by individual combinations of speech deviations in 13/19 participants. The speech composite scores correlated with TCS severity scores and NOT-S total scores. Speech composite scores higher than 4 were associated with cleft palate. The percent of intelligible words in connected speech was significantly lower in children and adolescents (median 77%, range 31–99) than in adults (98%, range 93–100). Intelligibility of speech among the children was markedly inconsistent and clearly affecting the understandability

  19. Associations between speech features and phenotypic severity in Treacher Collins syndrome.

    PubMed

    Asten, Pamela; Akre, Harriet; Persson, Christina

    2014-04-28

    Treacher Collins syndrome (TCS, OMIM 154500) is a rare congenital disorder of craniofacial development. Characteristic hypoplastic malformations of the ears, zygomatic arch, mandible and pharynx have been described in detail. However, reports on the impact of these malformations on speech are few. Exploring speech features and investigating if speech function is related to phenotypic severity are essential for optimizing follow-up and treatment. Articulation, nasal resonance, voice and intelligibility were examined in 19 individuals (5-74 years, median 34 years) divided into three groups comprising children 5-10 years (n = 4), adolescents 11-18 years (n = 4) and adults 29 years and older (n = 11). A speech composite score (0-6) was calculated to reflect the variability of speech deviations. TCS severity scores of phenotypic expression and total scores of Nordic Orofacial Test-Screening (NOT-S) measuring orofacial dysfunction were used in analyses of correlation with speech characteristics (speech composite scores). Children and adolescents presented with significantly higher speech composite scores (median 4, range 1-6) than adults (median 1, range 0-5). Nearly all children and adolescents (6/8) displayed speech deviations of articulation, nasal resonance and voice, while only three adults were identified with multiple speech aberrations. The variability of speech dysfunction in TCS was exhibited by individual combinations of speech deviations in 13/19 participants. The speech composite scores correlated with TCS severity scores and NOT-S total scores. Speech composite scores higher than 4 were associated with cleft palate. The percent of intelligible words in connected speech was significantly lower in children and adolescents (median 77%, range 31-99) than in adults (98%, range 93-100). Intelligibility of speech among the children was markedly inconsistent and clearly affecting the understandability. Multiple speech deviations were identified in

  20. SPEECH EVALUATION WITH AND WITHOUT PALATAL OBTURATOR IN PATIENTS SUBMITTED TO MAXILLECTOMY

    PubMed Central

    de Carvalho-Teles, Viviane; Pegoraro-Krook, Maria Inês; Lauris, José Roberto Pereira

    2006-01-01

    Most patients who have undergone resection of the maxillae due to benign or malignant tumors in the palatomaxillary region present with speech and swallowing disorders. Coupling of the oral and nasal cavities increases nasal resonance, resulting in hypernasality and unintelligible speech. Prosthodontic rehabilitation of maxillary resections with effective separation of the oral and nasal cavities can improve speech and esthetics, and assist the psychosocial adjustment of the patient as well. The objective of this study was to evaluate the efficacy of the palatal obturator prosthesis on speech intelligibility and resonance in 23 patients, with ages ranging from 18 to 83 years (mean = 49.5 years), who had undergone inframedial-structural maxillectomy. The patients were asked to count from 1 to 20, to repeat 21 words, and to speak spontaneously for 15 seconds, once with and again without the prosthesis, for tape-recording purposes. Resonance and speech intelligibility were judged by 5 speech-language pathologists from the tape-recorded samples. The results showed that the majority of patients (82.6%) significantly improved their speech intelligibility, and 16 patients (69.9%) exhibited a significant hypernasality reduction with the obturator in place. The results of this study indicate that the maxillary obturator prosthesis was effective in improving speech intelligibility and resonance in patients who had undergone maxillectomy. PMID:19089242

  1. Real-Time Control of an Articulatory-Based Speech Synthesizer for Brain Computer Interfaces

    PubMed Central

    Bocquelet, Florent; Hueber, Thomas; Girin, Laurent; Savariaux, Christophe; Yvert, Blaise

    2016-01-01

    Restoring natural speech in paralyzed and aphasic people could be achieved using a Brain-Computer Interface (BCI) controlling a speech synthesizer in real-time. To reach this goal, a prerequisite is to develop a speech synthesizer producing intelligible speech in real-time with a reasonable number of control parameters. We present here an articulatory-based speech synthesizer that can be controlled in real-time for future BCI applications. This synthesizer converts movements of the main speech articulators (tongue, jaw, velum, and lips) into intelligible speech. The articulatory-to-acoustic mapping is performed using a deep neural network (DNN) trained on electromagnetic articulography (EMA) data recorded on a reference speaker synchronously with the produced speech signal. This DNN is then used in both offline and online modes to map the position of sensors glued on different speech articulators into acoustic parameters that are further converted into an audio signal using a vocoder. In offline mode, highly intelligible speech could be obtained, as assessed by perceptual evaluation performed by 12 listeners. Then, to anticipate future BCI applications, we further assessed the real-time control of the synthesizer by both the reference speaker and new speakers, in a closed-loop paradigm using EMA data recorded in real time. A short calibration period was used to compensate for differences in sensor positions and articulatory differences between new speakers and the reference speaker. We found that real-time synthesis of vowels and consonants was possible with good intelligibility. In conclusion, these results open the way to future speech BCI applications using such an articulatory-based speech synthesizer. PMID:27880768
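
    The core of such a synthesizer is a frame-by-frame articulatory-to-acoustic mapping. The toy Python forward pass below only illustrates the shape of that computation; the layer sizes, the 12-dimensional EMA input (e.g., six sensors with two midsagittal coordinates each), the 25 output vocoder parameters, and the random untrained weights are all hypothetical, whereas the real DNN is trained on parallel EMA/speech recordings.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical dimensions: 12 articulatory inputs -> 25 vocoder parameters.
W1, b1 = 0.1 * rng.normal(size=(12, 128)), np.zeros(128)
W2, b2 = 0.1 * rng.normal(size=(128, 25)), np.zeros(25)

def articulatory_to_acoustic(ema_frame):
    """One forward pass per EMA frame (placeholder weights, not trained)."""
    h = np.tanh(ema_frame @ W1 + b1)
    return h @ W2 + b2

params = articulatory_to_acoustic(np.zeros(12))   # 25 acoustic parameters
```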

  2. Surgical improvement of speech disorder caused by amyotrophic lateral sclerosis.

    PubMed

    Saigusa, Hideto; Yamaguchi, Satoshi; Nakamura, Tsuyoshi; Komachi, Taro; Kadosono, Osamu; Ito, Hiroyuki; Saigusa, Makoto; Niimi, Seiji

    2012-12-01

    Amyotrophic lateral sclerosis (ALS) is a progressive, debilitating neurological disease. ALS disturbs quality of life by affecting speech, swallowing, and free mobility of the arms, without affecting intellectual function. It is therefore important to improve the intelligibility and quality of speech, especially for ALS patients with slowly progressive courses. Currently, however, there is no effective or established approach to improve the speech disorder caused by ALS. We investigated a surgical procedure to improve the speech disorder for some patients with neuromuscular diseases with velopharyngeal closure incompetence. In this study, we performed the surgical procedure on two patients suffering from severe speech disorder caused by slowly progressing ALS. The patients suffered from speech disorder with hypernasality and imprecise and weak articulation during a 6-year course (patient 1) and a 3-year course (patient 2) of slowly progressing ALS. We narrowed the bilateral lateral palatopharyngeal walls at the velopharyngeal port, performing this surgery under general anesthesia without muscle relaxant in both patients. Postoperatively, the intelligibility and quality of their speech sounds were greatly improved within one month without any speech therapy. The patients were also able to generate longer speech phrases after the surgery. Importantly, there were no serious complications during or after the surgery. In summary, we performed bilateral narrowing of the lateral palatopharyngeal walls as speech surgery for two patients suffering from severe speech disorder associated with ALS. With this technique, improved intelligibility and quality of speech can be maintained for a longer duration in patients with slowly progressing ALS.

  3. The Intelligibility of Indian English. Monograph No. 4.

    ERIC Educational Resources Information Center

    Bansal, R. K.

    Twenty-four English speakers from various regions of India were tested for the intelligibility of their speech. Recordings of speech in a variety of contexts were evaluated by listeners from the United Kingdom, the United States, Nigeria, and Germany. On the basis of the resulting intelligibility scores, factors which tend to hinder…

  4. Acoustic properties of naturally produced clear speech at normal speaking rates

    NASA Astrophysics Data System (ADS)

    Krause, Jean C.; Braida, Louis D.

    2004-01-01

    Sentences spoken ``clearly'' are significantly more intelligible than those spoken ``conversationally'' for hearing-impaired listeners in a variety of backgrounds [Picheny et al., J. Speech Hear. Res. 28, 96-103 (1985); Uchanski et al., ibid. 39, 494-509 (1996); Payton et al., J. Acoust. Soc. Am. 95, 1581-1592 (1994)]. While producing clear speech, however, talkers often reduce their speaking rate significantly [Picheny et al., J. Speech Hear. Res. 29, 434-446 (1986); Uchanski et al., ibid. 39, 494-509 (1996)]. Yet speaking slowly is not solely responsible for the intelligibility benefit of clear speech (over conversational speech), since a recent study [Krause and Braida, J. Acoust. Soc. Am. 112, 2165-2172 (2002)] showed that talkers can produce clear speech at normal rates with training. This finding suggests that clear speech has inherent acoustic properties, independent of rate, that contribute to improved intelligibility. Identifying these acoustic properties could lead to improved signal processing schemes for hearing aids. To gain insight into these acoustical properties, conversational and clear speech produced at normal speaking rates were analyzed at three levels of detail (global, phonological, and phonetic). Although results suggest that talkers may have employed different strategies to achieve clear speech at normal rates, two global-level properties were identified that appear likely to be linked to the improvements in intelligibility provided by clear/normal speech: increased energy in the 1000-3000-Hz range of long-term spectra and increased modulation depth of low frequency modulations of the intensity envelope. Other phonological and phonetic differences associated with clear/normal speech include changes in (1) frequency of stop burst releases, (2) VOT of word-initial voiceless stop consonants, and (3) short-term vowel spectra.
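
    The two global-level properties identified here are straightforward to measure from a recording. The Python/SciPy sketch below computes the share of long-term spectral energy in the 1000-3000 Hz range and a simple index of low-frequency intensity-envelope modulation depth; the Welch segment length, envelope cutoff, and choice of modulation index are illustrative assumptions, not the authors' exact analysis.

```python
import numpy as np
from scipy.signal import welch, butter, sosfiltfilt, hilbert

def clear_speech_metrics(x, fs):
    """Two global acoustic measures associated with clear/normal speech."""
    # 1) fraction of long-term spectral power between 1000 and 3000 Hz
    f, pxx = welch(x, fs, nperseg=4096)
    band = (f >= 1000) & (f <= 3000)
    band_ratio = pxx[band].sum() / pxx.sum()

    # 2) modulation depth of the low-frequency (<~10 Hz) intensity envelope
    env = np.abs(hilbert(x))
    lp = butter(4, 10, btype="low", fs=fs, output="sos")
    slow = sosfiltfilt(lp, env)
    mod_depth = slow.std() / (slow.mean() + 1e-12)   # coefficient of variation

    return band_ratio, mod_depth
```

    Under the abstract's account, clear/normal speech should score higher than conversational speech on both measures.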

  5. Breath-Group Intelligibility in Dysarthria: Characteristics and Underlying Correlates

    ERIC Educational Resources Information Center

    Yunusova, Yana; Weismer, Gary; Kent, Ray D.; Rusche, Nicole M.

    2005-01-01

    Purpose: This study was designed to determine whether within-speaker fluctuations in speech intelligibility occurred among speakers with dysarthria who produced a reading passage, and, if they did, whether selected linguistic and acoustic variables predicted the variations in speech intelligibility. Method: Participants with dysarthria included a…

  6. Children with a cochlear implant: characteristics and determinants of speech recognition, speech-recognition growth rate, and speech production.

    PubMed

    Wie, Ona Bø; Falkenberg, Eva-Signe; Tvete, Ole; Tomblin, Bruce

    2007-05-01

    The objectives of the study were to describe the characteristics of the first 79 prelingually deaf cochlear implant users in Norway and to investigate to what degree the variation in speech recognition, speech-recognition growth rate, and speech production could be explained by the characteristics of the child, the cochlear implant, the family, and the educational setting. Data gathered longitudinally were analysed using descriptive statistics, multiple regression, and growth-curve analysis. The results show that more than 50% of the variation could be explained by these characteristics. Daily user-time, non-verbal intelligence, mode of communication, length of CI experience, and educational placement had the largest effects on the outcome. The results also indicate that children educated with a bilingual approach have better speech perception and a faster speech-perception growth rate with an increased focus on spoken language.

  7. Formant-Frequency Variation and Informational Masking of Speech by Extraneous Formants: Evidence Against Dynamic and Speech-Specific Acoustical Constraints

    PubMed Central

    2014-01-01

    How speech is separated perceptually from other speech remains poorly understood. Recent research indicates that the ability of an extraneous formant to impair intelligibility depends on the variation of its frequency contour. This study explored the effects of manipulating the depth and pattern of that variation. Three formants (F1+F2+F3) constituting synthetic analogues of natural sentences were distributed across the 2 ears, together with a competitor for F2 (F2C) that listeners must reject to optimize recognition (left = F1+F2C; right = F2+F3). The frequency contours of F1 − F3 were each scaled to 50% of their natural depth, with little effect on intelligibility. Competitors were created either by inverting the frequency contour of F2 about its geometric mean (a plausibly speech-like pattern) or using a regular and arbitrary frequency contour (triangle wave, not plausibly speech-like) matched to the average rate and depth of variation for the inverted F2C. Adding a competitor typically reduced intelligibility; this reduction depended on the depth of F2C variation, being greatest for 100%-depth, intermediate for 50%-depth, and least for 0%-depth (constant) F2Cs. This suggests that competitor impact depends on overall depth of frequency variation, not depth relative to that for the target formants. The absence of tuning (i.e., no minimum in intelligibility for the 50% case) suggests that the ability to reject an extraneous formant does not depend on similarity in the depth of formant-frequency variation. Furthermore, triangle-wave competitors were as effective as their more speech-like counterparts, suggesting that the selection of formants from the ensemble also does not depend on speech-specific constraints. PMID:24842068

  8. Talker Differences in Clear and Conversational Speech: Acoustic Characteristics of Vowels

    ERIC Educational Resources Information Center

    Ferguson, Sarah Hargus; Kewley-Port, Diane

    2007-01-01

    Purpose: To determine the specific acoustic changes that underlie improved vowel intelligibility in clear speech. Method: Seven acoustic metrics were measured for conversational and clear vowels produced by 12 talkers--6 who previously were found (S. H. Ferguson, 2004) to produce a large clear speech vowel intelligibility effect for listeners with…

  9. Auditory cortex activation to natural speech and simulated cochlear implant speech measured with functional near-infrared spectroscopy.

    PubMed

    Pollonini, Luca; Olds, Cristen; Abaya, Homer; Bortfeld, Heather; Beauchamp, Michael S; Oghalai, John S

    2014-03-01

    The primary goal of most cochlear implant procedures is to improve a patient's ability to discriminate speech. To accomplish this, cochlear implants are programmed so as to maximize speech understanding. However, programming a cochlear implant can be an iterative, labor-intensive process that takes place over months. In this study, we sought to determine whether functional near-infrared spectroscopy (fNIRS), a non-invasive neuroimaging method which is safe to use repeatedly and for extended periods of time, can provide an objective measure of whether a subject is hearing normal speech or distorted speech. We used a 140-channel fNIRS system to measure activation within the auditory cortex in 19 normal-hearing subjects while they listened to speech with different levels of intelligibility. Custom software was developed to analyze the data and compute topographic maps from the measured changes in oxyhemoglobin and deoxyhemoglobin concentration. Normal speech reliably evoked the strongest responses within the auditory cortex. Distorted speech produced less region-specific cortical activation. Environmental sounds were used as a control, and they produced the least cortical activation. These data collected using fNIRS are consistent with the fMRI literature and thus demonstrate the feasibility of using this technique to objectively detect differences in cortical responses to speech of different intelligibility. Copyright © 2013 Elsevier B.V. All rights reserved.

  10. Speech outcome in unilateral complete cleft lip and palate patients: a descriptive study.

    PubMed

    Rullo, R; Di Maggio, D; Addabbo, F; Rullo, F; Festa, V M; Perillo, L

    2014-09-01

    In this study, resonance and articulation disorders were examined in a group of patients surgically treated for cleft lip and palate, considering family social background and the children's ability to self-monitor their speech output while speaking. Fifty children (32 males and 18 females; mean age 6.5 ± 1.6 years) affected by non-syndromic complete unilateral cleft of the lip and palate underwent the same surgical protocol. The speech level was evaluated using Accordi's speech assessment protocol, which focuses on intelligibility, nasality, nasal air escape, pharyngeal friction, and glottal stop. Pearson product-moment correlation analysis was used to detect significant associations between the analysed parameters. A total of 16% (8 children) of the sample had a severe to moderate degree of nasality and nasal air escape, with presence of pharyngeal friction and glottal stop, which obviously compromise speech intelligibility. Ten children (20%) showed a barely acceptable phonological outcome: nasality and nasal air escape were mild to moderate, but intelligibility remained poor. Thirty-two children (64%) had normal speech. Statistical analysis revealed a significant correlation between the severity of nasal resonance and nasal air escape (p ≤ 0.05). No statistically significant correlation was found between final intelligibility and patient social background, nor between final intelligibility and patient age. The differences in speech outcome could be explained by a specific, subjective, inborn ability, different for each child, to self-monitor speech output.

  11. Using on-line altered auditory feedback treating Parkinsonian speech

    NASA Astrophysics Data System (ADS)

    Wang, Emily; Verhagen, Leo; de Vries, Meinou H.

    2005-09-01

    Patients with advanced Parkinson's disease tend to have dysarthric speech that is hesitant, accelerated, and repetitive, and that is often resistant to behavioral speech therapy. In this pilot study, the speech disturbances were treated using on-line altered auditory feedback (AF) provided by SpeechEasy (SE), an in-the-ear device registered with the FDA for use in humans to treat chronic stuttering. Eight PD patients participated in the study. All had moderate to severe speech disturbances. In addition, two patients had moderate recurring stuttering at the onset of PD after long remission since adolescence, two had bilateral STN DBS, and two had bilateral pallidal DBS. An effective combination of delayed auditory feedback and frequency-altered feedback was selected for each subject and provided via an SE worn in one ear. All subjects produced speech samples (structured monologue and reading) under three conditions: baseline, with the SE in place but without feedback, and with feedback. The speech samples were presented in random order and rated for speech intelligibility using UPDRS-III item 18 and for speaking rate. The results indicated that SpeechEasy is well tolerated and that AF can improve speech intelligibility in spontaneous speech. Further investigational use of this device for treating speech disorders in PD is warranted. [Work partially supported by Janus Dev. Group, Inc.]
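
    The delay component of such altered auditory feedback is simple to emulate offline, as in the Python sketch below. Only the delay is shown (the device also applies frequency-altered feedback and operates in real time rather than on a recorded signal), and the 60-ms default is an arbitrary illustrative value, not the device's setting.

```python
import numpy as np

def delayed_feedback(x, fs, delay_ms=60.0):
    """Return the delayed copy of the signal that the talker would hear."""
    d = int(round(fs * delay_ms / 1000.0))
    return np.concatenate([np.zeros(d), x])[: len(x)]
```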

  12. Cognitive Processing Speed, Working Memory, and the Intelligibility of Hearing Aid-Processed Speech in Persons with Hearing Impairment

    PubMed Central

    Yumba, Wycliffe Kabaywe

    2017-01-01

    Previous studies have demonstrated that successful listening with advanced signal processing in digital hearing aids is associated with individual cognitive capacity, particularly working memory capacity (WMC). This study aimed to examine the relationship between cognitive abilities (cognitive processing speed and WMC) and individual listeners’ responses to digital signal processing settings in adverse listening conditions. A total of 194 native Swedish speakers (83 women and 111 men), aged 33–80 years (mean = 60.75 years, SD = 8.89), with bilateral, symmetrical mild to moderate sensorineural hearing loss who had completed a lexical decision speed test (measuring cognitive processing speed) and semantic word-pair span test (SWPST, capturing WMC) participated in this study. The Hagerman test (capturing speech recognition in noise) was conducted using an experimental hearing aid with three digital signal processing settings: (1) linear amplification without noise reduction (NoP), (2) linear amplification with noise reduction (NR), and (3) non-linear amplification without NR (“fast-acting compression”). The results showed that cognitive processing speed was a better predictor of speech intelligibility in noise, regardless of the types of signal processing algorithms used. That is, there was a stronger association between cognitive processing speed and NR outcomes and fast-acting compression outcomes (in steady state noise). We observed a weaker relationship between working memory and NR, but WMC did not relate to fast-acting compression. WMC was a relatively weaker predictor of speech intelligibility in noise. These findings might have been different if the participants had been provided with training and or allowed to acclimatize to binary masking noise reduction or fast-acting compression. PMID:28861009

  13. Joint Service Aircrew Mask (JSAM) - Tactical Aircraft (TA) A/P22P-14A Respirator Assembly (V)3: Noise Attenuation and Speech Intelligibility Performance with Double Hearing Protection, HGU-55A/P JHMCS Flight Helmet

    DTIC Science & Technology

    2017-03-01

    …in an environment with 71-115 dB Sound Pressure Level (SPL) background pink noise. The speech intelligibility tests shall result in a Modified Rhyme Test (MRT) score as listed below. Speech intelligibility testing shall be measured per ANSI S3.2 for each background pink noise level using a minimum of ten talkers and ten listeners. The test shall be conducted wearing the JSAM-TA using appropriate communication amplification. Tests must

  14. Effects of Speaking Task on Intelligibility in Parkinson's Disease

    ERIC Educational Resources Information Center

    Tjaden, Kris; Wilding, Greg

    2011-01-01

    Intelligibility tests for dysarthria typically provide an estimate of overall severity for speech materials elicited through imitation or read from a printed script. The extent to which these types of tasks and procedures reflect intelligibility for extemporaneous speech is not well understood. The purpose of this study was to compare…

  15. Speech intelligibility index predictions for young and old listeners in automobile noise: Can the index be improved by incorporating factors other than absolute threshold?

    NASA Astrophysics Data System (ADS)

    Saweikis, Meghan; Surprenant, Aimée M.; Davies, Patricia; Gallant, Don

    2003-10-01

    While young and old subjects with comparable audiograms tend to perform comparably on speech recognition tasks in quiet environments, older subjects have more difficulty than younger subjects with recognition tasks in degraded listening conditions. This suggests that factors other than absolute threshold may account for some of the difficulty older listeners have with recognition tasks in noisy environments. Many metrics used to measure speech intelligibility, including the Speech Intelligibility Index (SII), consider only an absolute threshold when accounting for age-related hearing loss. These metrics therefore tend to overestimate performance for elderly listeners in noisy environments [Tobias et al., J. Acoust. Soc. Am. 83, 859-895 (1988)]. The present studies examine the predictive capabilities of the SII in an environment with automobile noise present. This is of interest because people's evaluation of automobile interior sound is closely linked to their ability to carry on conversations with their fellow passengers. The four studies examine whether, for subjects with age-related hearing loss, the accuracy of the SII can be improved by incorporating factors other than an absolute threshold into the model. [Work supported by Ford Motor Company.]
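
    The criticism is easiest to see against the structure of the index itself: the SII is an importance-weighted sum of per-band audibility. The toy Python version below uses hypothetical octave-band importance weights (ANSI S3.5 tabulates the real, finer-grained band-importance functions); note that the listener's age enters only through the absolute threshold, which is exactly the limitation the abstract targets, so any suprathreshold deficit would have to be added as an extra per-band factor.

```python
import numpy as np

# Hypothetical octave-band importance weights (sum to 1; illustrative only).
BAND_IMPORTANCE = np.array([0.10, 0.15, 0.25, 0.30, 0.20])

def toy_sii(speech_db, noise_db, threshold_db):
    """Importance-weighted audibility. Per band, audibility grows linearly
    from 0 to 1 over a 30-dB effective-SNR range, where the effective masker
    is the larger of the external noise and the absolute threshold."""
    speech_db = np.asarray(speech_db, float)
    effective_masker = np.maximum(noise_db, threshold_db)
    audibility = np.clip((speech_db - effective_masker + 15) / 30, 0, 1)
    return float(np.dot(BAND_IMPORTANCE, audibility))

# Example: automobile noise dominating the low bands, presbycusis at the top
print(toy_sii(speech_db=[55, 60, 62, 60, 55],
              noise_db=[63, 58, 48, 40, 35],
              threshold_db=[20, 20, 25, 35, 45]))
```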

  16. Speech intelligibility with helicopter noise: tests of three helmet-mounted communication systems.

    PubMed

    Ribera, John E; Mozo, Ben T; Murphy, Barbara A

    2004-02-01

    Military aviator helmet communication systems are designed to enhance speech intelligibility (SI) in background noise and reduce exposure to harmful levels of noise. Some aviators, over the course of their aviation careers, develop noise-induced hearing loss that may affect their ability to perform required tasks. New technology can improve SI in noise for aviators with normal hearing as well as those with hearing loss. SI-in-noise scores were obtained from 40 rotary-wing aviators (20 with normal hearing and 20 with hearing-loss waivers). Three communication systems were evaluated: a standard SPH-4B, an SPH-4B aviator helmet modified with a communications earplug (CEP), and an SPH-4B modified with active noise reduction (ANR). Subjects' SI in noise was better with the newer technologies than with the standard-issue aviator helmet. A significant number of aviators on waivers for hearing loss performed within the range of their normal-hearing counterparts when wearing the newer technology. The rank order of perceived speech clarity was (1) CEP, (2) ANR, and (3) unmodified SPH-4B. To ensure optimum SI in noise for rotary-wing aviators, consideration should be given to retrofitting existing aviator helmets with new technology and incorporating such advances in future communication systems. A review of standards for determining fitness to fly is needed.

  17. Spectrotemporal modulation sensitivity for hearing-impaired listeners: dependence on carrier center frequency and the relationship to speech intelligibility.

    PubMed

    Mehraei, Golbarg; Gallun, Frederick J; Leek, Marjorie R; Bernstein, Joshua G W

    2014-07-01

    Poor speech understanding in noise by hearing-impaired (HI) listeners is only partly explained by elevated audiometric thresholds. Suprathreshold-processing impairments such as reduced temporal or spectral resolution or temporal fine-structure (TFS) processing ability might also contribute. Although speech contains dynamic combinations of temporal and spectral modulation and TFS content, these capabilities are often treated separately. Modulation-depth detection thresholds for spectrotemporal modulation (STM) applied to octave-band noise were measured for normal-hearing and HI listeners as a function of temporal modulation rate (4-32 Hz), spectral ripple density [0.5-4 cycles/octave (c/o)] and carrier center frequency (500-4000 Hz). STM sensitivity was worse than normal for HI listeners only for a low-frequency carrier (1000 Hz) at low temporal modulation rates (4-12 Hz) and a spectral ripple density of 2 c/o, and for a high-frequency carrier (4000 Hz) at a high spectral ripple density (4 c/o). STM sensitivity for the 4-Hz, 4-c/o condition for a 4000-Hz carrier and for the 4-Hz, 2-c/o condition for a 1000-Hz carrier were correlated with speech-recognition performance in noise after partialling out the audiogram-based speech-intelligibility index. Poor speech-reception and STM-detection performance for HI listeners may be related to a combination of reduced frequency selectivity and a TFS-processing deficit limiting the ability to track spectral-peak movements.
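
    STM stimuli of the kind used here can be sketched as an octave band of random-phase tones whose level ripples sinusoidally across log-frequency (cycles/octave) and time (Hz). In the Python sketch below the tone count, duration, and default modulation depth are illustrative placeholders; the study's exact stimulus construction and presentation levels are not reproduced.

```python
import numpy as np

def stm_noise(fs=44100, dur=1.0, fc=2000.0, rate_hz=4.0, density_co=2.0,
              depth_db=10.0, n_tones=200):
    """Spectrotemporal ripple applied to an octave band of tone-carrier noise."""
    t = np.arange(int(dur * fs)) / fs
    freqs = fc * 2.0 ** np.random.uniform(-0.5, 0.5, n_tones)  # one octave
    ripple_phase = 2 * np.pi * np.random.rand()                # common phase
    x = np.zeros_like(t)
    for f in freqs:
        oct_pos = np.log2(f / fc)                # position in octaves re fc
        gain_db = 0.5 * depth_db * np.sin(
            2 * np.pi * (density_co * oct_pos + rate_hz * t) + ripple_phase)
        x += 10 ** (gain_db / 20) * np.sin(
            2 * np.pi * f * t + 2 * np.pi * np.random.rand())
    return x / np.max(np.abs(x))
```

    Sweeping rate_hz over 4-32 Hz and density_co over 0.5-4 c/o, with fc from 500 to 4000 Hz, spans the condition grid described in the abstract.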

  18. Spectrotemporal modulation sensitivity for hearing-impaired listeners: Dependence on carrier center frequency and the relationship to speech intelligibility

    PubMed Central

    Mehraei, Golbarg; Gallun, Frederick J.; Leek, Marjorie R.; Bernstein, Joshua G. W.

    2014-01-01

    Poor speech understanding in noise by hearing-impaired (HI) listeners is only partly explained by elevated audiometric thresholds. Suprathreshold-processing impairments such as reduced temporal or spectral resolution or temporal fine-structure (TFS) processing ability might also contribute. Although speech contains dynamic combinations of temporal and spectral modulation and TFS content, these capabilities are often treated separately. Modulation-depth detection thresholds for spectrotemporal modulation (STM) applied to octave-band noise were measured for normal-hearing and HI listeners as a function of temporal modulation rate (4–32 Hz), spectral ripple density [0.5–4 cycles/octave (c/o)] and carrier center frequency (500–4000 Hz). STM sensitivity was worse than normal for HI listeners only for a low-frequency carrier (1000 Hz) at low temporal modulation rates (4–12 Hz) and a spectral ripple density of 2 c/o, and for a high-frequency carrier (4000 Hz) at a high spectral ripple density (4 c/o). STM sensitivity for the 4-Hz, 4-c/o condition for a 4000-Hz carrier and for the 4-Hz, 2-c/o condition for a 1000-Hz carrier were correlated with speech-recognition performance in noise after partialling out the audiogram-based speech-intelligibility index. Poor speech-reception and STM-detection performance for HI listeners may be related to a combination of reduced frequency selectivity and a TFS-processing deficit limiting the ability to track spectral-peak movements. PMID:24993215

  19. Multi-time resolution analysis of speech: evidence from psychophysics

    PubMed Central

    Chait, Maria; Greenberg, Steven; Arai, Takayuki; Simon, Jonathan Z.; Poeppel, David

    2015-01-01

    How speech signals are analyzed and represented remains a foundational challenge both for cognitive science and neuroscience. A growing body of research, employing various behavioral and neurobiological experimental techniques, now points to the perceptual relevance of both phoneme-sized (10–40 Hz modulation frequency) and syllable-sized (2–10 Hz modulation frequency) units in speech processing. However, it is not clear how information associated with such different time scales interacts in a manner relevant for speech perception. We report behavioral experiments on speech intelligibility employing a stimulus that allows us to investigate how distinct temporal modulations in speech are treated separately and whether they are combined. We created sentences in which the slow (~4 Hz; S_low) and rapid (~33 Hz; S_high) modulations—corresponding to ~250 and ~30 ms, the average duration of syllables and certain phonetic properties, respectively—were selectively extracted. Although S_low and S_high have low intelligibility when presented separately, dichotic presentation of S_high with S_low results in supra-additive performance, suggesting a synergistic relationship between low- and high-modulation frequencies. A second experiment desynchronized presentation of the S_low and S_high signals. Desynchronizing signals relative to one another had no impact on intelligibility when delays were less than ~45 ms. Longer delays resulted in a steep intelligibility decline, providing further evidence of integration or binding of information within restricted temporal windows. Our data suggest that human speech perception uses multi-time resolution processing. Signals are concurrently analyzed on at least two separate time scales, the intermediate representations of these analyses are integrated, and the resulting bound percept has significant consequences for speech intelligibility—a view compatible with recent insights from neuroscience implicating multi-timescale auditory processing.
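
    A rough sketch of how modulation bands like S_low and S_high might be isolated, assuming a simple Hilbert decomposition into envelope and fine structure with a Butterworth band-pass applied to the envelope; the original study used its own filtering procedure, and the band edges shown are illustrative.

    ```python
    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def modulation_band(speech, fs, lo_hz, hi_hz):
        """Keep only envelope modulations in [lo_hz, hi_hz]: band-pass the
        Hilbert envelope, then re-impose it on the temporal fine structure."""
        analytic = hilbert(speech)
        env = np.abs(analytic)                     # temporal envelope
        tfs = np.cos(np.angle(analytic))           # temporal fine structure
        b, a = butter(2, [lo_hz / (fs / 2), hi_hz / (fs / 2)], btype="band")
        env_f = np.maximum(filtfilt(b, a, env), 0.0)  # rectify filtered envelope
        return env_f * tfs

    # Illustrative split roughly matching the ~4 Hz and ~33 Hz bands:
    # s_low  = modulation_band(speech, fs, 1.0, 8.0)
    # s_high = modulation_band(speech, fs, 20.0, 45.0)
    ```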

  20. Talker- and language-specific effects on speech intelligibility in noise assessed with bilingual talkers: Which language is more robust against noise and reverberation?

    PubMed

    Hochmuth, Sabine; Jürgens, Tim; Brand, Thomas; Kollmeier, Birger

    2015-01-01

    Investigate talker- and language-specific aspects of speech intelligibility in noise and reverberation using highly comparable matrix sentence tests across languages. Matrix sentences spoken by German/Russian and German/Spanish bilingual talkers were recorded. These sentences were used to measure speech reception thresholds (SRTs) with native listeners in the respective languages in different listening conditions (stationary and fluctuating noise, multi-talker babble, reverberated speech-in-noise condition). Four German/Russian and four German/Spanish bilingual talkers; 20 native German-speaking, 10 native Russian-speaking, and 10 native Spanish-speaking listeners. Across-talker SRT differences of up to 6 dB were found for both groups of bilinguals. SRTs of German/Russian bilingual talkers were the same in both languages. SRTs of German/Spanish bilingual talkers were higher when they talked in Spanish than when they talked in German. The benefit from listening in the gaps was similar across all languages. The detrimental effect of reverberation was larger for Spanish than for German and Russian. Within the limitations set by the number and slight accentedness of talkers and other possible confounding factors, talker- and test-condition-dependent differences were isolated from the language effect: Russian and German exhibited similar intelligibility in noise and reverberation, whereas Spanish was more impaired in these situations.

  1. Perceptual analysis of speech following traumatic brain injury in childhood.

    PubMed

    Cahill, Louise M; Murdoch, Bruce E; Theodoros, Deborah G

    2002-05-01

    To investigate perceptually the speech dimensions, oromotor function, and speech intelligibility of a group of individuals with traumatic brain injury (TBI) acquired in childhood. The speech of 24 children with TBI was analysed perceptually and compared with that of a group of non-neurologically impaired children matched for age and sex. The 16 dysarthric TBI subjects were significantly less intelligible than the control subjects, and demonstrated significant impairment in 12 of the 33 speech dimensions rated. In addition, the eight non-dysarthric TBI subjects were significantly impaired in many areas of oromotor function on the Frenchay Dysarthria Assessment, indicating some degree of pre-clinical speech impairment. The results of the perceptual analysis are discussed in terms of the possible underlying pathophysiological bases of the deviant speech features identified, and the need for a comprehensive instrumental assessment, to more accurately determine the level of breakdown in the speech production mechanism in children following TBI.

  2. An algorithm to improve speech recognition in noise for hearing-impaired listeners

    PubMed Central

    Healy, Eric W.; Yoho, Sarah E.; Wang, Yuxuan; Wang, DeLiang

    2013-01-01

    Despite considerable effort, monaural (single-microphone) algorithms capable of increasing the intelligibility of speech in noise have remained elusive. Successful development of such an algorithm is especially important for hearing-impaired (HI) listeners, given their particular difficulty in noisy backgrounds. In the current study, an algorithm based on binary masking was developed to separate speech from noise. Unlike the ideal binary mask, which requires prior knowledge of the premixed signals, the masks used to segregate speech from noise in the current study were estimated by training the algorithm on speech not used during testing. Sentences were mixed with speech-shaped noise and with babble at various signal-to-noise ratios (SNRs). Testing using normal-hearing and HI listeners indicated that intelligibility increased following processing in all conditions. These increases were larger for HI listeners, for the modulated background, and for the least-favorable SNRs. They were also often substantial, allowing several HI listeners to improve intelligibility from scores near zero to values above 70%. PMID:24116438
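
    The masks in the study were estimated by a trained classifier, but the reference concept is the ideal binary mask, which can be computed directly when the premixed speech and noise are available. A minimal sketch, assuming equal-length signals, STFT framing, and a local criterion (LC) in dB; names are illustrative.

    ```python
    import numpy as np
    from scipy.signal import stft

    def ideal_binary_mask(speech, noise, fs, lc_db=0.0):
        """Ideal binary mask: keep a time-frequency unit when its local
        speech-to-noise ratio exceeds the local criterion (LC)."""
        _, _, S = stft(speech, fs, nperseg=512)    # speech spectrogram
        _, _, N = stft(noise, fs, nperseg=512)     # noise spectrogram
        snr_db = 10 * np.log10((np.abs(S) ** 2 + 1e-12) /
                               (np.abs(N) ** 2 + 1e-12))
        return (snr_db > lc_db).astype(float)      # 1 = keep, 0 = discard
    ```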

  3. Prosodic Features and Speech Naturalness in Individuals with Dysarthria

    ERIC Educational Resources Information Center

    Klopfenstein, Marie I.

    2012-01-01

    Despite the importance of speech naturalness to treatment outcomes, little research has been done on what constitutes speech naturalness and how to best maximize naturalness in relationship to other treatment goals like intelligibility. In addition, previous literature alludes to the relationship between prosodic aspects of speech and speech…

  4. Criterion-related validity of the Test of Children's Speech sentence intelligibility measure for children with cerebral palsy and dysarthria.

    PubMed

    Hodge, Megan; Gotzke, Carrie Lynne

    2014-08-01

    To evaluate the criterion-related validity of the TOCS+ sentence measure (TOCS+, Hodge, Daniels & Gotzke, 2009) for children with dysarthria and CP by comparing intelligibility and rate scores obtained concurrently from the TOCS+ and from a conversational sample. Twenty children (3 to 10 years old) diagnosed with spastic cerebral palsy (CP) participated. Nineteen children also had a confirmed diagnosis of dysarthria. Children's intelligibility and speaking rate scores obtained from the TOCS+, which uses imitation of sets of randomly selected items ranging from 2-7 words (80 words in total), and from a contiguous 100-word conversational speech sample were compared. Mean intelligibility scores were 46.5% (SD = 26.4%) and 50.9% (SD = 19.1%) and mean rates in words per minute (WPM) were 90.2 (SD = 22.3) and 94.1 (SD = 25.6), respectively, for the TOCS+ and conversational samples. No significant differences were found between the two conditions for intelligibility or rate scores. Strong correlations were found between the TOCS+ and conversational samples for intelligibility (r = 0.86; p < 0.001) and WPM (r = 0.77; p < 0.001). The results support the criterion validity of the TOCS+ sentence task as a time-efficient procedure for measuring intelligibility and rate in children with CP, with and without confirmed dysarthria. Children varied in their relative performance on the two speaking tasks, reflecting the complexity of factors that influence intelligibility and rate scores.

  5. Intelligibility and Acceptability Testing for Speech Technology

    DTIC Science & Technology

    1992-05-22

    information in memory (Luce, Feustel, and Pisoni, 1983). In high workload or multiple task situations, the added effort of listening to degraded speech can lead...the DRT provides diagnostic feature scores on six phonemic features: voicing, nasality, sustention, sibilation, graveness, and compactness, and on a...of other speech materials (e.g., polysyllabic words, paragraphs) and methods (memory, comprehension, reaction time) have been used to evaluate the

  6. Effects of subthalamic stimulation on speech of consecutive patients with Parkinson disease

    PubMed Central

    Zrinzo, L.; Martinez-Torres, I.; Frost, E.; Pinto, S.; Foltynie, T.; Holl, E.; Petersen, E.; Roughton, M.; Hariz, M.I.; Limousin, P.

    2011-01-01

    Objective: Subthalamic nucleus deep brain stimulation (STN-DBS) is an effective treatment for advanced Parkinson disease (PD). Following STN-DBS, speech intelligibility can deteriorate, limiting its beneficial effect. Here we prospectively examined the short- and long-term speech response to STN-DBS in a consecutive series of patients to identify clinical and surgical factors associated with speech change. Methods: Thirty-two consecutive patients were assessed before surgery, then 1 month, 6 months, and 1 year after STN-DBS in 4 conditions, on- and off-medication with on- and off-stimulation, using established and validated speech and movement scales. Fifteen of these patients were followed up for 3 years. A control group of 12 patients with PD were followed up for 1 year. Results: Within the surgical group, speech intelligibility significantly deteriorated by an average of 14.2% ± 20.15% off-medication and 16.9% ± 21.8% on-medication 1 year after STN-DBS. The medical group deteriorated by 3.6% ± 5.5% and 4.5% ± 8.8%, respectively. Seven patients showed speech amelioration after surgery. Loudness increased significantly in all tasks with stimulation. A less severe preoperative on-medication motor score was associated with a more favorable speech response to STN-DBS after 1 year. Medially located electrodes on the left STN were associated with a significantly higher risk of speech deterioration than electrodes within the nucleus. There was a strong relationship between high voltage in the left electrode and poor speech outcome at 1 year. Conclusion: The effect of STN-DBS on speech is variable and multifactorial, with most patients exhibiting decline of speech intelligibility. Both medical and surgical issues contribute to deterioration of speech in STN-DBS patients. Classification of evidence: This study provides Class III evidence that STN-DBS for PD results in deterioration in speech intelligibility in all combinations of medication and stimulation states at 1 year.

  7. Neuronal populations in the occipital cortex of the blind synchronize to the temporal dynamics of speech

    PubMed Central

    Van Ackeren, Markus Johannes; Barbero, Francesca M; Mattioni, Stefania; Bottini, Roberto

    2018-01-01

    The occipital cortex of early blind individuals (EB) activates during speech processing, challenging the notion of a hard-wired neurobiology of language. But, at what stage of speech processing do occipital regions participate in EB? Here we demonstrate that parieto-occipital regions in EB enhance their synchronization to acoustic fluctuations in human speech in the theta-range (corresponding to syllabic rate), irrespective of speech intelligibility. Crucially, enhanced synchronization to the intelligibility of speech was selectively observed in primary visual cortex in EB, suggesting that this region is at the interface between speech perception and comprehension. Moreover, EB showed overall enhanced functional connectivity between temporal and occipital cortices that are sensitive to speech intelligibility and altered directionality when compared to the sighted group. These findings suggest that the occipital cortex of the blind adopts an architecture that allows the tracking of speech material, and therefore does not fully abstract from the reorganized sensory inputs it receives. PMID:29338838

  8. STANFORD ARTIFICIAL INTELLIGENCE PROJECT.

    DTIC Science & Technology

    ARTIFICIAL INTELLIGENCE, GAME THEORY, DECISION MAKING, BIONICS, AUTOMATA, SPEECH RECOGNITION, GEOMETRIC FORMS, LEARNING MACHINES, MATHEMATICAL MODELS, PATTERN RECOGNITION, SERVOMECHANISMS, SIMULATION, BIBLIOGRAPHIES.

  9. Measures to Evaluate the Effects of DBS on Speech Production

    PubMed Central

    Weismer, Gary; Yunusova, Yana; Bunton, Kate

    2011-01-01

    The purpose of this paper is to review and evaluate measures of speech production that could be used to document effects of Deep Brain Stimulation (DBS) on speech performance, especially in persons with Parkinson disease (PD). A small set of evaluative criteria for these measures is presented first, followed by consideration of several speech physiology and speech acoustic measures that have been studied frequently and reported on in the literature on normal speech production, and speech production affected by neuromotor disorders (dysarthria). Each measure is reviewed and evaluated against the evaluative criteria. Embedded within this review and evaluation is a presentation of new data relating speech motions to speech intelligibility measures in speakers with PD, amyotrophic lateral sclerosis (ALS), and control speakers (CS). These data are used to support the conclusion that at the present time the slope of second formant transitions (F2 slope), an acoustic measure, is well suited to make inferences to speech motion and to predict speech intelligibility. The use of other measures should not be ruled out, however, and we encourage further development of evaluative criteria for speech measures designed to probe the effects of DBS or any treatment with potential effects on speech production and communication skills. PMID:24932066

  10. A laboratory study for assessing speech privacy in a simulated open-plan office.

    PubMed

    Lee, P J; Jeon, J Y

    2014-06-01

    The aim of this study is to assess speech privacy in open-plan offices using two recently introduced single-number quantities: the spatial decay rate of speech, DL2,S [dB], and the A-weighted sound pressure level of speech at a distance of 4 m, Lp,A,S,4m [dB]. Open-plan offices were modeled using a DL2,S of 4, 8, and 12 dB, and Lp,A,S,4m was changed in three steps, from 43 to 57 dB. Auditory experiments were conducted at three locations with source–receiver distances of 8, 16, and 24 m, while the background noise level was fixed at 30 dBA. A total of 20 subjects were asked to rate the speech intelligibility and listening difficulty of 240 Korean sentences in such surroundings. The speech intelligibility scores were not affected by DL2,S or Lp,A,S,4m at a source–receiver distance of 8 m; however, listening difficulty ratings changed significantly with increasing DL2,S and Lp,A,S,4m values. At the other locations, the influences of DL2,S and Lp,A,S,4m on speech intelligibility and listening difficulty ratings were significant. It was also found that the speech intelligibility scores and listening difficulty ratings changed considerably with increasing distraction distance (rD). Furthermore, listening difficulty is more sensitive than intelligibility scores to variations in DL2,S and Lp,A,S,4m for sound fields with high speech transmission performance. The recently introduced single-number quantities in the ISO standard, based on the spatial distribution of sound pressure level, were associated with speech privacy in an open-plan office. The results support single-number quantities being suitable to assess speech privacy, mainly at large distances. This new information can be considered when designing open-plan offices and drafting acoustic guidelines for them.
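
    Both single-number quantities can be approximated from speech levels measured at several distances from the source. The sketch below is a simplified least-squares version, not the ISO 3382-3 measurement procedure; the function name and arguments are assumptions.

    ```python
    import numpy as np

    def spatial_decay(distances_m, levels_dba):
        """Fit L(r) = L0 - D * log2(r). D approximates the spatial decay rate
        of speech per distance doubling (DL2,S); evaluating the fit at r = 4 m
        approximates Lp,A,S,4m."""
        x = np.log2(np.asarray(distances_m, dtype=float))
        y = np.asarray(levels_dba, dtype=float)
        slope, intercept = np.polyfit(x, y, 1)     # least-squares line
        return -slope, intercept + slope * np.log2(4.0)

    # spatial_decay([2, 4, 8, 16], [54, 50, 46, 42]) -> (4.0, 50.0)
    ```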

  11. Effect of Fundamental Frequency on Judgments of Electrolaryngeal Speech

    ERIC Educational Resources Information Center

    Nagle, Kathy F.; Eadie, Tanya L.; Wright, Derek R.; Sumida, Yumi A.

    2012-01-01

    Purpose: To determine (a) the effect of fundamental frequency (f0) on speech intelligibility, acceptability, and perceived gender in electrolaryngeal (EL) speakers, and (b) the effect of known gender on speech acceptability in EL speakers. Method: A 2-part study was conducted. In Part 1, 34 healthy adults provided speech recordings using…

  12. Speech Intelligibility in Various Noise Conditions with the Nucleus® 5 CP810 Sound Processor.

    PubMed

    Dillier, Norbert; Lai, Wai Kong

    2015-06-11

    The Nucleus® 5 System Sound Processor (CP810, Cochlear™, Macquarie University, NSW, Australia) contains two omnidirectional microphones. They can be configured as a fixed directional microphone combination (called Zoom) or as an adaptive beamformer (called Beam), which adjusts the directivity continuously to maximally reduce the interfering noise. Initial evaluation studies with the CP810 had compared performance and usability of the new processor in comparison with the Freedom™ Sound Processor (Cochlear™) for speech in quiet and noise for a subset of the processing options. This study compares the two processing options suggested to be used in noisy environments, Zoom and Beam, for various sound field conditions using a standardized speech-in-noise matrix test (Oldenburg sentences test). Nine German-speaking subjects who previously had been using the Freedom speech processor and subsequently were upgraded to the CP810 device participated in this series of additional evaluation tests. The speech reception threshold (SRT for 50% speech intelligibility in noise) was determined using sentences presented via loudspeaker at 65 dB SPL in front of the listener and noise presented either via the same loudspeaker (S0N0) or at 90 degrees at either the ear with the sound processor (S0NCI+) or the opposite unaided ear (S0NCI-). The fourth noise condition consisted of three uncorrelated noise sources placed at 90, 180 and 270 degrees. The noise level was adjusted through an adaptive procedure to yield a signal to noise ratio where 50% of the words in the sentences were correctly understood. In spatially separated speech and noise conditions both Zoom and Beam could improve the SRT significantly. For single noise sources, either ipsilateral or contralateral to the cochlear implant sound processor, average improvements with Beam of 12.9 and 7.9 dB in SRT were found. The average SRT of -8 dB for Beam in the diffuse noise condition (uncorrelated noise from both sides and
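
    The adaptive procedure for finding the 50% SRT can be sketched as a simple up-down track on the signal-to-noise ratio. This is a schematic one-up/one-down version, not the word-score-dependent stepping rule of the Oldenburg matrix test; present_trial is a hypothetical callback that returns the proportion of words repeated correctly.

    ```python
    def adaptive_srt(present_trial, n_trials=20, start_snr=0.0, step_db=2.0):
        """Schematic adaptive track converging near 50% intelligibility:
        lower the SNR after a mostly-correct trial, raise it otherwise.
        `present_trial(snr)` returns the proportion of words repeated
        correctly at that SNR; the mean of the later visited SNRs
        estimates the speech reception threshold (SRT)."""
        snr, track = start_snr, []
        for _ in range(n_trials):
            correct = present_trial(snr) >= 0.5
            track.append(snr)
            snr += -step_db if correct else step_db
        tail = track[len(track) // 2:]             # discard the approach phase
        return sum(tail) / len(tail)
    ```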

  13. Intelligibility of 4-Year-Old Children with and without Cerebral Palsy

    ERIC Educational Resources Information Center

    Hustad, Katherine C.; Schueler, Brynn; Schultz, Laurel; DuHadway, Caitlin

    2012-01-01

    Purpose: The authors examined speech intelligibility in typically developing (TD) children and 3 groups of children with cerebral palsy (CP) who were classified into speech/language profile groups following Hustad, Gorton, and Lee (2010). Questions addressed differences in transcription intelligibility scores among groups, the effects of utterance…

  14. Speech evaluation after palatal augmentation in patients undergoing glossectomy.

    PubMed

    de Carvalho-Teles, Viviane; Sennes, Luiz Ubirajara; Gielow, Ingrid

    2008-10-01

    To assess, in patients undergoing glossectomy, the influence of the palatal augmentation prosthesis on the speech intelligibility and acoustic spectrographic characteristics of the formants of oral vowels in Brazilian Portuguese, specifically the first 3 formants (F1 [/a,e,u/], F2 [/o,ó,u/], and F3 [/a,ó/]). Speech evaluation with and without a palatal augmentation prosthesis using blinded randomized listener judgments. Tertiary referral center. Thirty-six patients (33 men and 3 women) aged 30 to 80 (mean [SD], 53.9 [10.5]) years underwent glossectomy (14, total glossectomy; 12, total glossectomy and partial mandibulectomy; 6, hemiglossectomy; and 4, subtotal glossectomy) with use of the augmentation prosthesis for at least 3 months before inclusion in the study. Spontaneous speech intelligibility (assessed by expert listeners using a 4-category scale) and spectrographic formants assessment. We found a statistically significant improvement of spontaneous speech intelligibility and the average number of correctly identified syllables with the use of the prosthesis (P < .05). Statistically significant differences occurred for the F1 values of the vowels /a,e,u/; for F2 values, there was a significant difference of the vowels /o,ó,u/; and for F3 values, there was a significant difference of the vowels /a,ó/ (P < .001). The palatal augmentation prosthesis improved the intelligibility of spontaneous speech and syllables for patients who underwent glossectomy. It also increased the F2 and F3 values for all vowels and the F1 values for the vowels /o,ó,u/. This effect brought the values of many vowel formants closer to normal.

  15. [Relevance of psychosocial factors in speech rehabilitation after laryngectomy].

    PubMed

    Singer, S; Fuchs, M; Dietz, A; Klemm, E; Kienast, U; Meyer, A; Oeken, J; Täschner, R; Wulke, C; Schwarz, R

    2007-12-01

    It is often assumed that psychosocial and sociodemographic factors determine the success of voice rehabilitation after laryngectomy. The aim of this study was to analyze the association between these parameters. Based on the tumor registries of six ENT clinics, all patients who had been laryngectomized in the preceding years were surveyed (N = 190). The success of voice rehabilitation was assessed as speech intelligibility measured with the postlaryngectomy telephone intelligibility test. Validated and standardized instruments were used where possible to assess the psychosocial parameters. Statistical analysis was done by multiple logistic regression. Low speech intelligibility is associated with reduced conversation (OR 0.970) and social activity (OR 1.049). Patients are more likely to talk with an esophageal voice when their motivation for learning the new voice was high (OR 7.835) and when they assessed their speech therapist as important for their motivation (OR 4.794). The risk of communicating merely by whispering is higher when patients live together with a partner (OR 5.293), when they talk seldom (OR 1.017), and when they are not very active in social contexts (OR 0.966). Psychosocial factors can only partly explain how voice rehabilitation after laryngectomy becomes a success. Speech intelligibility is associated with active communication behaviour, whereas the use of an esophageal voice is correlated with motivation. The acquisition of a tracheoesophageal puncture voice appears to be independent of psychosocial factors.

  16. Contributions of local speech encoding and functional connectivity to audio-visual speech perception

    PubMed Central

    Giordano, Bruno L; Ince, Robin A A; Gross, Joachim; Schyns, Philippe G; Panzeri, Stefano; Kayser, Christoph

    2017-01-01

    Seeing a speaker’s face enhances speech intelligibility in adverse environments. We investigated the underlying network mechanisms by quantifying local speech representations and directed connectivity in MEG data obtained while human participants listened to speech of varying acoustic SNR and visual context. During high acoustic SNR, speech encoding by temporally entrained brain activity was strong in temporal and inferior frontal cortex, while during low SNR, strong entrainment emerged in premotor and superior frontal cortex. These changes in local encoding were accompanied by changes in directed connectivity along the ventral stream and the auditory-premotor axis. Importantly, the behavioral benefit arising from seeing the speaker’s face was not predicted by changes in local encoding but rather by enhanced functional connectivity between temporal and inferior frontal cortex. Our results demonstrate a role of auditory-frontal interactions in visual speech representations and suggest that functional connectivity along the ventral pathway facilitates speech comprehension in multisensory environments. DOI: http://dx.doi.org/10.7554/eLife.24763.001 PMID:28590903

  17. Reception of distorted speech.

    DOT National Transportation Integrated Search

    1973-12-01

    Noise, either in the form of masking or in the form of distortion products, interferes with speech intelligibility. When the signal-to-noise ratio is bad enough, articulation can drop to unacceptably--even dangerously--low levels. However, listeners ...

  18. Speech-Message Extraction from Interference Introduced by External Distributed Sources

    NASA Astrophysics Data System (ADS)

    Kanakov, V. A.; Mironov, N. A.

    2017-08-01

    This study addresses the extraction of a speech signal originating from a specific spatial point and the calculation of the intelligibility of the extracted voice message. The problem is solved by reducing the influence of interfering speech sources on the extracted signal; the method introduces time delays, which depend on the spatial coordinates, into the recording channels. Audio recordings of the voices of eight different people were used as test material. It is shown that increasing the number of microphones improves the intelligibility of the speech message extracted from the interference.
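
    The coordinate-dependent delay method described above corresponds closely to classical delay-and-sum beamforming: each channel is time-aligned to the wavefront from the chosen point and the channels are averaged, so interferers from other points add incoherently. A minimal sketch with integer-sample delays; names are illustrative.

    ```python
    import numpy as np

    def delay_and_sum(channels, mic_pos, src_pos, fs, c=343.0):
        """Steer a microphone array at a spatial point: advance each channel
        by its extra propagation delay from `src_pos`, then average, so the
        target adds coherently and interferers do not.
        channels: (n_mics, n_samples); positions in metres."""
        dists = np.linalg.norm(np.asarray(mic_pos) - np.asarray(src_pos), axis=1)
        delays = (dists - dists.min()) / c         # delay relative to nearest mic
        shifts = np.round(delays * fs).astype(int)
        n = channels.shape[1] - shifts.max()       # common aligned length
        aligned = np.stack([ch[s:s + n] for ch, s in zip(channels, shifts)])
        return aligned.mean(axis=0)
    ```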

  19. Synthesized Speech Output and Children: A Scoping Review

    ERIC Educational Resources Information Center

    Drager, Kathryn D. R.; Reichle, Joe; Pinkoski, Carrie

    2010-01-01

    Purpose: Many computer-based augmentative and alternative communication systems in use by children have speech output. This article (a) provides a scoping review of the literature addressing the intelligibility and listener comprehension of synthesized speech output with children and (b) discusses future research directions. Method: Studies…

  20. Speech versus non-speech as irrelevant sound: controlling acoustic variation.

    PubMed

    Little, Jason S; Martin, Frances Heritage; Thomson, Richard H S

    2010-09-01

    Functional differences between speech and non-speech within the irrelevant sound effect were investigated using repeated and changing formats of irrelevant sounds in the form of intelligible words and unintelligible signal correlated noise (SCN) versions of the words. Event-related potentials were recorded from 25 females aged between 18 and 25 while they completed a serial order recall task in the presence of irrelevant sound or silence. As expected and in line with the changing-state hypothesis both words and SCN produced robust changing-state effects. However, words produced a greater changing-state effect than SCN indicating that the spectral detail inherent within speech accounts for the greater irrelevant sound effect and changing-state effect typically observed with speech. ERP data in the form of N1 amplitude was modulated within some irrelevant sound conditions suggesting that attentional aspects are involved in the elicitation of the irrelevant sound effect. Copyright (c) 2010 Elsevier B.V. All rights reserved.

  1. Successful and rapid response of speech bulb reduction program combined with speech therapy in velopharyngeal dysfunction: a case report.

    PubMed

    Shin, Yu-Jeong; Ko, Seung-O

    2015-12-01

    Velopharyngeal dysfunction in cleft palate patients following primary palate repair may result in nasal air emission, hypernasality, articulation disorder, and poor intelligibility of speech. Among conservative treatment methods, a speech aid prosthesis combined with speech therapy is widely used. However, because treatment typically takes more than a year and its outcome is hard to predict, some clinicians prefer surgical intervention. The purpose of this report was therefore to draw attention to the effectiveness of speech aid prostheses by presenting a successfully treated case. In this clinical report, a speech bulb reduction program with intensive speech therapy was applied to a patient with velopharyngeal dysfunction, and treatment was completed within 5 months, an unusually short period for speech aid therapy. The advantages of preoperative speech aid therapy are also discussed.

  2. Rasch Analysis of Word Identification and Magnitude Estimation Scaling Responses in Measuring Naive Listeners' Judgments of Speech Intelligibility of Children with Severe-to-Profound Hearing Impairments

    ERIC Educational Resources Information Center

    Beltyukova, Svetlana A.; Stone, Gregory M.; Ellis, Lee W.

    2008-01-01

    Purpose: Speech intelligibility research typically relies on traditional evidence of reliability and validity. This investigation used Rasch analysis to enhance understanding of the functioning and meaning of scores obtained with 2 commonly used procedures: word identification (WI) and magnitude estimation scaling (MES). Method: Narrative samples…

  3. Got EQ?: Increasing Cultural and Clinical Competence through Emotional Intelligence

    ERIC Educational Resources Information Center

    Robertson, Shari A.

    2007-01-01

    Cultural intelligence has been described across three parameters of human behavior: cognitive intelligence, emotional intelligence (EQ), and physical intelligence. Each contributes a unique and important perspective to the ability of speech-language pathologists and audiologists to provide benefits to their clients regardless of cultural…

  4. Discrepant visual speech facilitates covert selective listening in "cocktail party" conditions.

    PubMed

    Williams, Jason A

    2012-06-01

    The presence of congruent visual speech information facilitates the identification of auditory speech, while the addition of incongruent visual speech information often impairs accuracy. This latter arrangement occurs naturally when one is being directly addressed in conversation but listens to a different speaker. Under these conditions, performance may diminish since: (a) one is bereft of the facilitative effects of the corresponding lip motion and (b) one becomes subject to visual distortion by incongruent visual speech; by contrast, speech intelligibility may be improved due to (c) bimodal localization of the central unattended stimulus. Participants were exposed to centrally presented visual and auditory speech while attending to a peripheral speech stream. In some trials, the lip movements of the central visual stimulus matched the unattended speech stream; in others, the lip movements matched the attended peripheral speech. Accuracy for the peripheral stimulus was nearly one standard deviation greater with incongruent visual information, compared to the congruent condition which provided bimodal pattern recognition cues. Likely, the bimodal localization of the central stimulus further differentiated the stimuli and thus facilitated intelligibility. Results are discussed with regard to similar findings in an investigation of the ventriloquist effect, and the relative strength of localization and speech cues in covert listening.

  5. Association of Velopharyngeal Insufficiency With Quality of Life and Patient-Reported Outcomes After Speech Surgery.

    PubMed

    Bhuskute, Aditi; Skirko, Jonathan R; Roth, Christina; Bayoumi, Ahmed; Durbin-Johnson, Blythe; Tollefson, Travis T

    2017-09-01

    Patients with cleft palate and other causes of velopharyngeal insufficiency (VPI) suffer adverse effects on social interactions and communication. Measurement of these patient-reported outcomes is needed to help guide surgical and nonsurgical care. To further validate the VPI Effects on Life Outcomes (VELO) instrument, measure the change in quality of life (QOL) after speech surgery, and test the association of change in speech with change in QOL. Prospective descriptive cohort including children and young adults undergoing speech surgery for VPI in a tertiary academic center. Participants completed the validated VELO instrument before and after surgical treatment. The main outcome measures were preoperative and postoperative VELO scores and the perceptual speech assessment of speech intelligibility. The VELO scores are divided into subscale domains. Changes in VELO after surgery were analyzed using linear regression models. VELO scores were analyzed as a function of speech intelligibility adjusting for age and cleft type. The correlation between speech intelligibility rating and VELO scores was estimated using the polyserial correlation. Twenty-nine patients (13 males and 16 females) were included. Mean (SD) age was 7.9 (4.1) years (range, 4-20 years). Pharyngeal flap was used in 14 (48%) cases, Furlow palatoplasty in 12 (41%), and sphincter pharyngoplasty in 1 (3%). The mean (SD) preoperative speech intelligibility rating was 1.71 (1.08), which decreased postoperatively to 0.79 (0.93) in 24 patients who completed protocol (P < .01). The VELO scores improved after surgery (P<.001) as did most subscale scores. Caregiver impact did not change after surgery (P = .36). Speech Intelligibility was correlated with preoperative and postoperative total VELO score (P < .01) and to preoperative subscale domains (situational difficulty [VELO-SiD, P = .005] and perception by others [VELO-PO, P = .05]) and postoperative subscale domains (VELO-SiD [P

  6. Effect of the speed of a single-channel dynamic range compressor on intelligibility in a competing speech task

    NASA Astrophysics Data System (ADS)

    Stone, Michael A.; Moore, Brian C. J.

    2003-08-01

    Using a "noise-vocoder" cochlear implant simulator [Shannon et al., Science 270, 303-304 (1995)], the effect of the speed of dynamic range compression on speech intelligibility was assessed, using normal-hearing subjects. The target speech had a level 5 dB above that of the competing speech. Initially, baseline performance was measured with no compression active, using between 4 and 16 processing channels. Then, performance was measured using a fast-acting compressor and a slow-acting compressor, each operating prior to the vocoder simulation. The fast system produced significant gain variation over syllabic timescales. The slow system produced significant gain variation only over the timescale of sentences. With no compression active, about six channels were necessary to achieve 50% correct identification of words in sentences. Sixteen channels produced near-maximum performance. Slow-acting compression produced no significant degradation relative to the baseline. However, fast-acting compression consistently reduced performance relative to that for the baseline, over a wide range of performance levels. It is suggested that fast-acting compression degrades performance for two reasons: (1) because it introduces correlated fluctuations in amplitude in different frequency bands, which tends to produce perceptual fusion of the target and background sounds and (2) because it reduces amplitude modulation depth and intensity contrasts.
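
    The fast/slow distinction comes down to the attack and release time constants of the compressor's gain control. Below is a minimal single-channel feed-forward sketch, not the compressors used in the study; the ratio and time constants are illustrative.

    ```python
    import numpy as np

    def compress(x, fs, ratio=3.0, attack_s=0.005, release_s=0.050):
        """Single-channel feed-forward compression: track the envelope with
        attack/release smoothing, then scale so output level grows 1/ratio dB
        per dB of input level. Millisecond time constants give 'fast'
        (syllabic) compression; second-scale constants give 'slow'."""
        a_att = np.exp(-1.0 / (attack_s * fs))
        a_rel = np.exp(-1.0 / (release_s * fs))
        env, y = 1e-6, np.empty_like(x)
        for i, s in enumerate(x):
            a = a_att if abs(s) > env else a_rel
            env = a * env + (1 - a) * abs(s)       # smoothed envelope
            gain = min((env + 1e-9) ** (1.0 / ratio - 1.0), 10.0)  # cap boost
            y[i] = s * gain
        return y
    ```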

  7. The effect of bone conduction microphone placement on intensity and spectrum of transmitted speech items.

    PubMed

    Tran, Phuong K; Letowski, Tomasz R; McBride, Maranda E

    2013-06-01

    Speech signals can be converted into electrical audio signals using either a conventional air conduction (AC) microphone or a contact bone conduction (BC) microphone. The goal of this study was to investigate the effects of the location of a BC microphone on the intensity and frequency spectrum of the recorded speech. Twelve locations, 11 on the talker's head and 1 on the collar bone, were investigated. The speech sounds were three vowels (/u/, /a/, /i/) and two consonants (/m/, /ʃ/). The sounds were produced by 12 talkers. Each sound was recorded simultaneously with two BC microphones and an AC microphone. Analyzed spectral data showed that the BC recordings made at the forehead of the talker were the most similar to the AC recordings, whereas the collar bone recordings were most different. Comparison of the spectral data with speech intelligibility data collected in another study revealed a strong negative relationship between BC speech intelligibility and the degree of deviation of the BC speech spectrum from the AC spectrum. In addition, the head locations that resulted in the highest speech intelligibility were associated with the lowest output signals among all tested locations. Implications of these findings for BC communication are discussed.

  8. [Intermodal timing cues for audio-visual speech recognition].

    PubMed

    Hashimoto, Masahiro; Kumashiro, Masaharu

    2004-06-01

    The purpose of this study was to investigate the limitations of lip-reading advantages for Japanese young adults by desynchronizing visual and auditory information in speech. In the experiment, audio-visual speech stimuli were presented under the six test conditions: audio-alone, and audio-visually with either 0, 60, 120, 240 or 480 ms of audio delay. The stimuli were the video recordings of a face of a female Japanese speaking long and short Japanese sentences. The intelligibility of the audio-visual stimuli was measured as a function of audio delays in sixteen untrained young subjects. Speech intelligibility under the audio-delay condition of less than 120 ms was significantly better than that under the audio-alone condition. On the other hand, the delay of 120 ms corresponded to the mean mora duration measured for the audio stimuli. The results implied that audio delays of up to 120 ms would not disrupt lip-reading advantage, because visual and auditory information in speech seemed to be integrated on a syllabic time scale. Potential applications of this research include noisy workplace in which a worker must extract relevant speech from all the other competing noises.

  9. Longitudinal follow-up to evaluate speech disorders in early-treated patients with infantile-onset Pompe disease.

    PubMed

    Zeng, Yin-Ting; Hwu, Wuh-Liang; Torng, Pao-Chuan; Lee, Ni-Chung; Shieh, Jeng-Yi; Lu, Lu; Chien, Yin-Hsiu

    2017-05-01

    Patients with infantile-onset Pompe disease (IOPD) can be treated by recombinant human acid alpha glucosidase (rhGAA) replacement beginning at birth with excellent survival rates, but they still commonly present with speech disorders. This study investigated the progress of speech disorders in these early-treated patients and ascertained the relationship with treatments. Speech disorders, including hypernasal resonance, articulation disorders, and speech intelligibility, were scored by speech-language pathologists using auditory perception in seven early-treated patients over a period of 6 years. Statistical analysis of the first and last evaluations of the patients was performed with the Wilcoxon signed-rank test. A total of 29 speech samples were analyzed. All the patients suffered from hypernasality, articulation disorder, and impairment in speech intelligibility at the age of 3 years. The conditions were stable, and 2 patients developed normal or near normal speech during follow-up. Speech therapy and a high dose of rhGAA appeared to improve articulation in 6 of the 7 patients (86%, p = 0.028) by decreasing the omission of consonants, which consequently increased speech intelligibility (p = 0.041). Severity of hypernasality greatly reduced only in 2 patients (29%, p = 0.131). Speech disorders were common even in early and successfully treated patients with IOPD; however, aggressive speech therapy and high-dose rhGAA could improve their speech disorders. Copyright © 2016 European Paediatric Neurology Society. Published by Elsevier Ltd. All rights reserved.

  10. Neural Oscillations Carry Speech Rhythm through to Comprehension

    PubMed Central

    Peelle, Jonathan E.; Davis, Matthew H.

    2012-01-01

    A key feature of speech is the quasi-regular rhythmic information contained in its slow amplitude modulations. In this article we review the information conveyed by speech rhythm, and the role of ongoing brain oscillations in listeners’ processing of this content. Our starting point is the fact that speech is inherently temporal, and that rhythmic information conveyed by the amplitude envelope contains important markers for place and manner of articulation, segmental information, and speech rate. Behavioral studies demonstrate that amplitude envelope information is relied upon by listeners and plays a key role in speech intelligibility. Extending behavioral findings, data from neuroimaging – particularly electroencephalography (EEG) and magnetoencephalography (MEG) – point to phase locking by ongoing cortical oscillations to low-frequency information (~4–8 Hz) in the speech envelope. This phase modulation effectively encodes a prediction of when important events (such as stressed syllables) are likely to occur, and acts to increase sensitivity to these relevant acoustic cues. We suggest a framework through which such neural entrainment to speech rhythm can explain effects of speech rate on word and segment perception (i.e., that the perception of phonemes and words in connected speech is influenced by preceding speech rate). Neuroanatomically, acoustic amplitude modulations are processed largely bilaterally in auditory cortex, with intelligible speech resulting in differential recruitment of left-hemisphere regions. Notable among these is lateral anterior temporal cortex, which we propose functions in a domain-general fashion to support ongoing memory and integration of meaningful input. Together, the reviewed evidence suggests that low-frequency oscillations in the acoustic speech signal form the foundation of a rhythmic hierarchy supporting spoken language, mirrored by phase-locked oscillations in the human brain. PMID:22973251

  11. The Galker test of speech reception in noise; associations with background variables, middle ear status, hearing, and language in Danish preschool children.

    PubMed

    Lauritsen, Maj-Britt Glenn; Söderström, Margareta; Kreiner, Svend; Dørup, Jens; Lous, Jørgen

    2016-01-01

    We tested "the Galker test", a speech reception in noise test developed for primary care for Danish preschool children, to explore if the children's ability to hear and understand speech was associated with gender, age, middle ear status, and the level of background noise. The Galker test is a 35-item audio-visual, computerized word discrimination test in background noise. Included were 370 normally developed children attending day care center. The children were examined with the Galker test, tympanometry, audiometry, and the Reynell test of verbal comprehension. Parents and daycare teachers completed questionnaires on the children's ability to hear and understand speech. As most of the variables were not assessed using interval scales, non-parametric statistics (Goodman-Kruskal's gamma) were used for analyzing associations with the Galker test score. For comparisons, analysis of variance (ANOVA) was used. Interrelations were adjusted for using a non-parametric graphic model. In unadjusted analyses, the Galker test was associated with gender, age group, language development (Reynell revised scale), audiometry, and tympanometry. The Galker score was also associated with the parents' and day care teachers' reports on the children's vocabulary, sentence construction, and pronunciation. Type B tympanograms were associated with a mean hearing 5-6dB below that of than type A, C1, or C2. In the graphic analysis, Galker scores were closely and significantly related to Reynell test scores (Gamma (G)=0.35), the children's age group (G=0.33), and the day care teachers' assessment of the children's vocabulary (G=0.26). The Galker test of speech reception in noise appears promising as an easy and quick tool for evaluating preschool children's understanding of spoken words in noise, and it correlated well with the day care teachers' reports and less with the parents' reports. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  12. Dysarthria in Mandarin-Speaking Children with Cerebral Palsy: Speech Subsystem Profiles

    ERIC Educational Resources Information Center

    Chen, Li-Mei; Hustad, Katherine C.; Kent, Ray D.; Lin, Yu Ching

    2018-01-01

    Purpose: This study explored the speech characteristics of Mandarin-speaking children with cerebral palsy (CP) and typically developing (TD) children to determine (a) how children in the 2 groups may differ in their speech patterns and (b) the variables correlated with speech intelligibility for words and sentences. Method: Data from 6 children…

  13. Eyes and ears: Using eye tracking and pupillometry to understand challenges to speech recognition.

    PubMed

    Van Engen, Kristin J; McLaughlin, Drew J

    2018-05-04

    Although human speech recognition is often experienced as relatively effortless, a number of common challenges can render the task more difficult. Such challenges may originate in talkers (e.g., unfamiliar accents, varying speech styles), the environment (e.g., noise), or in listeners themselves (e.g., hearing loss, aging, different native language backgrounds). Each of these challenges can reduce the intelligibility of spoken language, but even when intelligibility remains high, they can place greater processing demands on listeners. Noisy conditions, for example, can lead to poorer recall for speech, even when it has been correctly understood. Speech intelligibility measures, memory tasks, and subjective reports of listener difficulty all provide critical information about the effects of such challenges on speech recognition. Eye tracking and pupillometry complement these methods by providing objective physiological measures of online cognitive processing during listening. Eye tracking records the moment-to-moment direction of listeners' visual attention, which is closely time-locked to unfolding speech signals, and pupillometry measures the moment-to-moment size of listeners' pupils, which dilate in response to increased cognitive load. In this paper, we review the uses of these two methods for studying challenges to speech recognition. Copyright © 2018. Published by Elsevier B.V.

  14. Signal Processing Methods for Removing the Effects of Whole Body Vibration upon Speech

    NASA Technical Reports Server (NTRS)

    Bitner, Rachel M.; Begault, Durand R.

    2014-01-01

    Humans may be exposed to whole-body vibration in environments where clear speech communications are crucial, particularly during the launch phases of space flight and in high-performance aircraft. Prior research has shown that high levels of vibration cause a decrease in speech intelligibility. However, the effects of whole-body vibration upon speech are not well understood, and no attempt has been made to restore speech distorted by whole-body vibration. In this paper, a model for speech under whole-body vibration is proposed and a method to remove its effect is described. The method described reduces the perceptual effects of vibration, yields higher automatic speech recognition (ASR) accuracy scores, and may significantly improve intelligibility. Possible applications include incorporation within communication systems to improve radio communication in environments such as spaceflight, aviation, or off-road vehicle operations.

  15. Multichannel spatial auditory display for speech communications

    NASA Technical Reports Server (NTRS)

    Begault, D. R.; Erbe, T.; Wenzel, E. M. (Principal Investigator)

    1994-01-01

    A spatial auditory display for multiple speech communications was developed at NASA/Ames Research Center. Input is spatialized by the use of simplified head-related transfer functions, adapted for FIR filtering on Motorola 56001 digital signal processors. Hardware and firmware design implementations are overviewed for the initial prototype developed for NASA-Kennedy Space Center. An adaptive staircase method was used to determine intelligibility levels of four-letter call signs used by launch personnel at NASA against diotic speech babble. Spatial positions at 30 degrees azimuth increments were evaluated. The results from eight subjects showed a maximum intelligibility improvement of about 6-7 dB when the signal was spatialized to 60 or 90 degrees azimuth positions.

  16. Multichannel spatial auditory display for speech communications.

    PubMed

    Begault, D R; Erbe, T

    1994-10-01

    A spatial auditory display for multiple speech communications was developed at NASA/Ames Research Center. Input is spatialized by the use of simplified head-related transfer functions, adapted for FIR filtering on Motorola 56001 digital signal processors. Hardware and firmware design implementations are overviewed for the initial prototype developed for NASA-Kennedy Space Center. An adaptive staircase method was used to determine intelligibility levels of four-letter call signs used by launch personnel at NASA against diotic speech babble. Spatial positions at 30 degrees azimuth increments were evaluated. The results from eight subjects showed a maximum intelligibility improvement of about 6-7 dB when the signal was spatialized to 60 or 90 degrees azimuth positions.
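
    The core operation of such a display is FIR filtering of each voice with the pair of head-related impulse responses for its assigned azimuth. The prototype ran simplified HRTFs on Motorola 56001 DSPs; the offline sketch below assumes HRIR arrays for the desired direction are already available.

    ```python
    import numpy as np

    def spatialize(mono, hrir_left, hrir_right):
        """Render a mono voice at a virtual azimuth by convolving it with the
        head-related impulse response (HRIR) pair measured for that direction;
        returns a (2, n) stereo array for headphone presentation."""
        return np.stack([np.convolve(mono, hrir_left),
                         np.convolve(mono, hrir_right)])

    # A multi-talker display would sum spatialize(voice_i, hl_i, hr_i)
    # over talkers, one HRIR pair per assigned azimuth.
    ```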

  17. The role of short-time intensity and envelope power for speech intelligibility and psychoacoustic masking.

    PubMed

    Biberger, Thomas; Ewert, Stephan D

    2017-08-01

    The generalized power spectrum model [GPSM; Biberger and Ewert (2016). J. Acoust. Soc. Am. 140, 1023-1038], combining the "classical" concept of the power-spectrum model (PSM) and the envelope power spectrum model (EPSM), was demonstrated to account for several psychoacoustic and speech intelligibility (SI) experiments. The PSM path of the model uses long-time power signal-to-noise ratios (SNRs), while the EPSM path uses short-time envelope power SNRs. A systematic comparison of existing SI models for several spectro-temporal manipulations of speech maskers and gender combinations of target and masker speakers [Schubotz et al. (2016). J. Acoust. Soc. Am. 140, 524-540] showed the importance of short-time power features. Conversely, Jørgensen et al. [(2013). J. Acoust. Soc. Am. 134, 436-446] demonstrated a higher predictive power of short-time envelope power SNRs than power SNRs using reverberation and spectral subtraction. Here the GPSM was extended to utilize short-time power SNRs and was shown to account for all psychoacoustic and SI data of the three mentioned studies. The best processing strategy was to exclusively use either power or envelope-power SNRs, depending on the experimental task. By analyzing both domains, the suggested model might provide a useful tool for clarifying the contribution of amplitude modulation masking and energetic masking.
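
    The two SNR domains the GPSM combines can be illustrated in broadband form: a long-time power SNR and a short-time envelope-power SNR (AC envelope power normalized by its squared mean, per analysis window). This is a heavily simplified sketch that omits the model's gammatone and modulation filterbanks; the window length and names are assumptions.

    ```python
    import numpy as np
    from scipy.signal import hilbert

    def power_snr_db(speech, noise):
        """Long-time power SNR (the PSM path)."""
        return 10 * np.log10(np.mean(speech ** 2) / np.mean(noise ** 2))

    def envelope_power_snr_db(speech, noise, fs, win_s=0.4):
        """Short-time envelope-power SNR (the EPSM path): AC power of the
        Hilbert envelope, normalized by its squared mean per window."""
        def env_power(sig):
            env = np.abs(hilbert(sig))
            n = int(win_s * fs)
            frames = env[: len(env) // n * n].reshape(-1, n)
            ac = frames - frames.mean(axis=1, keepdims=True)
            return np.mean(ac ** 2, axis=1) / (frames.mean(axis=1) ** 2 + 1e-12)
        return 10 * np.log10(np.mean(env_power(speech)) /
                             (np.mean(env_power(noise)) + 1e-12))
    ```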

  18. Optimal speech level for speech transmission in a noisy environment for young adults and aged persons

    NASA Astrophysics Data System (ADS)

    Sato, Hayato; Ota, Ryo; Morimoto, Masayuki; Sato, Hiroshi

    2005-04-01

    Assessing the sound environment of classrooms for the aged is a very important issue, because classrooms can be used by the aged for lifelong learning, especially in an aged society. Hence, hearing loss due to aging is a considerable factor for classrooms. In this study, the optimal speech level in noisy fields for both young adults and aged persons was investigated. Listening difficulty ratings and word intelligibility scores for familiar words were used to evaluate speech transmission performance. The results of the tests demonstrated that the optimal speech level for moderate background noise (i.e., less than around 60 dBA) was fairly constant. Meanwhile, the optimal speech level depended on the speech-to-noise ratio when the background noise level exceeded around 60 dBA. The minimum required speech level to minimize difficulty ratings for the aged was higher than that for the young. However, the minimum difficulty ratings for both the young and the aged were obtained in the speech-level range of 70 to 80 dBA.

  19. Intensive Speech and Language Therapy for Older Children with Cerebral Palsy: A Systems Approach

    ERIC Educational Resources Information Center

    Pennington, Lindsay; Miller, Nick; Robson, Sheila; Steen, Nick

    2010-01-01

    Aim: To investigate whether speech therapy using a speech systems approach to controlling breath support, phonation, and speech rate can increase the speech intelligibility of children with dysarthria and cerebral palsy (CP). Method: Sixteen children with dysarthria and CP participated in a modified time series design. Group characteristics were…

  20. Linkage of Speech Sound Disorder to Reading Disability Loci

    ERIC Educational Resources Information Center

    Smith, Shelley D.; Pennington, Bruce F.; Boada, Richard; Shriberg, Lawrence D.

    2005-01-01

    Background: Speech sound disorder (SSD) is a common childhood disorder characterized by developmentally inappropriate errors in speech production that greatly reduce intelligibility. SSD has been found to be associated with later reading disability (RD), and there is also evidence for both a cognitive and etiological overlap between the two…

  1. Speech and communication in Parkinson’s disease: a cross-sectional exploratory study in the UK

    PubMed Central

    Barnish, Maxwell S; Horton, Simon M C; Butterfint, Zoe R; Clark, Allan B; Atkinson, Rachel A; Deane, Katherine H O

    2017-01-01

    Objective To assess associations between cognitive status, intelligibility, acoustics and functional communication in PD. Design Cross-sectional exploratory study of functional communication, including a within-participants experimental design for listener assessment. Setting A major academic medical centre in the East of England, UK. Participants Questionnaire data were assessed for 45 people with Parkinson’s disease (PD), who had self-reported speech or communication difficulties and did not have clinical dementia. Acoustic and listener analyses were conducted on read and conversational speech for 20 people with PD and 20 familiar conversation partner controls without speech, language or cognitive difficulties. Main outcome measures Functional communication assessed by the Communicative Participation Item Bank (CPIB) and Communicative Effectiveness Survey (CES). Results People with PD had lower intelligibility than controls for both the read (mean difference 13.7%, p=0.009) and conversational (mean difference 16.2%, p=0.04) sentences. Intensity and pause were statistically significant predictors of intelligibility in read sentences. Listeners were less accurate identifying the intended emotion in the speech of people with PD (14.8% point difference across conditions, p=0.02) and this was associated with worse speaker cognitive status (16.7% point difference, p=0.04). Cognitive status was a significant predictor of functional communication using CPIB (F=8.99, p=0.005, η2 = 0.15) but not CES. Intelligibility in conversation sentences was a statistically significant predictor of CPIB (F=4.96, p=0.04, η2 = 0.19) and CES (F=13.65, p=0.002, η2 = 0.43). Read sentence intelligibility was not a significant predictor of either outcome. Conclusions Cognitive status was an important predictor of functional communication—the role of intelligibility was modest and limited to conversational and not read speech. Our results highlight the importance of focusing on

  2. Acoustic Changes in the Speech of Children with Cerebral Palsy Following an Intensive Program of Dysarthria Therapy

    ERIC Educational Resources Information Center

    Pennington, Lindsay; Lombardo, Eftychia; Steen, Nick; Miller, Nick

    2018-01-01

    Background: The speech intelligibility of children with dysarthria and cerebral palsy has been observed to increase following therapy focusing on respiration and phonation. Aims: To determine if speech intelligibility change following intervention is associated with change in acoustic measures of voice. Methods & Procedures: We recorded 16…

  3. GRIN2A: an aptly named gene for speech dysfunction.

    PubMed

    Turner, Samantha J; Mayes, Angela K; Verhoeven, Andrea; Mandelstam, Simone A; Morgan, Angela T; Scheffer, Ingrid E

    2015-02-10

    To delineate the specific speech deficits in individuals with epilepsy-aphasia syndromes associated with mutations in the glutamate receptor subunit gene GRIN2A. We analyzed the speech phenotype associated with GRIN2A mutations in 11 individuals, aged 16 to 64 years, from 3 families. Standardized clinical speech assessments and perceptual analyses of conversational samples were conducted. Individuals showed a characteristic phenotype of dysarthria and dyspraxia with lifelong impact on speech intelligibility in some. Speech was typified by imprecise articulation (11/11, 100%), impaired pitch (monopitch 10/11, 91%) and prosody (stress errors 7/11, 64%), and hypernasality (7/11, 64%). Oral motor impairments and poor performance on maximum vowel duration (8/11, 73%) and repetition of monosyllables (10/11, 91%) and trisyllables (7/11, 64%) supported conversational speech findings. The speech phenotype was present in one individual who did not have seizures. Distinctive features of dysarthria and dyspraxia are found in individuals with GRIN2A mutations, often in the setting of epilepsy-aphasia syndromes; dysarthria has not been previously recognized in these disorders. Of note, the speech phenotype may occur in the absence of a seizure disorder, reinforcing an important role for GRIN2A in motor speech function. Our findings highlight the need for precise clinical speech assessment and intervention in this group. By understanding the mechanisms involved in GRIN2A disorders, targeted therapy may be designed to improve chronic lifelong deficits in intelligibility. © 2015 American Academy of Neurology.

  4. Acoustic assessment of speech privacy curtains in two nursing units

    PubMed Central

    Pope, Diana S.; Miller-Klein, Erik T.

    2016-01-01

    Hospitals have complex soundscapes that create challenges to patient care. Extraneous noise and high reverberation rates impair speech intelligibility, which leads to raised voices. In an unintended spiral, the increasing noise may result in diminished speech privacy, as people speak loudly to be heard over the din. The products available to improve hospital soundscapes include construction materials that absorb sound (acoustic ceiling tiles, carpet, wall insulation) and reduce reverberation rates. Enhanced privacy curtains are now available and offer potential for a relatively simple way to improve speech privacy and speech intelligibility by absorbing sound at the hospital patient's bedside. Acoustic assessments were performed over 2 days on two nursing units with a similar design in the same hospital. One unit was built with 1970s-standard hospital construction and the other was newly refurbished (2013) with sound-absorbing features. In addition, we determined the effect of an enhanced privacy curtain versus standard privacy curtains using acoustic measures of speech privacy and speech intelligibility indexes. Privacy curtains provided auditory protection for the patients. In general, that protection was increased by the use of enhanced privacy curtains. On average, the enhanced curtain improved sound absorption from 20% to 30%; however, there was considerable variability, depending on the configuration of the rooms tested. Enhanced privacy curtains provide measurable improvement to the acoustics of patient rooms but cannot overcome larger acoustic design issues. To shorten reverberation time, additional absorption and more compact, fragmented nursing unit floor plates should be considered. PMID:26780959

  5. Acoustic assessment of speech privacy curtains in two nursing units.

    PubMed

    Pope, Diana S; Miller-Klein, Erik T

    2016-01-01

    Hospitals have complex soundscapes that create challenges to patient care. Extraneous noise and high reverberation rates impair speech intelligibility, which leads to raised voices. In an unintended spiral, the increasing noise may result in diminished speech privacy, as people speak loudly to be heard over the din. The products available to improve hospital soundscapes include construction materials that absorb sound (acoustic ceiling tiles, carpet, wall insulation) and reduce reverberation rates. Enhanced privacy curtains are now available and offer potential for a relatively simple way to improve speech privacy and speech intelligibility by absorbing sound at the hospital patient's bedside. Acoustic assessments were performed over 2 days on two nursing units with a similar design in the same hospital. One unit was built with 1970s-standard hospital construction and the other was newly refurbished (2013) with sound-absorbing features. In addition, we determined the effect of an enhanced privacy curtain versus standard privacy curtains using acoustic measures of speech privacy and speech intelligibility indexes. Privacy curtains provided auditory protection for the patients. In general, that protection was increased by the use of enhanced privacy curtains. On average, the enhanced curtain improved sound absorption from 20% to 30%; however, there was considerable variability, depending on the configuration of the rooms tested. Enhanced privacy curtains provide measurable improvement to the acoustics of patient rooms but cannot overcome larger acoustic design issues. To shorten reverberation time, additional absorption and more compact, fragmented nursing unit floor plates should be considered.
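
    The last two records attribute shorter reverberation times to added absorption. Sabine's classical formula makes that relationship concrete; the sketch below applies it to a hypothetical patient room. All room dimensions and absorption values are illustrative assumptions, not measurements from the study.

    ```python
    # Sabine's formula relates reverberation time to room volume and total
    # absorption: RT60 = 0.161 * V / A, with V in cubic metres and A in
    # square-metre sabins. The values below are illustrative assumptions.

    def rt60_sabine(volume_m3: float, absorption_sabins: float) -> float:
        """Reverberation time (seconds) by Sabine's formula."""
        return 0.161 * volume_m3 / absorption_sabins

    volume = 40.0 * 3.0                  # hypothetical 40 m^2 room, 3 m high
    hard_room = 25.0                     # total absorption, hard surfaces (sabins)
    treated_room = hard_room * 1.25      # ~25% more absorption, e.g., enhanced curtain

    print(f"RT60, standard room: {rt60_sabine(volume, hard_room):.2f} s")
    print(f"RT60, treated room:  {rt60_sabine(volume, treated_room):.2f} s")
    ```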

  6. Effectiveness of Speech Therapy in Adults with Intellectual Disabilities

    ERIC Educational Resources Information Center

    Terband, Hayo; Coppens-Hofman, Marjolein C.; Reffeltrath, Maaike; Maassen, Ben A. M.

    2018-01-01

    Background: This study investigated the effect of speech therapy in a heterogeneous group of adults with intellectual disability. Method: Thirty-six adults with mild and moderate intellectual disabilities (IQs 40-70; age 18-40 years) with reported poor speech intelligibility received tailored training in articulation and listening skills delivered…

  7. Cross-Channel Amplitude Sweeps Are Crucial to Speech Intelligibility

    ERIC Educational Resources Information Center

    Prendergast, Garreth; Green, Gary G. R.

    2012-01-01

    Classical views of speech perception argue that the static and dynamic characteristics of spectral energy peaks (formants) are the acoustic features that underpin phoneme recognition. Here we use representations where the amplitude modulations of sub-band filtered speech are described, precisely, in terms of co-sinusoidal pulses. These pulses are…

  8. Differential effects of speech prostheses in glossectomized patients.

    PubMed

    Leonard, R J; Gillis, R

    1990-12-01

    Five patients representing different categories of glossal resection were fitted with prostheses specifically designed to improve speech. Speech recordings made for subjects with and without their prostheses were subjected to a variety of analyses. Prosthetic influence on listeners' judgments of severity level/intelligibility, on the number of consonants in error, and on the acoustic measure of vowel F2 range was evaluated. Findings indicated that all subjects demonstrated improvement on the speech measures. However, the extent of improvement on each measure varied across speakers and resection categories. Implications of the findings for prosthetic speech rehabilitation in this population are discussed.

  9. Speech impairment in Down syndrome: a review.

    PubMed

    Kent, Ray D; Vorperian, Houri K

    2013-02-01

    This review summarizes research on disorders of speech production in Down syndrome (DS) for the purposes of informing clinical services and guiding future research. Review of the literature was based on searches using MEDLINE, Google Scholar, PsycINFO, and HighWire Press, as well as consideration of reference lists in retrieved documents (including online sources). Search terms emphasized functions related to voice, articulation, phonology, prosody, fluency, and intelligibility. The following conclusions pertain to four major areas of review: voice, speech sounds, fluency and prosody, and intelligibility. The first major area is voice. Although a number of studies have reported on vocal abnormalities in DS, major questions remain about the nature and frequency of the phonatory disorder. Results of perceptual and acoustic studies have been mixed, making it difficult to draw firm conclusions or even to identify sensitive measures for future study. The second major area is speech sounds. Articulatory and phonological studies show that speech patterns in DS are a combination of delayed development and errors not seen in typical development. Delayed (i.e., developmental) and disordered (i.e., nondevelopmental) patterns are evident by the age of about 3 years, although DS-related abnormalities possibly appear earlier, even in infant babbling. The third major area is fluency and prosody. Stuttering and/or cluttering occur in DS at rates of 10%-45%, compared with about 1% in the general population. Research also points to significant disturbances in prosody. The fourth major area is intelligibility. Studies consistently show marked limitations in this area, but only recently has the research gone beyond simple rating scales.

  10. Children's Attitudes Toward Peers With Unintelligible Speech Associated With Cleft Lip and/or Palate.

    PubMed

    Lee, Alice; Gibbon, Fiona E; Spivey, Kimberley

    2017-05-01

    The objective of this study was to investigate whether reduced speech intelligibility in children with cleft palate affects social and personal attribute judgments made by typically developing children of different ages. The study (1) measured the correlation between intelligibility scores of speech samples from children with cleft palate and social and personal attribute judgments made by typically developing children based on these samples and (2) compared the attitude judgments made by children of different ages. Participants were a total of 90 typically developing children, 30 in each of three age groups (7 to 8 years, 9 to 10 years, and 11 to 12 years). Speech intelligibility scores and typically developing children's attitudes were measured using eight social and personal attributes on a three-point rating scale. There was a significant correlation between the speech intelligibility scores and attitude judgments for a number of traits: "sick-healthy" as rated by the children aged 7 to 8 years, "no friends-friends" by the children aged 9 to 10 years, and "ugly-good looking" and "no friends-friends" by the children aged 11 to 12 years. Children aged 7 to 8 years gave significantly lower ratings for "mean-kind" but higher ratings for "shy-outgoing" when compared with the other two groups. Typically developing children tended to make negative social and personal attribute judgments about children with cleft palate based solely on the intelligibility of their speech. Society, educators, and health professionals should work together to ensure that children with cleft palate are not stigmatized by their peers.

  11. The Auditory-Brainstem Response to Continuous, Non-repetitive Speech Is Modulated by the Speech Envelope and Reflects Speech Processing

    PubMed Central

    Reichenbach, Chagit S.; Braiman, Chananel; Schiff, Nicholas D.; Hudspeth, A. J.; Reichenbach, Tobias

    2016-01-01

    The auditory-brainstem response (ABR) to short and simple acoustical signals is an important clinical tool used to diagnose the integrity of the brainstem. The ABR is also employed to investigate the auditory brainstem in a multitude of tasks related to hearing, such as processing speech or selectively focusing on one speaker in a noisy environment. Such research measures the response of the brainstem to short speech signals such as vowels or words. Because the voltage signal of the ABR has a tiny amplitude, several hundred to a thousand repetitions of the acoustic signal are needed to obtain a reliable response. The large number of repetitions poses a challenge to assessing cognitive functions due to neural adaptation. Here we show that continuous, non-repetitive speech, lasting several minutes, may be employed to measure the ABR. Because the speech is not repeated during the experiment, the precise temporal form of the ABR cannot be determined. We show, however, that important structural features of the ABR can nevertheless be inferred. In particular, the brainstem responds at the fundamental frequency of the speech signal, and this response is modulated by the envelope of the voiced parts of speech. We accordingly introduce a novel measure that assesses the ABR as modulated by the speech envelope, at the fundamental frequency of speech and at the characteristic latency of the response. This measure has a high signal-to-noise ratio and can hence be employed effectively to measure the ABR to continuous speech. We use this novel measure to show that the ABR is weaker to intelligible speech than to unintelligible, time-reversed speech. The methods presented here can be employed for further research on speech processing in the auditory brainstem and can lead to the development of future clinical diagnosis of brainstem function. PMID:27303286
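
    The measure introduced in this record evaluates the brainstem response at the fundamental frequency of speech, as modulated by the envelope, at the characteristic response latency. Below is a minimal sketch of that idea, assuming a 100 Hz fundamental and a 9 ms latency (illustrative values; the paper's exact pipeline is not reproduced here), with `eeg` and `speech` as equal-length recordings at the same sampling rate.

    ```python
    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def lowpass_envelope(x, fs, cutoff=40.0):
        # Magnitude of the analytic signal, smoothed below `cutoff` Hz.
        b, a = butter(4, cutoff / (fs / 2), btype="low")
        return filtfilt(b, a, np.abs(hilbert(x)))

    def f0_band(x, fs, f0, half_bw=10.0):
        # Narrow band-pass around the speech fundamental.
        b, a = butter(4, [(f0 - half_bw) / (fs / 2), (f0 + half_bw) / (fs / 2)],
                      btype="band")
        return filtfilt(b, a, x)

    def envelope_modulated_response(eeg, speech, fs, f0=100.0, latency_s=0.009):
        # Correlate the EEG amplitude at F0 with the speech envelope,
        # delayed by the assumed brainstem response latency.
        lag = int(round(latency_s * fs))
        eeg_amp = lowpass_envelope(f0_band(eeg, fs, f0), fs)
        speech_env = lowpass_envelope(speech, fs)
        return np.corrcoef(speech_env[:-lag], eeg_amp[lag:])[0, 1]
    ```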

  12. Noise-immune multisensor transduction of speech

    NASA Astrophysics Data System (ADS)

    Viswanathan, Vishu R.; Henry, Claudia M.; Derr, Alan G.; Roucos, Salim; Schwartz, Richard M.

    1986-08-01

    Two types of configurations of multiple sensors were developed, tested, and evaluated for robust performance in high levels of acoustic background noise in speech communication and recognition applications: One type combines the individual sensor signals to provide a single speech signal input, and the other provides several parallel inputs. For single-input systems, several configurations of multiple sensors were developed and tested. Results from formal speech intelligibility and quality tests in simulated fighter aircraft cockpit noise show that each of the two-sensor configurations tested outperforms the constituent individual sensors in high noise. Also presented are results comparing the performance of two-sensor configurations and individual sensors in speaker-dependent, isolated-word speech recognition tests performed using a commercial recognizer (Verbex 4000) in simulated fighter aircraft cockpit noise.
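
    For the single-input configuration described above, one plausible combination rule is to weight each sensor by an estimate of its local signal-to-noise ratio before summing. The Wiener-style weighting below is an assumption for illustration; the report does not specify the actual combination network.

    ```python
    import numpy as np

    def fuse_sensors(signals, noise_floor_power):
        """signals: (n_sensors, n_samples); noise_floor_power: per-sensor estimate."""
        signals = np.asarray(signals, dtype=float)
        power = signals.var(axis=1)
        snr = np.maximum(power / noise_floor_power - 1.0, 0.0)  # rough per-sensor SNR
        w = snr / (snr + 1.0)                                   # Wiener-style weights
        w = w / (w.sum() + 1e-12)                               # normalize
        return w @ signals                                      # single fused input
    ```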

  13. Speech and Swallowing in Parkinson’s Disease

    PubMed Central

    Tjaden, Kris

    2009-01-01

    Dysarthria and dysphagia occur frequently in Parkinson’s disease (PD). Reduced speech intelligibility is a significant functional limitation of dysarthria, and in the case of PD is likely related articulatory and phonatory impairment. Prosodically-based treatments show the most promise for addressing these deficits as well as for maximizing speech intelligibility. Communication-oriented strategies also may help to enhance mutual understanding between a speaker and listener. Dysphagia in PD can result in serious health issues, including aspiration pneumonia, malnutrition, and dehydration. Early identification of swallowing abnormalities is critical so as to minimize the impact of dysphagia on health status and quality of life. Feeding modifications, compensatory strategies, and therapeutic swallowing techniques all have a role in the management of dysphagia in PD. PMID:19946386

  14. The Comprehension of Rapid Speech by the Blind: Part III. Final Report.

    ERIC Educational Resources Information Center

    Foulke, Emerson

    Accounts of completed and ongoing research conducted from 1964 to 1968 are presented on the subject of accelerated speech as a substitute for the written word. Included are a review of the research on intelligibility and comprehension of accelerated speech, some methods for controlling the word rate of recorded speech, and a comparison of…

  15. The Hypothesis of Apraxia of Speech in Children with Autism Spectrum Disorder

    ERIC Educational Resources Information Center

    Shriberg, Lawrence D.; Paul, Rhea; Black, Lois M.; van Santen, Jan P.

    2011-01-01

    In a sample of 46 children aged 4-7 years with Autism Spectrum Disorder (ASD) and intelligible speech, there was no statistical support for the hypothesis of concomitant Childhood Apraxia of Speech (CAS). Perceptual and acoustic measures of participants' speech, prosody, and voice were compared with data from 40 typically-developing children, 13…

  16. [Improving speech comprehension using a new cochlear implant speech processor].

    PubMed

    Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A

    2009-06-01

    The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who demonstrated speech comprehension performance with their current speech processor on the Oldenburg sentence test (OLSA) in quiet conditions of at least 80% correct scores and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise. In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improved signal-to-noise ratio for speech comprehension thresholds (i.e., signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg

  17. Open ended intelligence: the individuation of intelligent agents

    NASA Astrophysics Data System (ADS)

    Weinbaum Weaver, David; Veitas, Viktoras

    2017-03-01

    Artificial general intelligence is a field of research aiming to distil the principles of intelligence that operate independently of a specific problem domain and utilise these principles in order to synthesise systems capable of performing any intellectual task a human being is capable of and beyond. While "narrow" artificial intelligence which focuses on solving specific problems such as speech recognition, text comprehension, visual pattern recognition and robotic motion has shown impressive breakthroughs lately, understanding general intelligence remains elusive. We propose a paradigm shift from intelligence perceived as a competence of individual agents defined in relation to an a priori given problem domain or a goal, to intelligence perceived as a formative process of self-organisation. We call this process open-ended intelligence. Starting with a brief introduction of the current conceptual approach, we expose a number of serious limitations that are traced back to the ontological roots of the concept of intelligence. Open-ended intelligence is then developed as an abstraction of the process of human cognitive development, so its application can be extended to general agents and systems. We introduce and discuss three facets of the idea: the philosophical concept of individuation, sense-making and the individuation of general cognitive agents. We further show how open-ended intelligence can be framed in terms of a distributed, self-organising network of interacting elements and how such process is scalable. The framework highlights an important relation between coordination and intelligence and a new understanding of values.

  18. Multi-channel spatial auditory display for speech communications

    NASA Astrophysics Data System (ADS)

    Begault, Durand; Erbe, Tom

    1993-10-01

    A spatial auditory display for multiple speech communications was developed at NASA-Ames Research Center. Input is spatialized by use of simplified head-related transfer functions, adapted for FIR filtering on Motorola 56001 digital signal processors. Hardware and firmware design implementations are overviewed for the initial prototype developed for NASA-Kennedy Space Center. An adaptive staircase method was used to determine intelligibility levels of four-letter call signs used by launch personnel at NASA, against diotic speech babble. Spatial positions at 30 deg azimuth increments were evaluated. The results from eight subjects showed a maximal intelligibility improvement of about 6 to 7 dB when the signal was spatialized to 60 deg or 90 deg azimuth positions.

  19. Multi-channel spatial auditory display for speech communications

    NASA Technical Reports Server (NTRS)

    Begault, Durand; Erbe, Tom

    1993-01-01

    A spatial auditory display for multiple speech communications was developed at NASA-Ames Research Center. Input is spatialized by use of simplified head-related transfer functions, adapted for FIR filtering on Motorola 56001 digital signal processors. Hardware and firmware design implementations are overviewed for the initial prototype developed for NASA-Kennedy Space Center. An adaptive staircase method was used to determine intelligibility levels of four-letter call signs used by launch personnel at NASA, against diotic speech babble. Spatial positions at 30 deg azimuth increments were evaluated. The results from eight subjects showed a maximal intelligibility improvement of about 6 to 7 dB when the signal was spatialized to 60 deg or 90 deg azimuth positions.
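
    The spatialization step in these two records amounts to FIR-filtering a monaural signal with a left/right pair of head-related impulse responses (HRIRs). A minimal sketch follows, using two-tap placeholder HRIRs that encode only interaural delay and head shadow; real displays use measured (or minimum-phase simplified) responses of a few hundred taps.

    ```python
    import numpy as np

    fs = 16000
    itd = int(0.0005 * fs)  # ~0.5 ms interaural time difference (placeholder)
    # Source at roughly 60 degrees to the right: the near (right) ear gets the
    # direct sound; the far (left) ear gets it delayed and attenuated.
    hrir_right = np.concatenate([[1.0], np.zeros(itd)])
    hrir_left = np.concatenate([np.zeros(itd), [0.5]])

    def spatialize(mono, hl, hr):
        """Return a (2, n) stereo signal for one virtual source position."""
        return np.stack([np.convolve(mono, hl), np.convolve(mono, hr)])
    ```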

  20. Asynchronous sampling of speech with some vocoder experimental results

    NASA Technical Reports Server (NTRS)

    Babcock, M. L.

    1972-01-01

    The method of asynchronously sampling speech is based upon the derivatives of the acoustical speech signal. The following results are apparent from experiments to date: (1) It is possible to represent speech by a string of pulses of uniform amplitude, where the only information contained in the string is the spacing of the pulses in time; (2) the string of pulses may be produced in a simple analog manner; (3) the first derivative of the original speech waveform is the most important for the encoding process; (4) the resulting pulse train can be utilized to control an acoustical signal production system to regenerate the intelligence of the original speech.
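
    Result (3) above suggests a simple digital analogue of the scheme: emit a pulse wherever the first derivative of the waveform changes sign, so that all information lies in pulse timing. The sketch below is one plausible reading of the method, not the original analog circuit.

    ```python
    import numpy as np

    def derivative_pulse_times(x, fs):
        """Pulse times (seconds) at sign changes of the first derivative,
        i.e., at the extrema of the speech waveform."""
        dx = np.diff(x)                          # discrete first derivative
        sign_change = np.signbit(dx[:-1]) != np.signbit(dx[1:])
        return np.nonzero(sign_change)[0] / fs   # uniform-amplitude pulse train
    ```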

  1. A "Goldilocks" Approach to Hearing Aid Self-Fitting: Ear-Canal Output and Speech Intelligibility Index.

    PubMed

    Mackersie, Carol; Boothroyd, Arthur; Lithgow, Alexandra

    2018-06-11

    The objective was to determine self-adjusted output response and speech intelligibility index (SII) in individuals with mild to moderate hearing loss and to measure the effects of prior hearing aid experience. Thirteen hearing aid users and 13 nonusers, with similar group-mean pure-tone thresholds, listened to prerecorded and preprocessed sentences spoken by a man. Starting with a generic level and spectrum, participants adjusted (1) overall level, (2) high-frequency boost, and (3) low-frequency cut. Participants took a speech perception test after an initial adjustment before making a final adjustment. The three self-selected parameters, along with individual thresholds and real-ear-to-coupler differences, were used to compute output levels and SIIs for the starting and two self-adjusted conditions. The values were compared with the NAL-NL2 prescription (the National Acoustic Laboratories' second nonlinear, threshold-based procedure) and, for the hearing aid users, performance of their existing hearing aids. All participants were able to complete the self-adjustment process. The generic starting condition provided outputs (between 2 and 8 kHz) and SIIs that were significantly below those prescribed by NAL-NL2. Both groups increased SII to values that were not significantly different from prescription. The hearing aid users, but not the nonusers, increased high-frequency output and SII significantly after taking the speech perception test. Seventeen of the 26 participants (65%) met an SII criterion of 60% under the generic starting condition. The proportion increased to 23 out of 26 (88%) after the final self-adjustment. Of the 13 hearing aid users, 8 (62%) met the 60% criterion with their existing hearing aids. With the final self-adjustment, 12 out of 13 (92%) met this criterion. The findings support the conclusion that user self-adjustment of basic amplification characteristics can be both feasible and effective with or without prior hearing aid experience.
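
    The SII computation at the heart of this study weights band audibility by a band-importance function. The following is a much simplified sketch of that idea; the band centres and importance weights are placeholders rather than the ANSI S3.5 tables used for a real SII.

    ```python
    import numpy as np

    BAND_HZ = [250, 500, 1000, 2000, 4000]                   # placeholder bands
    IMPORTANCE = np.array([0.10, 0.15, 0.25, 0.30, 0.20])    # placeholder weights, sum 1

    def sii_like(speech_db, noise_or_threshold_db):
        """Band levels in dB; returns a 0-1 intelligibility index."""
        snr = np.asarray(speech_db, float) - np.asarray(noise_or_threshold_db, float)
        # Audibility grows linearly over a 30-dB window (-15 to +15 dB SNR).
        audibility = np.clip((snr + 15.0) / 30.0, 0.0, 1.0)
        return float(IMPORTANCE @ audibility)
    ```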

  2. Phase-Locked Responses to Speech in Human Auditory Cortex are Enhanced During Comprehension

    PubMed Central

    Peelle, Jonathan E.; Gross, Joachim; Davis, Matthew H.

    2013-01-01

    A growing body of evidence shows that ongoing oscillations in auditory cortex modulate their phase to match the rhythm of temporally regular acoustic stimuli, increasing sensitivity to relevant environmental cues and improving detection accuracy. In the current study, we test the hypothesis that nonsensory information provided by linguistic content enhances phase-locked responses to intelligible speech in the human brain. Sixteen adults listened to meaningful sentences while we recorded neural activity using magnetoencephalography. Stimuli were processed using a noise-vocoding technique to vary intelligibility while keeping the temporal acoustic envelope consistent. We show that the acoustic envelopes of sentences contain most power between 4 and 7 Hz and that it is in this frequency band that phase locking between neural activity and envelopes is strongest. Bilateral oscillatory neural activity phase-locked to unintelligible speech, but this cerebro-acoustic phase locking was enhanced when speech was intelligible. This enhanced phase locking was left lateralized and localized to left temporal cortex. Together, our results demonstrate that entrainment to connected speech does not only depend on acoustic characteristics, but is also affected by listeners’ ability to extract linguistic information. This suggests a biological framework for speech comprehension in which acoustic and linguistic cues reciprocally aid in stimulus prediction. PMID:22610394

  3. Phase-locked responses to speech in human auditory cortex are enhanced during comprehension.

    PubMed

    Peelle, Jonathan E; Gross, Joachim; Davis, Matthew H

    2013-06-01

    A growing body of evidence shows that ongoing oscillations in auditory cortex modulate their phase to match the rhythm of temporally regular acoustic stimuli, increasing sensitivity to relevant environmental cues and improving detection accuracy. In the current study, we test the hypothesis that nonsensory information provided by linguistic content enhances phase-locked responses to intelligible speech in the human brain. Sixteen adults listened to meaningful sentences while we recorded neural activity using magnetoencephalography. Stimuli were processed using a noise-vocoding technique to vary intelligibility while keeping the temporal acoustic envelope consistent. We show that the acoustic envelopes of sentences contain most power between 4 and 7 Hz and that it is in this frequency band that phase locking between neural activity and envelopes is strongest. Bilateral oscillatory neural activity phase-locked to unintelligible speech, but this cerebro-acoustic phase locking was enhanced when speech was intelligible. This enhanced phase locking was left lateralized and localized to left temporal cortex. Together, our results demonstrate that entrainment to connected speech does not only depend on acoustic characteristics, but is also affected by listeners' ability to extract linguistic information. This suggests a biological framework for speech comprehension in which acoustic and linguistic cues reciprocally aid in stimulus prediction.
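
    The cerebro-acoustic phase locking reported in these two records can be approximated by magnitude-squared coherence between the speech envelope and a neural channel, averaged over the 4-7 Hz band where sentence envelopes carry most power. A minimal sketch, with the sampling rate and window length as illustrative assumptions:

    ```python
    import numpy as np
    from scipy.signal import coherence

    def theta_band_coupling(speech_envelope, neural, fs=200.0):
        """Mean speech-brain coherence in the 4-7 Hz band."""
        f, cxy = coherence(speech_envelope, neural, fs=fs, nperseg=int(4 * fs))
        band = (f >= 4.0) & (f <= 7.0)
        return cxy[band].mean()
    ```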

  4. Speech characteristics in a Ugandan child with a rare paramedian craniofacial cleft: a case report.

    PubMed

    Van Lierde, K M; Bettens, K; Luyten, A; De Ley, S; Tungotyo, M; Balumukad, D; Galiwango, G; Bauters, W; Vermeersch, H; Hodges, A

    2013-03-01

    The purpose of this study is to describe the speech characteristics in an English-speaking Ugandan boy of 4.5 years who has a rare paramedian craniofacial cleft (unilateral lip, alveolar, palatal, nasal and maxillary cleft, and associated hypertelorism). Closure of the lip together with the closure of the hard and soft palate (one-stage palatal closure) was performed at the age of 5 months. Objective as well as subjective speech assessment techniques were used. The speech samples were perceptually judged for articulation, intelligibility and nasality. The Nasometer was used for the objective measurement of the nasalance values. The most striking communication problems in this child with the rare craniofacial cleft are an incomplete phonetic inventory, a severely impaired speech intelligibility with the presence of very severe hypernasality, mild nasal emission, phonetic disorders (omission of several consonants, decreased intraoral pressure in plosives, insufficient frication of fricatives and the use of a middorsum palatal stop) and phonological disorders (deletion of initial and final consonants and consonant clusters). The increased objective nasalance values are in agreement with the presence of the audible nasality disorders. The results revealed that several phonetic and phonological articulation disorders together with a decreased speech intelligibility and resonance disorders are present in the child with a rare craniofacial cleft. To what extent a secondary surgery for velopharyngeal insufficiency, combined with speech therapy, will improve speech intelligibility, articulation and resonance characteristics is a subject for further research. The results of such analyses may ultimately serve as a starting point for specific surgical and logopedic treatment that addresses the specific needs of children with rare facial clefts. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  5. Acoustic Source Characteristics, Across-Formant Integration, and Speech Intelligibility Under Competitive Conditions

    PubMed Central

    2015-01-01

    An important aspect of speech perception is the ability to group or select formants using cues in the acoustic source characteristics—for example, fundamental frequency (F0) differences between formants promote their segregation. This study explored the role of more radical differences in source characteristics. Three-formant (F1+F2+F3) synthetic speech analogues were derived from natural sentences. In Experiment 1, F1+F3 were generated by passing a harmonic glottal source (F0 = 140 Hz) through second-order resonators (H1+H3); in Experiment 2, F1+F3 were tonal (sine-wave) analogues (T1+T3). F2 could take either form (H2 or T2). In some conditions, the target formants were presented alone, either monaurally or dichotically (left ear = F1+F3; right ear = F2). In others, they were accompanied by a competitor for F2 (F1+F2C+F3; F2), which listeners must reject to optimize recognition. Competitors (H2C or T2C) were created using the time-reversed frequency and amplitude contours of F2. Dichotic presentation of F2 and F2C ensured that the impact of the competitor arose primarily through informational masking. In the absence of F2C, the effect of a source mismatch between F1+F3 and F2 was relatively modest. When F2C was present, intelligibility was lowest when F2 was tonal and F2C was harmonic, irrespective of which type matched F1+F3. This finding suggests that source type and context, rather than similarity, govern the phonetic contribution of a formant. It is proposed that wideband harmonic analogues are more effective informational maskers than narrowband tonal analogues, and so become dominant in across-frequency integration of phonetic information when placed in competition. PMID:25751040

  6. Using listening difficulty ratings of conditions for speech communication in rooms

    NASA Astrophysics Data System (ADS)

    Sato, Hiroshi; Bradley, John S.; Morimoto, Masayuki

    2005-03-01

    The use of listening difficulty ratings of speech communication in rooms is explored because, in common situations, word recognition scores do not discriminate well among conditions that are near to acceptable. In particular, the benefits of early reflections of speech sounds on listening difficulty were investigated and compared to the known benefits to word intelligibility scores. Listening tests were used to assess word intelligibility and perceived listening difficulty of speech in simulated sound fields. The experiments were conducted in three types of sound fields with constant levels of ambient noise: only direct sound, direct sound with early reflections, and direct sound with early reflections and reverberation. The results demonstrate that (1) listening difficulty can better discriminate among these conditions than can word recognition scores; (2) added early reflections increase the effective signal-to-noise ratio equivalent to the added energy in the conditions without reverberation; (3) the benefit of early reflections on difficulty scores is greater than expected from the simple increase in early arriving speech energy with reverberation; (4) word intelligibility tests are most appropriate for conditions with signal-to-noise (S/N) ratios less than 0 dBA, and where S/N is between 0 and 15 dBA, listening difficulty is a more appropriate evaluation tool.
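
    Finding (2) above treats early reflections as added useful speech energy. Below is a minimal sketch of an effective signal-to-noise computation under that assumption, integrating the squared room impulse response up to the conventional 50 ms early/late boundary; the noise level is an assumed input.

    ```python
    import numpy as np

    def effective_snr_db(rir, fs, speech_power, noise_power, early_ms=50.0):
        """Effective S/N counting direct sound plus early reflections as signal."""
        onset = int(np.argmax(np.abs(rir)))            # direct-sound arrival
        cut = onset + int(early_ms * 1e-3 * fs)        # early/late boundary
        early_gain = np.sum(rir[onset:cut] ** 2)       # useful speech energy gain
        return 10.0 * np.log10(early_gain * speech_power / noise_power)
    ```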

  7. Call sign intelligibility improvement using a spatial auditory display

    NASA Technical Reports Server (NTRS)

    Begault, Durand R.

    1993-01-01

    A spatial auditory display was used to convolve speech stimuli, consisting of 130 different call signs used in the communications protocol of NASA's John F. Kennedy Space Center, to different virtual auditory positions. An adaptive staircase method was used to determine intelligibility levels of the signal against diotic speech babble, with spatial positions at 30 deg azimuth increments. Non-individualized, minimum-phase approximations of head-related transfer functions were used. The results showed a maximal intelligibility improvement of about 6 dB when the signal was spatialized to 60 deg or 90 deg azimuth positions.
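
    Below is a minimal sketch of the adaptive staircase used in this and the two NASA records above: a 1-up/1-down rule that tracks the 50%-correct level. The step size and reversal count are illustrative assumptions; `trial_fn` stands in for presenting a call sign against babble and scoring the listener's response.

    ```python
    def staircase(trial_fn, start_db=0.0, step_db=2.0, max_reversals=8):
        """trial_fn(level_db) -> True if the listener responded correctly.
        Lower the level after a correct response, raise it after an error."""
        level, direction, reversals, marks = start_db, -1, 0, []
        while reversals < max_reversals:
            new_direction = -1 if trial_fn(level) else +1
            if new_direction != direction:             # track reversals
                reversals += 1
                marks.append(level)
                direction = new_direction
            level += new_direction * step_db
        # Threshold estimate: mean level at the last few reversals.
        return sum(marks[-6:]) / len(marks[-6:])
    ```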

  8. Speech coding

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ravishankar, C., Hughes Network Systems, Germantown, MD

    The coding techniques are equally applicable to any voice signal, whether or not it carries any intelligible information, as the term speech implies. Other terms that are commonly used are speech compression and voice compression, since the fundamental idea behind speech coding is to reduce (compress) the transmission rate (or, equivalently, the bandwidth) and/or reduce storage requirements. In this document the terms speech and voice are used interchangeably.

  9. An integrated approach to improving noisy speech perception

    NASA Astrophysics Data System (ADS)

    Koval, Serguei; Stolbov, Mikhail; Smirnova, Natalia; Khitrov, Mikhail

    2002-05-01

    For a number of practical purposes and tasks, experts have to decode speech recordings of very poor quality. A combination of techniques is proposed to improve intelligibility and quality of distorted speech messages and thus facilitate their comprehension. Along with the application of noise cancellation and speech signal enhancement techniques removing and/or reducing various kinds of distortions and interference (primarily unmasking and normalization in the time and frequency domains), the approach incorporates optimal listener expert tactics based on selective listening, nonstandard binaural listening, accounting for short-term and long-term human ear adaptation to noisy speech, as well as some methods of speech signal enhancement to support speech decoding during listening. The approach integrating the suggested techniques ensures high-quality ultimate results and has successfully been applied by Speech Technology Center experts and by numerous other users, mainly forensic institutions, to perform decoding of noisy speech recordings for courts, law enforcement and emergency services, accident investigation bodies, etc.
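
    One enhancement step such a toolbox typically includes is magnitude spectral subtraction of a noise estimate taken from a speech-free stretch of the recording. The sketch below is a generic textbook version, not Speech Technology Center's implementation; all parameters are illustrative assumptions.

    ```python
    import numpy as np
    from scipy.signal import stft, istft

    def spectral_subtract(noisy, fs, noise_seconds=0.5, floor=0.05):
        """Subtract an average noise magnitude estimated from the first
        `noise_seconds` of the recording; keep the noisy phase."""
        f, t, X = stft(noisy, fs=fs, nperseg=512)
        n_frames = max(1, int(noise_seconds * fs / 256))   # hop = 256 for nperseg=512
        noise_mag = np.abs(X[:, :n_frames]).mean(axis=1, keepdims=True)
        mag = np.maximum(np.abs(X) - noise_mag, floor * np.abs(X))
        _, y = istft(mag * np.exp(1j * np.angle(X)), fs=fs, nperseg=512)
        return y
    ```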

  10. Determining the energetic and informational components of speech-on-speech masking

    PubMed Central

    Kidd, Gerald; Mason, Christine R.; Swaminathan, Jayaganesh; Roverud, Elin; Clayton, Kameron K.; Best, Virginia

    2016-01-01

    Identification of target speech was studied under masked conditions consisting of two or four independent speech maskers. In the reference conditions, the maskers were colocated with the target, the masker talkers were the same sex as the target, and the masker speech was intelligible. The comparison conditions, intended to provide release from masking, included different-sex target and masker talkers, time-reversal of the masker speech, and spatial separation of the maskers from the target. Significant release from masking was found for all comparison conditions. To determine whether these reductions in masking could be attributed to differences in energetic masking, ideal time-frequency segregation (ITFS) processing was applied so that the time-frequency units where the masker energy dominated the target energy were removed. The remaining target-dominated “glimpses” were reassembled as the stimulus. Speech reception thresholds measured using these resynthesized ITFS-processed stimuli were the same for the reference and comparison conditions supporting the conclusion that the amount of energetic masking across conditions was the same. These results indicated that the large release from masking found under all comparison conditions was due primarily to a reduction in informational masking. Furthermore, the large individual differences observed generally were correlated across the three masking release conditions. PMID:27475139
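
    A minimal sketch of the ideal time-frequency segregation (ITFS) processing described above, assuming separate access to target and masker signals: keep only the spectrogram cells where the target dominates (local SNR above a criterion, 0 dB here) and resynthesize the "glimpses" by overlap-add. The FFT size is an illustrative assumption.

    ```python
    import numpy as np
    from scipy.signal import stft, istft

    def itfs_glimpses(target, masker, fs, criterion_db=0.0):
        """Resynthesize target-dominated glimpses from a target+masker mixture."""
        f, t, T = stft(target, fs=fs, nperseg=512)
        _, _, M = stft(masker, fs=fs, nperseg=512)
        local_snr_db = 20.0 * np.log10(np.abs(T) / (np.abs(M) + 1e-12) + 1e-12)
        mask = local_snr_db > criterion_db          # ideal binary mask
        _, y = istft((T + M) * mask, fs=fs, nperseg=512)
        return y
    ```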

  11. Intra-oral pressure-based voicing control of electrolaryngeal speech with intra-oral vibrator.

    PubMed

    Takahashi, Hirokazu; Nakao, Masayuki; Kikuchi, Yataro; Kaga, Kimitaka

    2008-07-01

    In normal speech, coordinated activities of intrinsic laryngeal muscles suspend a glottal sound at utterance of voiceless consonants, automatically realizing a voicing control. In electrolaryngeal speech, however, the lack of voicing control is one of the causes of unclear voice, voiceless consonants tending to be misheard as the corresponding voiced consonants. In the present work, we developed an intra-oral vibrator with an intra-oral pressure sensor that detected utterance of voiceless phonemes during the intra-oral electrolaryngeal speech, and demonstrated that an intra-oral pressure-based voicing control could improve the intelligibility of the speech. The test voices were obtained from one electrolaryngeal speaker and one normal speaker. We first investigated, using speech analysis software, how the voice onset time (VOT) and first formant (F1) transition of the test consonant-vowel syllables contributed to voiceless/voiced contrasts, and developed an adequate voicing control strategy. We then compared the intelligibility of consonant-vowel syllables among the intra-oral electrolaryngeal speech with and without online voicing control. The increase of intra-oral pressure, typically with a peak ranging from 10 to 50 gf/cm2, could reliably identify utterance of voiceless consonants. The speech analysis and intelligibility test then demonstrated that a short VOT caused the misidentification of the voiced consonants due to a clear F1 transition. Finally, taking these results together, the online voicing control, which suspended the prosthetic tone while the intra-oral pressure exceeded 2.5 gf/cm2 and during the 35 milliseconds that followed, proved effective in improving the voiceless/voiced contrast.
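
    The voicing-control rule reported above reduces to a simple gate: suspend the prosthetic tone whenever intra-oral pressure exceeds 2.5 gf/cm2 and for the 35 ms that follow. A minimal per-sample sketch; the loop structure and sensor interface are assumptions, while the thresholds are the abstract's values.

    ```python
    def voicing_gate(pressure_samples, fs, threshold=2.5, hold_ms=35.0):
        """Return a per-sample on/off gate for the electrolarynx tone.
        threshold in gf/cm^2, hold_ms = post-exceedance suspension time."""
        hold = int(hold_ms * 1e-3 * fs)
        gate, off_until = [], -1
        for i, p in enumerate(pressure_samples):
            if p > threshold:                # voiceless consonant detected
                off_until = i + hold         # suspend now and for `hold` samples
            gate.append(i > off_until)
        return gate
    ```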

  12. Frontal top-down signals increase coupling of auditory low-frequency oscillations to continuous speech in human listeners.

    PubMed

    Park, Hyojin; Ince, Robin A A; Schyns, Philippe G; Thut, Gregor; Gross, Joachim

    2015-06-15

    Humans show a remarkable ability to understand continuous speech even under adverse listening conditions. This ability critically relies on dynamically updated predictions of incoming sensory information, but exactly how top-down predictions improve speech processing is still unclear. Brain oscillations are a likely mechanism for these top-down predictions [1, 2]. Quasi-rhythmic components in speech are known to entrain low-frequency oscillations in auditory areas [3, 4], and this entrainment increases with intelligibility [5]. We hypothesize that top-down signals from frontal brain areas causally modulate the phase of brain oscillations in auditory cortex. We use magnetoencephalography (MEG) to monitor brain oscillations in 22 participants during continuous speech perception. We characterize prominent spectral components of speech-brain coupling in auditory cortex and use causal connectivity analysis (transfer entropy) to identify the top-down signals driving this coupling more strongly during intelligible speech than during unintelligible speech. We report three main findings. First, frontal and motor cortices significantly modulate the phase of speech-coupled low-frequency oscillations in auditory cortex, and this effect depends on intelligibility of speech. Second, top-down signals are significantly stronger for left auditory cortex than for right auditory cortex. Third, speech-auditory cortex coupling is enhanced as a function of stronger top-down signals. Together, our results suggest that low-frequency brain oscillations play a role in implementing predictive top-down control during continuous speech perception and that top-down control is largely directed at left auditory cortex. This suggests a close relationship between (left-lateralized) speech production areas and the implementation of top-down control in continuous speech perception. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  13. Frontal Top-Down Signals Increase Coupling of Auditory Low-Frequency Oscillations to Continuous Speech in Human Listeners

    PubMed Central

    Park, Hyojin; Ince, Robin A.A.; Schyns, Philippe G.; Thut, Gregor; Gross, Joachim

    2015-01-01

    Summary Humans show a remarkable ability to understand continuous speech even under adverse listening conditions. This ability critically relies on dynamically updated predictions of incoming sensory information, but exactly how top-down predictions improve speech processing is still unclear. Brain oscillations are a likely mechanism for these top-down predictions [1, 2]. Quasi-rhythmic components in speech are known to entrain low-frequency oscillations in auditory areas [3, 4], and this entrainment increases with intelligibility [5]. We hypothesize that top-down signals from frontal brain areas causally modulate the phase of brain oscillations in auditory cortex. We use magnetoencephalography (MEG) to monitor brain oscillations in 22 participants during continuous speech perception. We characterize prominent spectral components of speech-brain coupling in auditory cortex and use causal connectivity analysis (transfer entropy) to identify the top-down signals driving this coupling more strongly during intelligible speech than during unintelligible speech. We report three main findings. First, frontal and motor cortices significantly modulate the phase of speech-coupled low-frequency oscillations in auditory cortex, and this effect depends on intelligibility of speech. Second, top-down signals are significantly stronger for left auditory cortex than for right auditory cortex. Third, speech-auditory cortex coupling is enhanced as a function of stronger top-down signals. Together, our results suggest that low-frequency brain oscillations play a role in implementing predictive top-down control during continuous speech perception and that top-down control is largely directed at left auditory cortex. This suggests a close relationship between (left-lateralized) speech production areas and the implementation of top-down control in continuous speech perception. PMID:26028433
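
    Transfer entropy, the directed-connectivity measure used in these two records, quantifies how much the past of one signal predicts the future of another beyond the other's own past. A crude histogram-based sketch follows; real MEG analyses use far more careful estimators, and the bin count and lag here are illustrative assumptions.

    ```python
    import numpy as np

    def transfer_entropy(x, y, bins=8, lag=1):
        """Histogram estimate of TE(X -> Y) = I(Y_now ; X_past | Y_past) in bits."""
        x, y = np.asarray(x, float), np.asarray(y, float)
        yf, yp, xp = y[lag:], y[:-lag], x[:-lag]       # future y, past y, past x
        p, _ = np.histogramdd(np.stack([yf, yp, xp], axis=1), bins=bins)
        p /= p.sum()
        p_yf_yp = p.sum(axis=2, keepdims=True)         # p(y_now, y_past)
        p_yp_xp = p.sum(axis=0, keepdims=True)         # p(y_past, x_past)
        p_yp = p.sum(axis=(0, 2), keepdims=True)       # p(y_past)
        nz = p > 0
        with np.errstate(divide="ignore", invalid="ignore"):
            ratio = (p * p_yp) / (p_yf_yp * p_yp_xp)
        return float(np.sum(p[nz] * np.log2(ratio[nz])))
    ```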

  14. Perceptual learning for speech in noise after application of binary time-frequency masks

    PubMed Central

    Ahmadi, Mahnaz; Gross, Vauna L.; Sinex, Donal G.

    2013-01-01

    Ideal time-frequency (TF) masks can reject noise and improve the recognition of speech-noise mixtures. An ideal TF mask is constructed with prior knowledge of the target speech signal. The intelligibility of a processed speech-noise mixture depends upon the threshold criterion used to define the TF mask. The study reported here assessed the effect of training on the recognition of speech in noise after processing by ideal TF masks that did not restore perfect speech intelligibility. Two groups of listeners with normal hearing listened to speech-noise mixtures processed by TF masks calculated with different threshold criteria. For each group, a threshold criterion that initially produced word recognition scores between 0.56–0.69 was chosen for training. Listeners practiced with one set of TF-masked sentences until their word recognition performance approached asymptote. Perceptual learning was quantified by comparing word-recognition scores in the first and last training sessions. Word recognition scores improved with practice for all listeners with the greatest improvement observed for the same materials used in training. PMID:23464038

  15. Recognition of speech in noise after application of time-frequency masks: Dependence on frequency and threshold parameters

    PubMed Central

    Sinex, Donal G.

    2013-01-01

    Binary time-frequency (TF) masks can be applied to separate speech from noise. Previous studies have shown that with appropriate parameters, ideal TF masks can extract highly intelligible speech even at very low speech-to-noise ratios (SNRs). Two psychophysical experiments provided additional information about the dependence of intelligibility on the frequency resolution and threshold criteria that define the ideal TF mask. Listeners identified AzBio Sentences in noise, before and after application of TF masks. Masks generated with 8 or 16 frequency bands per octave supported nearly-perfect identification. Word recognition accuracy was slightly lower and more variable with 4 bands per octave. When TF masks were generated with a local threshold criterion of 0 dB SNR, the mean speech reception threshold was −9.5 dB SNR, compared to −5.7 dB for unprocessed sentences in noise. Speech reception thresholds decreased by about 1 dB per dB of additional decrease in the local threshold criterion. Information reported here about the dependence of speech intelligibility on frequency and level parameters has relevance for the development of non-ideal TF masks for clinical applications such as speech processing for hearing aids. PMID:23556604

  16. Dramatic effects of speech task on motor and linguistic planning in severely dysfluent parkinsonian speech

    PubMed Central

    Van Lancker Sidtis, Diana; Cameron, Krista; Sidtis, John J.

    2015-01-01

    In motor speech disorders, dysarthric features impacting intelligibility, articulation, fluency, and voice emerge more saliently in conversation than in repetition, reading, or singing. A role of the basal ganglia in these task discrepancies has been identified. Further, more recent studies of naturalistic speech in basal ganglia dysfunction have revealed that formulaic language is more impaired than novel language. This descriptive study extends these observations to a case of severely dysfluent dysarthria due to a parkinsonian syndrome. Dysfluencies were quantified and compared for conversation, two forms of repetition, reading, recited speech, and singing. Other measures examined phonetic inventories, word forms, and formulaic language. Phonetic, syllabic, and lexical dysfluencies were more abundant in conversation than in other task conditions. Formulaic expressions in conversation were reduced compared to normal speakers. A proposed explanation supports the notion that the basal ganglia contribute to formulation of internal models for execution of speech. PMID:22774929

  17. Lingual–Alveolar Contact Pressure During Speech in Amyotrophic Lateral Sclerosis: Preliminary Findings

    PubMed Central

    Knollhoff, Stephanie; Barohn, Richard J.

    2017-01-01

    Purpose This preliminary study on lingual–alveolar contact pressures (LACP) in people with amyotrophic lateral sclerosis (ALS) had several aims: (a) to evaluate whether the protocol induced fatigue, (b) to compare LACP during speech (LACP-Sp) and during maximum isometric pressing (LACP-Max) in people with ALS (PALS) versus healthy controls, (c) to compare the percentage of LACP-Max utilized during speech (%Max) for PALS versus controls, and (d) to evaluate relationships between LACP-Sp and LACP-Max with word intelligibility. Method Thirteen PALS and 12 healthy volunteers produced /t, d, s, z, l, n/ sounds while LACP-Sp was recorded. LACP-Max was obtained before and after the speech protocol. Word intelligibility was obtained from auditory–perceptual judgments. Results LACP-Max values measured before and after completion of the speech protocol did not differ. LACP-Sp and LACP-Max were statistically lower in the ALS bulbar group compared with controls and PALS with only spinal symptoms. There was no statistical difference between groups for %Max. LACP-Sp and LACP-Max were correlated with word intelligibility. Conclusions It was feasible to obtain LACP-Sp measures without inducing fatigue. Reductions in LACP-Sp and LACP-Max for bulbar speakers might reflect tongue weakness. Although confirmation of results is needed, the data indicate that individuals with high word intelligibility maintained LACP-Sp at or above 2 kPa and LACP-Max at or above 50 kPa. PMID:28335033

  18. Automatic Speech Recognition Predicts Speech Intelligibility and Comprehension for Listeners with Simulated Age-Related Hearing Loss

    ERIC Educational Resources Information Center

    Fontan, Lionel; Ferrané, Isabelle; Farinas, Jérôme; Pinquier, Julien; Tardieu, Julien; Magnen, Cynthia; Gaillard, Pascal; Aumont, Xavier; Füllgrabe, Christian

    2017-01-01

    Purpose: The purpose of this article is to assess speech processing for listeners with simulated age-related hearing loss (ARHL) and to investigate whether the observed performance can be replicated using an automatic speech recognition (ASR) system. The long-term goal of this research is to develop a system that will assist…

  19. Speech Clarity Index (Ψ): A Distance-Based Speech Quality Indicator and Recognition Rate Prediction for Dysarthric Speakers with Cerebral Palsy

    NASA Astrophysics Data System (ADS)

    Kayasith, Prakasith; Theeramunkong, Thanaruk

    It is a tedious and subjective task to measure the severity of dysarthria by manually evaluating a speaker's speech using available standard assessment methods based on human perception. This paper presents an automated approach to assessing the speech quality of a dysarthric speaker with cerebral palsy. Considering two complementary factors, speech consistency and speech distinction, a speech quality indicator called the speech clarity index (Ψ) is proposed as a measure of the speaker's ability to produce a consistent speech signal for a given word and distinct speech signals for different words. As an application, it can be used to assess speech quality and to forecast the speech recognition rate for an individual dysarthric speaker before exhaustive implementation of an automatic speech recognition system for that speaker. The effectiveness of Ψ as a recognition-rate predictor is evaluated by rank-order inconsistency, correlation coefficient, and root-mean-square difference. The evaluations were done by comparing its predicted recognition rates with those predicted by the standard methods, the articulatory and intelligibility tests, on two recognition systems (HMM and ANN). The results show that Ψ is a promising indicator for predicting the recognition rate of dysarthric speech. All experiments were performed on a speech corpus composed of data from eight normal speakers and eight dysarthric speakers.
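
    The two ingredients of Ψ can be illustrated with simple distances between feature vectors: consistency as small within-word spread across repetitions, and distinction as large between-word separation. The paper's exact definition is not reproduced here, so the features and the ratio used below are assumptions for illustration.

    ```python
    import numpy as np

    def clarity_index(word_features):
        """word_features: dict mapping word -> array (n_repetitions, n_features),
        e.g., averaged MFCC vectors per utterance."""
        words = list(word_features)
        reps = {w: np.asarray(word_features[w], float) for w in words}
        means = {w: reps[w].mean(axis=0) for w in words}
        # Consistency: average distance of repetitions to their word centroid.
        within = np.mean([np.linalg.norm(reps[w] - means[w], axis=1).mean()
                          for w in words])
        # Distinction: average distance between centroids of different words.
        between = np.mean([np.linalg.norm(means[a] - means[b])
                           for i, a in enumerate(words) for b in words[i + 1:]])
        return between / (within + 1e-12)   # higher = clearer speech
    ```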

  20. Speech and Language Deficits in Early-Treated Children with Galactosemia.

    ERIC Educational Resources Information Center

    Waisbren, Susan E.; And Others

    1983-01-01

    Intelligence and speech-language development of eight children (3.6 to 11.6 years old) with classic galactosemia were assessed by standardized tests. Each of the children had early speech difficulties, and all but one had language disorders in at least one area. Available from: Journal of Pediatrics, C.V. Mosby Co., 11830 Westline…

  1. The Hypothesis of Apraxia of Speech in Children with Autism Spectrum Disorder

    PubMed Central

    Shriberg, Lawrence D.; Paul, Rhea; Black, Lois M.; van Santen, Jan P.

    2010-01-01

    In a sample of 46 children aged 4 to 7 years with Autism Spectrum Disorder (ASD) and intelligible speech, there was no statistical support for the hypothesis of concomitant Childhood Apraxia of Speech (CAS). Perceptual and acoustic measures of participants’ speech, prosody, and voice were compared with data from 40 typically-developing children, 13 preschool children with Speech Delay, and 15 participants aged 5 to 49 years with CAS in neurogenetic disorders. Speech Delay and Speech Errors, respectively, were modestly and substantially more prevalent in participants with ASD than reported population estimates. Double dissociations in speech, prosody, and voice impairments in ASD were interpreted as consistent with a speech attunement framework, rather than with the motor speech impairments that define CAS. Key Words: apraxia, dyspraxia, motor speech disorder, speech sound disorder PMID:20972615

  2. The design of a device for hearer and feeler differentiation, part A. [speech modulated hearing device

    NASA Technical Reports Server (NTRS)

    Creecy, R.

    1974-01-01

    A speech-modulated white-noise device is reported that conveys the rhythmic characteristics of a speech signal for intelligible reception by deaf persons. The signal is composed of random amplitudes and frequencies modulated by the speech envelope's characteristics of rhythm and stress. Time-intensity parameters of speech are conveyed through vibro-tactile stimuli.

  3. The association between intelligence and lifespan is mostly genetic.

    PubMed

    Arden, Rosalind; Luciano, Michelle; Deary, Ian J; Reynolds, Chandra A; Pedersen, Nancy L; Plassman, Brenda L; McGue, Matt; Christensen, Kaare; Visscher, Peter M

    2016-02-01

    Several studies in the new field of cognitive epidemiology have shown that higher intelligence predicts longer lifespan. This positive correlation might arise from socioeconomic status influencing both intelligence and health; intelligence leading to better health behaviours; and/or some shared genetic factors influencing both intelligence and health. Distinguishing among these hypotheses is crucial for medicine and public health, but can only be accomplished by studying a genetically informative sample. We analysed data from three genetically informative samples containing information on intelligence and mortality: Sample 1, 377 pairs of male veterans from the NAS-NRC US World War II Twin Registry; Sample 2, 246 pairs of twins from the Swedish Twin Registry; and Sample 3, 784 pairs of twins from the Danish Twin Registry. The age at which intelligence was measured differed between the samples. We used three methods of genetic analysis to examine the relationship between intelligence and lifespan: we calculated the proportion of the more intelligent twins who outlived their co-twin; we regressed within-twin-pair lifespan differences on within-twin-pair intelligence differences; and we used the resulting regression coefficients to model the additive genetic covariance. We conducted a meta-analysis of the regression coefficients across the three samples. The combined sample (and all three individual samples) showed a small positive phenotypic correlation between intelligence and lifespan. In the combined sample, the observed r = .12 (95% confidence interval .06 to .18). The additive genetic covariance model supported a genetic relationship between intelligence and lifespan. In the combined sample the genetic contribution to the covariance was 95%; in the US study, 84%; in the Swedish study, 86%; and in the Danish study, 85%. The finding of common genetic effects between lifespan and intelligence has important implications for public health, and for those interested in the

  4. The association between intelligence and lifespan is mostly genetic

    PubMed Central

    Arden, Rosalind; Deary, Ian J; Reynolds, Chandra A; Pedersen, Nancy L; Plassman, Brenda L; McGue, Matt; Christensen, Kaare; Visscher, Peter M

    2016-01-01

    Abstract Background: Several studies in the new field of cognitive epidemiology have shown that higher intelligence predicts longer lifespan. This positive correlation might arise from socioeconomic status influencing both intelligence and health; intelligence leading to better health behaviours; and/or some shared genetic factors influencing both intelligence and health. Distinguishing among these hypotheses is crucial for medicine and public health, but can only be accomplished by studying a genetically informative sample. Methods: We analysed data from three genetically informative samples containing information on intelligence and mortality: Sample 1, 377 pairs of male veterans from the NAS-NRC US World War II Twin Registry; Sample 2, 246 pairs of twins from the Swedish Twin Registry; and Sample 3, 784 pairs of twins from the Danish Twin Registry. The age at which intelligence was measured differed between the samples. We used three methods of genetic analysis to examine the relationship between intelligence and lifespan: we calculated the proportion of the more intelligent twins who outlived their co-twin; we regressed within-twin-pair lifespan differences on within-twin-pair intelligence differences; and we used the resulting regression coefficients to model the additive genetic covariance. We conducted a meta-analysis of the regression coefficients across the three samples. Results: The combined sample (and all three individual samples) showed a small positive phenotypic correlation between intelligence and lifespan. In the combined sample the observed r = .12 (95% confidence interval .06 to .18). The additive genetic covariance model supported a genetic relationship between intelligence and lifespan. In the combined sample the genetic contribution to the covariance was 95%; in the US study, 84%; in the Swedish study, 86%; and in the Danish study, 85%. Conclusions: The finding of common genetic effects between lifespan and intelligence has important implications.

  5. Development of coffee maker service robot using speech and face recognition systems using POMDP

    NASA Astrophysics Data System (ADS)

    Budiharto, Widodo; Meiliana; Santoso Gunawan, Alexander Agung

    2016-07-01

    Many intelligent service robots have been developed to interact with users naturally. This can be achieved by embedding speech and face recognition abilities for specific tasks in the robot. In this research, we propose an intelligent coffee maker robot whose speech recognition is based on the Indonesian language and powered by a statistical dialogue system. This kind of robot can be used in the office, supermarket or restaurant. In our scenario, the robot recognizes the user's face and then accepts commands from the user to perform an action, specifically making a coffee. Based on our previous work, the accuracy of speech recognition is about 86% and of face recognition about 93% in laboratory experiments. The main problem here is inferring the user's intention regarding how sweet the coffee should be. The robot must infer the user's intention through conversation under unreliable automatic speech recognition in a noisy environment. In this paper, this spoken dialog problem is treated as a partially observable Markov decision process (POMDP). We describe how this formulation establishes a promising framework, and present dialog simulations that demonstrate significant quantitative outcomes.
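
    The sweetness-intention problem maps naturally onto a POMDP belief update: the robot maintains a distribution over hidden user intents and sharpens it with each noisy recognition result. A minimal sketch (state names and confusion probabilities are illustrative, not the paper's model):

```python
# Belief update over hidden user intent under a noisy speech recogniser.
import numpy as np

states = ["no_sugar", "mild", "sweet"]           # hypothetical intents
belief = np.full(len(states), 1 / len(states))   # uniform prior

# p(observation | state): rows = true intent, cols = ASR hypothesis (assumed values)
obs_model = np.array([
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.2, 0.7],
])

def update(belief, obs_idx):
    """Bayes rule b'(s) ~ p(o|s) b(s); intent is static, so no transition step."""
    b = obs_model[:, obs_idx] * belief
    return b / b.sum()

belief = update(belief, obs_idx=2)   # ASR heard "sweet"
belief = update(belief, obs_idx=2)   # heard it again: belief sharpens
print(dict(zip(states, belief.round(3))))
```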

  6. Intelligibility in Context Scale: Normative and Validation Data for English-Speaking Preschoolers.

    PubMed

    McLeod, Sharynne; Crowe, Kathryn; Shahaeian, Ameneh

    2015-07-01

    The purpose of this study was to describe normative and validation data on the Intelligibility in Context Scale (ICS; McLeod, Harrison, & McCormack, 2012c) for English-speaking children. The ICS is a 7-item, parent-report measure of children's speech intelligibility with a range of communicative partners. Data were collected from the parents of 803 Australian English-speaking children ranging in age from 4;0 (years;months) to 5;5 (37.0% were multilingual). The mean ICS score was 4.4 (SD = 0.7) out of a possible total score of 5. Children's speech was reported to be most intelligible to their parents, followed by their immediate family, friends, and teachers; children's speech was least intelligible to strangers. The ICS had high internal consistency (α = .94). Significant differences in scores were identified on the basis of sex and age but not on the basis of socioeconomic status or the number of languages spoken. There were significant differences in scores between children whose parents had concerns about their child's speech (M = 3.9) and those who did not (M = 4.6). The optimal cutoff yielded a sensitivity of .82 and a specificity of .58. Test-retest reliability and criterion validity were established for 184 children with a speech sound disorder. There was a significant but low correlation between the ICS mean score and percentage of phonemes correct (r = .30), percentage of consonants correct (r = .24), and percentage of vowels correct (r = .30) on the Diagnostic Evaluation of Articulation and Phonology (Dodd, Hua, Crosbie, Holm, & Ozanne, 2002). Thirty-one parents completed the ICS related to English and another language spoken by their child with a speech sound disorder. The significant correlations between the scores suggest that the ICS may be robust between languages. This article provides normative ICS data for English-speaking children and additional validation of the psychometric properties of the ICS. The robustness of the ICS between languages was also suggested.
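
    Scoring the ICS is deliberately simple: average the seven 5-point ratings. A sketch of that computation and a screening decision (the cutoff value below is hypothetical; the abstract reports only the sensitivity and specificity achieved at the optimal cutoff):

```python
# ICS scoring: mean of seven 1-5 ratings across communicative partners.
def ics_mean(ratings):
    """ratings: parents, immediate family, extended family, friends,
    acquaintances, teachers, strangers (each rated 1-5)."""
    assert len(ratings) == 7 and all(1 <= r <= 5 for r in ratings)
    return sum(ratings) / 7

child = [5, 5, 4, 4, 4, 5, 3]   # most intelligible to parents, least to strangers
score = ics_mean(child)          # ~4.29, near the normative mean of 4.4
CUTOFF = 4.0                     # illustrative threshold, not the published one
print(score, "flag for follow-up" if score < CUTOFF else "within typical range")
```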

  7. Speech deterioration in amyotrophic lateral sclerosis (ALS) after manifestation of bulbar symptoms.

    PubMed

    Makkonen, Tanja; Ruottinen, Hanna; Puhto, Riitta; Helminen, Mika; Palmio, Johanna

    2018-03-01

    The symptoms and their progression in amyotrophic lateral sclerosis (ALS) are typically studied after the diagnosis has been confirmed. However, many people with ALS already have severe dysarthria and loss of adequate speech at the time of diagnosis. Speech and language therapy interventions should be timed according to communicative need in ALS. To investigate how long natural speech will remain functional and to identify the changes in the speech of persons with ALS. Altogether, 30 consecutive participants were studied and divided into two groups based on the initial type of ALS, bulbar or spinal. Their speech disorder was evaluated for severity, articulation rate and intelligibility during the 2-year follow-up. The ability to speak deteriorated to a poor level and necessitated augmentative and alternative communication (AAC) methods for 60% of the participants. Their speech remained adequate on average for 18 months from the first bulbar symptom. Severity, articulation rate and intelligibility declined in nearly all participants during the study. Initially, speech deteriorated more in the bulbar group than in the spinal group, and with some exceptions the difference persisted throughout the follow-up. The onset of bulbar symptoms indicated the time to loss of speech better than the date of ALS diagnosis or the first speech therapy evaluation. In clinical work, it is important to take the initial type of ALS into consideration when determining the urgency of AAC measures, as people with bulbar-onset ALS are more susceptible to delayed evaluation and AAC intervention. © 2017 Royal College of Speech and Language Therapists.

  8. Acoustic changes in the speech of children with cerebral palsy following an intensive program of dysarthria therapy.

    PubMed

    Pennington, Lindsay; Lombardo, Eftychia; Steen, Nick; Miller, Nick

    2018-01-01

    The speech intelligibility of children with dysarthria and cerebral palsy has been observed to increase following therapy focusing on respiration and phonation. To determine whether speech intelligibility change following intervention is associated with change in acoustic measures of voice. We recorded 16 young people with cerebral palsy and dysarthria (nine girls; mean age 14 years, SD = 2; nine spastic type, two dyskinetic, four mixed; one Worster-Drought) producing speech in two conditions (single words, connected speech) twice before and twice after therapy focusing on respiration, phonation and rate. In both single-word and connected speech we measured vocal intensity (root mean square, RMS), period-to-period variability (Shimmer APQ, Jitter RAP and PPQ) and harmonics-to-noise ratio (HNR). In connected speech we also measured mean fundamental frequency, utterance duration in seconds, and speech and articulation rate (syllables/s with and without pauses, respectively). All acoustic measures were made using Praat. Intelligibility was calculated in previous research. In single words, statistically significant but very small reductions were observed in period-to-period variability following therapy: Shimmer APQ -0.15 (95% CI = -0.21 to -0.09); Jitter RAP -0.08 (95% CI = -0.14 to -0.01); Jitter PPQ -0.08 (95% CI = -0.15 to -0.01). No changes in period-to-period perturbation across phrases in connected speech were detected. However, changes in connected speech were observed in phrase length, rate and intensity. Following therapy, mean utterance duration increased by 1.11 s (95% CI = 0.37-1.86) when measured with pauses and by 1.13 s (95% CI = 0.40-1.85) when measured without pauses. Articulation rate increased by 0.07 syllables/s (95% CI = 0.02-0.13); speech rate increased by 0.06 syllables/s (95% CI = < 0.01-0.12); and intensity increased by 0.03 Pascals (95% CI = 0.02-0.04). There was a gradual reduction in mean fundamental frequency across all time points (-11.85 Hz).
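
    Two of the simpler measures named above can be stated in a few lines; this numpy sketch is a rough stand-in for the Praat analyses actually used in the study:

```python
# Stand-ins for two acoustic measures: RMS intensity and articulation rate.
import numpy as np

def rms_intensity(signal):
    """Root-mean-square amplitude of the waveform (linear units, not dB)."""
    return np.sqrt(np.mean(np.square(signal)))

def articulation_rate(n_syllables, utterance_s, pause_s):
    """Syllables per second excluding pauses; speech rate would keep them."""
    return n_syllables / (utterance_s - pause_s)

x = np.sin(2 * np.pi * 220 * np.linspace(0, 1, 44100))  # toy 220 Hz tone
print(rms_intensity(x))                  # ~0.707 for a unit-amplitude sine
print(articulation_rate(24, 8.0, 1.5))   # ~3.7 syllables/s
```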

  9. Sensory Intelligence for Extraction of an Abstract Auditory Rule: A Cross-Linguistic Study.

    PubMed

    Guo, Xiao-Tao; Wang, Xiao-Dong; Liang, Xiu-Yuan; Wang, Ming; Chen, Lin

    2018-02-21

    In a complex linguistic environment, while speech sounds can greatly vary, some shared features are often invariant. These invariant features constitute so-called abstract auditory rules. Our previous study has shown that with auditory sensory intelligence, the human brain can automatically extract the abstract auditory rules in the speech sound stream, presumably serving as the neural basis for speech comprehension. However, whether the sensory intelligence for extraction of abstract auditory rules in speech is inherent or experience-dependent remains unclear. To address this issue, we constructed a complex speech sound stream using auditory materials in Mandarin Chinese, in which syllables had a flat lexical tone but differed in other acoustic features to form an abstract auditory rule. This rule was occasionally and randomly violated by the syllables with the rising, dipping or falling tone. We found that both Chinese and foreign speakers detected the violations of the abstract auditory rule in the speech sound stream at a pre-attentive stage, as revealed by the whole-head recordings of mismatch negativity (MMN) in a passive paradigm. However, MMNs peaked earlier in Chinese speakers than in foreign speakers. Furthermore, Chinese speakers showed different MMN peak latencies for the three deviant types, which paralleled recognition points. These findings indicate that the sensory intelligence for extraction of abstract auditory rules in speech sounds is innate but shaped by language experience. Copyright © 2018 IBRO. Published by Elsevier Ltd. All rights reserved.

  10. Effect of Energy Equalization on the Intelligibility of Speech in Fluctuating Background Interference for Listeners With Hearing Impairment

    PubMed Central

    D’Aquila, Laura A.; Desloge, Joseph G.; Braida, Louis D.

    2017-01-01

    The masking release (MR; i.e., better speech recognition in fluctuating compared with continuous noise backgrounds) that is evident for listeners with normal hearing (NH) is generally reduced or absent for listeners with sensorineural hearing impairment (HI). In this study, a real-time signal-processing technique was developed to improve MR in listeners with HI and offer insight into the mechanisms influencing the size of MR. This technique compares short-term and long-term estimates of energy, increases the level of short-term segments whose energy is below the average energy, and normalizes the overall energy of the processed signal to be equivalent to that of the original long-term estimate. This signal-processing algorithm was used to create two types of energy-equalized (EEQ) signals: EEQ1, which operated on the wideband speech plus noise signal, and EEQ4, which operated independently on each of four bands with equal logarithmic width. Consonant identification was tested in backgrounds of continuous and various types of fluctuating speech-shaped Gaussian noise including those with both regularly and irregularly spaced temporal fluctuations. Listeners with HI achieved similar scores for EEQ and the original (unprocessed) stimuli in continuous-noise backgrounds, while superior performance was obtained for the EEQ signals in fluctuating background noises that had regular temporal gaps but not for those with irregularly spaced fluctuations. Thus, in noise backgrounds with regularly spaced temporal fluctuations, the energy-normalized signals led to larger values of MR and higher intelligibility than obtained with unprocessed signals. PMID:28602128
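
    The abstract specifies the EEQ operation quite precisely: compare short-term segment energy with a long-term average, raise below-average segments, then renormalise overall energy. A sketch of the wideband variant (EEQ1); the frame length and exact gain rule are assumptions:

```python
# EEQ1 sketch: boost below-average short-term segments, then renormalise energy.
import numpy as np

def eeq1(x, frame=160):                        # e.g. 10 ms frames at 16 kHz
    y = x.astype(float).copy()
    long_term = np.mean(x ** 2)                # long-term energy estimate
    for start in range(0, len(x) - frame + 1, frame):
        seg = y[start:start + frame]
        short_term = np.mean(seg ** 2)
        if 0 < short_term < long_term:         # below-average segment: boost it
            seg *= np.sqrt(long_term / short_term)
    # keep overall energy equal to that of the original signal
    y *= np.sqrt(np.sum(x ** 2) / (np.sum(y ** 2) + 1e-12))
    return y
```

For the four-band variant (EEQ4), the same operation would run independently on each of four logarithmically equal-width bands before recombination.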

  11. The intelligibility in Context Scale: validity and reliability of a subjective rating measure.

    PubMed

    McLeod, Sharynne; Harrison, Linda J; McCormack, Jane

    2012-04-01

    To describe a new measure of functional intelligibility, the Intelligibility in Context Scale (ICS), and evaluate its validity, reliability, and sensitivity using 3 clinical measures of severity of speech sound disorder: (a) percentage of phonemes correct (PPC), (b) percentage of consonants correct (PCC), and (c) percentage of vowels correct (PVC). Speech skills of 120 preschool children (109 with parent-/teacher-identified concern about how they talked and made speech sounds and 11 with no identified concern) were assessed with the Diagnostic Evaluation of Articulation and Phonology (Dodd, Hua, Crosbie, Holm, & Ozanne, 2002). Parents completed the 7-item ICS, which rates the degree to which children's speech is understood by different communication partners (parents, immediate family, extended family, friends, acquaintances, teachers, and strangers) on a 5-point scale. Parents' ratings showed that most children were always (5) or usually (4) understood by parents, immediate family, and teachers, but only sometimes (3) by strangers. Factor analysis confirmed the internal consistency of the ICS items; therefore, ratings were averaged to form an overall intelligibility score. The ICS had high internal reliability (α = .93), sensitivity, and construct validity. Criterion validity was established through significant correlations between the ICS and PPC (r = .54), PCC (r = .54), and PVC (r = .36). The ICS is a promising new measure of functional intelligibility. These data provide initial support for the ICS as an easily administered, valid, and reliable estimate of preschool children's intelligibility when speaking with people of varying levels of familiarity and authority.

  12. Development of The Viking Speech Scale to classify the speech of children with cerebral palsy.

    PubMed

    Pennington, Lindsay; Virella, Daniel; Mjøen, Tone; da Graça Andrada, Maria; Murray, Janice; Colver, Allan; Himmelmann, Kate; Rackauskaite, Gija; Greitane, Andra; Prasauskiene, Audrone; Andersen, Guro; de la Cruz, Javier

    2013-10-01

    Surveillance registers monitor the prevalence of cerebral palsy and the severity of resulting impairments across time and place. The motor disorders of cerebral palsy can affect children's speech production and limit their intelligibility. We describe the development of a scale to classify children's speech performance for use in cerebral palsy surveillance registers, and its reliability across raters and across time. Speech and language therapists, other healthcare professionals and parents classified the speech of 139 children with cerebral palsy (85 boys, 54 girls; mean age 6.03 years, SD 1.09) from observation and previous knowledge of the children. Another group of health professionals rated children's speech from information in their medical notes. With the exception of parents, raters reclassified children's speech at least four weeks after their initial classification. Raters were asked to rate how easy the scale was to use and how well the scale described the child's speech production using Likert scales. Inter-rater reliability was moderate to substantial (k>.58 for all comparisons). Test-retest reliability was substantial to almost perfect for all groups (k>.68). Over 74% of raters found the scale easy or very easy to use; 66% of parents and over 70% of health care professionals judged the scale to describe children's speech well or very well. We conclude that the Viking Speech Scale is a reliable tool to describe the speech performance of children with cerebral palsy, which can be applied through direct observation of children or through case note review. Copyright © 2013 Elsevier Ltd. All rights reserved.

  13. Speech perception in noise with a harmonic complex excited vocoder.

    PubMed

    Churchill, Tyler H; Kan, Alan; Goupell, Matthew J; Ihlefeld, Antje; Litovsky, Ruth Y

    2014-04-01

    A cochlear implant (CI) presents band-pass-filtered acoustic envelope information by modulating current pulse train levels. Similarly, a vocoder presents envelope information by modulating an acoustic carrier. By studying how normal hearing (NH) listeners are able to understand degraded speech signals with a vocoder, the parameters that best simulate electric hearing and factors that might contribute to the NH-CI performance difference may be better understood. A vocoder with harmonic complex carriers (fundamental frequency, f0 = 100 Hz) was used to study the effect of carrier phase dispersion on speech envelopes and intelligibility. The starting phases of the harmonic components were randomly dispersed to varying degrees prior to carrier filtering and modulation. NH listeners were tested on recognition of a closed set of vocoded words in background noise. Two sets of synthesis filters simulated different amounts of current spread in CIs. Results showed that the speech vocoded with carriers whose starting phases were maximally dispersed was the most intelligible. Superior speech understanding may have been a result of the flattening of the dispersed-phase carrier's intrinsic temporal envelopes produced by the large number of interacting components in the high-frequency channels. Cross-correlogram analyses of auditory nerve model simulations confirmed that randomly dispersing the carrier's component starting phases resulted in better neural envelope representation. However, neural metrics extracted from these analyses were not found to accurately predict speech recognition scores for all vocoded speech conditions. It is possible that central speech understanding mechanisms are insensitive to the envelope-fine structure dichotomy exploited by vocoders.
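
    The carrier construction is easy to sketch: sum harmonics of a 100 Hz fundamental with starting phases drawn at random. With zero (cosine) phases the carrier is pulse-like and strongly modulated; fully dispersed phases flatten its intrinsic envelope, which the authors link to better intelligibility. The harmonic count and duration below are illustrative:

```python
# Harmonic complex carrier (f0 = 100 Hz) with adjustable phase dispersion.
import numpy as np

def harmonic_carrier(fs=16000, dur=1.0, f0=100.0, n_harm=80, dispersion=1.0):
    """dispersion = 0: cosine phases (peaky); 1: fully random starting phases."""
    t = np.arange(int(fs * dur)) / fs
    rng = np.random.default_rng(1)
    carrier = np.zeros_like(t)
    for k in range(1, n_harm + 1):
        phase = dispersion * rng.uniform(0, 2 * np.pi)
        carrier += np.cos(2 * np.pi * k * f0 * t + phase)
    return carrier / np.max(np.abs(carrier))

peaky = harmonic_carrier(dispersion=0.0)   # pulse-like intrinsic envelope
flat = harmonic_carrier(dispersion=1.0)    # flattened envelope, most intelligible
```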

  14. Effects of Neurosurgical Management of Parkinson's Disease on Speech Characteristics and Oromotor Function.

    ERIC Educational Resources Information Center

    Farrell, Anna; Theodoros, Deborah; Ward, Elizabeth; Hall, Bruce; Silburn, Peter

    2005-01-01

    The present study examined the effects of neurosurgical management of Parkinson's disease (PD), including the procedures of pallidotomy, thalamotomy, and deep-brain stimulation (DBS) on perceptual speech characteristics, speech intelligibility, and oromotor function in a group of 22 participants with PD. The surgical participant group was compared…

  15. Noise suppression methods for robust speech processing

    NASA Astrophysics Data System (ADS)

    Boll, S. F.; Ravindra, H.; Randall, G.; Armantrout, R.; Power, R.

    1980-05-01

    Robust speech processing in practical operating environments requires effective environmental and processor noise suppression. This report describes the technical findings and accomplishments during this reporting period for the research program funded to develop real-time, compressed speech analysis-synthesis algorithms whose performance is invariant under signal contamination. Fulfillment of this requirement is necessary to ensure reliable, secure compressed speech transmission within realistic military command and control environments. Overall contributions resulting from this research program include an understanding of how environmental noise degrades narrowband coded speech, development of appropriate real-time noise suppression algorithms, and development of speech parameter identification methods that treat signal contamination as a fundamental element of the estimation process. This report describes the current research and results in the areas of noise suppression using dual-input adaptive noise cancellation and short-time Fourier transform algorithms, articulation rate change techniques, and an experiment which demonstrated that the spectral subtraction noise suppression algorithm can improve the intelligibility of 2400 bps, LPC-10 coded helicopter speech by 10.6 points.
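
    The algorithm credited with the 10.6-point intelligibility gain is spectral subtraction: estimate the noise magnitude spectrum, subtract it frame by frame, and resynthesise with the noisy phase. A minimal magnitude-domain sketch; the frame size, oversubtraction factor, and leading-frames noise estimate are assumptions:

```python
# Magnitude spectral subtraction with a spectral floor.
import numpy as np
from scipy.signal import stft, istft

def spectral_subtract(noisy, fs, noise_frames=10, alpha=1.0, floor=0.02):
    f, t, Z = stft(noisy, fs, nperseg=256)
    mag, phase = np.abs(Z), np.angle(Z)
    noise_mag = mag[:, :noise_frames].mean(axis=1, keepdims=True)  # leading frames assumed noise-only
    clean_mag = np.maximum(mag - alpha * noise_mag, floor * mag)   # subtract, keep a floor
    _, y = istft(clean_mag * np.exp(1j * phase), fs, nperseg=256)
    return y
```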

  16. The level and nature of autistic intelligence III: Inspection time.

    PubMed

    Barbeau, Elise B; Soulières, Isabelle; Dawson, Michelle; Zeffiro, Thomas A; Mottron, Laurent

    2013-02-01

    Across the autism spectrum, level of intelligence is highly dependent on the psychometric instrument used for assessment, and there are conflicting views concerning which measures best estimate autistic cognitive abilities. Inspection time is a processing speed measure associated with general intelligence in typical individuals. We therefore investigated autism spectrum performance on inspection time in relation to two different general intelligence tests. Autism spectrum individuals were divided into autistic and Asperger subgroups according to speech development history. Compared to a typical control group, mean inspection time for the autistic subgroup but not the Asperger subgroup was significantly shorter (by 31%). However, the shorter mean autistic inspection time was evident only when groups were matched on Wechsler IQ and disappeared when they were matched using Raven's Progressive Matrices. When autism spectrum abilities are compared to typical abilities, results may be influenced by speech development history as well as by the instrument used for intelligence matching. 2013 APA, all rights reserved

  17. Sleep Disrupts High-Level Speech Parsing Despite Significant Basic Auditory Processing.

    PubMed

    Makov, Shiri; Sharon, Omer; Ding, Nai; Ben-Shachar, Michal; Nir, Yuval; Zion Golumbic, Elana

    2017-08-09

    The extent to which the sleeping brain processes sensory information remains unclear. This is particularly true for continuous and complex stimuli such as speech, in which information is organized into hierarchically embedded structures. Recently, novel metrics for assessing the neural representation of continuous speech have been developed using noninvasive brain recordings that have thus far only been tested during wakefulness. Here we investigated, for the first time, the sleeping brain's capacity to process continuous speech at different hierarchical levels using a newly developed Concurrent Hierarchical Tracking (CHT) approach that allows monitoring the neural representation and processing-depth of continuous speech online. Speech sequences were compiled with syllables, words, phrases, and sentences occurring at fixed time intervals such that different linguistic levels correspond to distinct frequencies. This enabled us to distinguish their neural signatures in brain activity. We compared the neural tracking of intelligible versus unintelligible (scrambled and foreign) speech across states of wakefulness and sleep using high-density EEG in humans. We found that neural tracking of stimulus acoustics was comparable across wakefulness and sleep and similar across all conditions regardless of speech intelligibility. In contrast, neural tracking of higher-order linguistic constructs (words, phrases, and sentences) was only observed for intelligible speech during wakefulness and could not be detected at all during nonrapid eye movement or rapid eye movement sleep. These results suggest that, whereas low-level auditory processing is relatively preserved during sleep, higher-level hierarchical linguistic parsing is severely disrupted, thereby revealing the capacity and limits of language processing during sleep. SIGNIFICANCE STATEMENT Despite the persistence of some sensory processing during sleep, it is unclear whether high-level cognitive processes such as speech parsing are preserved as well.
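
    The CHT logic is a frequency-tagging analysis: because syllables, words, phrases, and sentences recur at fixed (and different) rates, each linguistic level should surface as a spectral peak in the neural response at its own rate. A sketch of that readout (the rates below are illustrative, not the study's exact values):

```python
# Frequency-tagged readout: spectral amplitude at each linguistic rate.
import numpy as np

def level_peaks(eeg, fs, rates=(4.0, 2.0, 1.0, 0.5)):
    """Amplitude at syllable/word/phrase/sentence rates for one EEG channel."""
    spectrum = np.abs(np.fft.rfft(eeg)) / len(eeg)
    freqs = np.fft.rfftfreq(len(eeg), 1 / fs)
    return {r: spectrum[np.argmin(np.abs(freqs - r))] for r in rates}
```

Under this analysis, an acoustic (syllable-rate) peak that persists in sleep while the word-, phrase-, and sentence-rate peaks vanish is exactly the dissociation the authors report.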

  18. Age-related changes to spectral voice characteristics affect judgments of prosodic, segmental, and talker attributes for child and adult speech.

    PubMed

    Dilley, Laura C; Wieland, Elizabeth A; Gamache, Jessica L; McAuley, J Devin; Redford, Melissa A

    2013-02-01

    As children mature, changes in voice spectral characteristics co-vary with changes in speech, language, and behavior. In this study, spectral characteristics were manipulated to alter the perceived ages of talkers' voices while leaving critical acoustic-prosodic correlates intact, to determine whether perceived age differences were associated with differences in judgments of prosodic, segmental, and talker attributes. Speech was modified by lowering formants and fundamental frequency, for 5-year-old children's utterances, or raising them, for adult caregivers' utterances. Next, participants differing in awareness of the manipulation (Experiment 1A) or amount of speech-language training (Experiment 1B) made judgments of prosodic, segmental, and talker attributes. Experiment 2 investigated the effects of spectral modification on intelligibility. Finally, in Experiment 3, trained analysts used formal prosody coding to assess prosodic characteristics of spectrally modified and unmodified speech. Differences in perceived age were associated with differences in ratings of speech rate, fluency, intelligibility, likeability, anxiety, cognitive impairment, and speech-language disorder/delay; effects of training and awareness of the manipulation on ratings were limited. There were no significant effects of the manipulation on intelligibility or formally coded prosody judgments. Age-related voice characteristics can greatly affect judgments of speech and talker characteristics, raising cautionary notes for developmental research and clinical work.

  19. Age-related changes to spectral voice characteristics affect judgments of prosodic, segmental, and talker attributes for child and adult speech

    PubMed Central

    Dilley, Laura C.; Wieland, Elizabeth A.; Gamache, Jessica L.; McAuley, J. Devin; Redford, Melissa A.

    2013-01-01

    Purpose As children mature, changes in voice spectral characteristics covary with changes in speech, language, and behavior. Spectral characteristics were manipulated to alter the perceived ages of talkers’ voices while leaving critical acoustic-prosodic correlates intact, to determine whether perceived age differences were associated with differences in judgments of prosodic, segmental, and talker attributes. Method Speech was modified by lowering formants and fundamental frequency, for 5-year-old children’s utterances, or raising them, for adult caregivers’ utterances. Next, participants differing in awareness of the manipulation (Exp. 1a) or amount of speech-language training (Exp. 1b) made judgments of prosodic, segmental, and talker attributes. Exp. 2 investigated the effects of spectral modification on intelligibility. Finally, in Exp. 3 trained analysts used formal prosody coding to assess prosodic characteristics of spectrally-modified and unmodified speech. Results Differences in perceived age were associated with differences in ratings of speech rate, fluency, intelligibility, likeability, anxiety, cognitive impairment, and speech-language disorder/delay; effects of training and awareness of the manipulation on ratings were limited. There were no significant effects of the manipulation on intelligibility or formally coded prosody judgments. Conclusions Age-related voice characteristics can greatly affect judgments of speech and talker characteristics, raising cautionary notes for developmental research and clinical work. PMID:23275414

  20. Development of equally intelligible Telugu sentence-lists to test speech recognition in noise.

    PubMed

    Tanniru, Kishore; Narne, Vijaya Kumar; Jain, Chandni; Konadath, Sreeraj; Singh, Niraj Kumar; Sreenivas, K J Ramadevi; K, Anusha

    2017-09-01

    To develop sentence lists in the Telugu language for the assessment of speech recognition threshold (SRT) in the presence of background noise through identification of the mean signal-to-noise ratio required to attain a 50% sentence recognition score (SRTn). This study was conducted in three phases. The first phase involved the selection and recording of Telugu sentences. In the second phase, 20 lists, each consisting of 10 sentences with equal intelligibility, were formulated using a numerical optimisation procedure. In the third phase, the SRTn of the developed lists was estimated using adaptive procedures on individuals with normal hearing. A total of 68 native Telugu speakers with normal hearing participated in the study. Of these, 18 (including the speakers) took part in various subjective measures in the first phase, 20 performed sentence/word recognition in noise in the second phase, and 30 participated in the list equivalency procedures in the third phase. In all, 15 lists of comparable difficulty were formulated as test material. The mean SRTn across these lists corresponded to -2.74 dB SNR (SD = 0.21). The developed sentence lists provided a valid and reliable tool to measure SRTn in Telugu native speakers.
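
    SRTn is the signal-to-noise ratio at which 50% of sentences are recognised, so one common estimate fits a logistic psychometric function to recognition scores and reads off its midpoint. A sketch with synthetic data:

```python
# Fit a logistic psychometric function; its midpoint is the SRTn.
import numpy as np
from scipy.optimize import curve_fit

def logistic(snr, srt, slope):
    return 1.0 / (1.0 + np.exp(-slope * (snr - srt)))

snrs = np.array([-8, -6, -4, -2, 0, 2], dtype=float)
scores = np.array([0.05, 0.15, 0.35, 0.62, 0.85, 0.96])  # proportion correct (synthetic)

(srt, slope), _ = curve_fit(logistic, snrs, scores, p0=[-3.0, 1.0])
print(f"SRTn = {srt:.2f} dB SNR, slope = {slope:.2f}")
```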

  1. VOT in speech-disordered individuals: History, theory, data, reminiscence

    NASA Astrophysics Data System (ADS)

    Weismer, Gary

    2004-05-01

    Forty years ago Lisker and Abramson published their landmark paper on VOT; the speech-research world has never been the same. The concept of VOT as a measure relevant to phonology, speech physiology, and speech perception made it a prime choice for scientists who saw an opportunity to exploit the techniques and analytic frameworks of "speech science" in the study of speech disorders. Modifications of VOT in speech disorders have been used to draw specific inferences concerning phonological representations, glottal-supraglottal timing, and speech intelligibility. This presentation will provide a review of work on VOT in speech disorders, including (among others) stuttering, hearing impairment, and neurogenic disorders. An attempt will be made to collect published data in summary graphic form, and to discuss their implications. Emphasis will be placed on how VOT has been used to inform theories of disordered speech production. I will close with some personal comments about the influence (unbeknownst to them) these two outstanding scientists had on me in the 1970s, when under the spell of their work I first became aware that the world of speech research did not start and end with moving parts.

  2. Speech Production in Hearing-Impaired Children.

    ERIC Educational Resources Information Center

    Gold, Toni

    1980-01-01

    Investigations in recent years have indicated that only about 20% of the speech output of the deaf is understood by the "person on the street." This lack of intelligibility has been associated with some frequently occurring segmental and suprasegmental errors. Journal Availability: Elsevier North Holland, Inc., 52 Vanderbilt Avenue, New York, NY…

  3. The Listener: No Longer the Silent Partner in Reduced Intelligibility

    ERIC Educational Resources Information Center

    Zielinski, Beth W.

    2008-01-01

    In this study I investigate the impact of different characteristics of the L2 speech signal on the intelligibility of L2 speakers of English to native listeners. Three native listeners were observed and questioned as they orthographically transcribed utterances taken from connected conversational speech produced by three L2 speakers from different…

  4. Sentence-Level Movements in Parkinson's Disease: Loud, Clear, and Slow Speech

    ERIC Educational Resources Information Center

    Kearney, Elaine; Giles, Renuka; Haworth, Brandon; Faloutsos, Petros; Baljko, Melanie; Yunusova, Yana

    2017-01-01

    Purpose: To further understand the effect of Parkinson's disease (PD) on articulatory movements in speech and to expand our knowledge of therapeutic treatment strategies, this study examined movements of the jaw, tongue blade, and tongue dorsum during sentence production with respect to speech intelligibility and compared the effect of varying…

  5. Relationship between Speech, Oromotor, Language and Cognitive Abilities in Children with Down's Syndrome

    ERIC Educational Resources Information Center

    Cleland, Joanne; Wood, Sara; Hardcastle, William; Wishart, Jennifer; Timmins, Claire

    2010-01-01

    Background: Children and young people with Down's syndrome present with deficits in expressive speech and language, accompanied by strengths in vocabulary comprehension compared with non-verbal mental age. Intelligibility is particularly low, but whether speech is delayed or disordered is a controversial topic. Most studies suggest a delay, but no…

  6. Speech Recognition for A Digital Video Library.

    ERIC Educational Resources Information Center

    Witbrock, Michael J.; Hauptmann, Alexander G.

    1998-01-01

    Production of the meta-data supporting the Informedia Digital Video Library interface is automated using techniques derived from artificial intelligence research. Speech recognition and natural-language processing, information retrieval, and image analysis are applied to produce an interface that helps users locate information and navigate more…

  7. [Computer assisted application of mandarin speech test materials].

    PubMed

    Zhang, Hua; Wang, Shuo; Chen, Jing; Deng, Jun-Min; Yang, Xiao-Lin; Guo, Lian-Sheng; Zhao, Xiao-Yan; Shao, Guang-Yu; Han, De-Min

    2008-06-01

    To design a reliable and convenient intelligent speech test system using computer software, and to evaluate this system. First, the intelligent system was designed in the Delphi programming language. Second, the seven monosyllabic word lists recorded on CD were separated using Cool Edit Pro v2.1 software and put into the system as test materials. Finally, the intelligent system was used to evaluate the equivalence of difficulty among the seven lists. Fifty-five college students with normal hearing participated in the study. The seven monosyllabic word lists were of equivalent difficulty for the subjects (F = 1.582, P > 0.05), and the system proved reliable and convenient. The intelligent system is feasible for clinical practice.

  8. Long short-term memory for speaker generalization in supervised speech separation

    PubMed Central

    Chen, Jitong; Wang, DeLiang

    2017-01-01

    Speech separation can be formulated as learning to estimate a time-frequency mask from acoustic features extracted from noisy speech. For supervised speech separation, generalization to unseen noises and unseen speakers is a critical issue. Although deep neural networks (DNNs) have been successful in noise-independent speech separation, DNNs are limited in modeling a large number of speakers. To improve speaker generalization, a separation model based on long short-term memory (LSTM) is proposed, which naturally accounts for temporal dynamics of speech. Systematic evaluation shows that the proposed model substantially outperforms a DNN-based model on unseen speakers and unseen noises in terms of objective speech intelligibility. Analyzing LSTM internal representations reveals that the LSTM captures long-term speech contexts. The LSTM model is also more advantageous for low-latency speech separation: even without future frames, it performs better than the DNN model with future frames. The proposed model represents an effective approach for speaker- and noise-independent speech separation. PMID:28679261
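
    The abstract's formulation, estimating a time-frequency mask from noisy acoustic features, reduces to a sequence model with a sigmoid output per time-frequency unit. A minimal PyTorch sketch; layer sizes and the mask target are assumptions, not the paper's exact architecture:

```python
# LSTM that maps noisy feature frames to a [0, 1] time-frequency mask.
import torch
import torch.nn as nn

class LstmMasker(nn.Module):
    def __init__(self, n_features=161, hidden=256, layers=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, layers, batch_first=True)
        self.head = nn.Sequential(nn.Linear(hidden, n_features), nn.Sigmoid())

    def forward(self, feats):                   # feats: (batch, time, freq)
        h, _ = self.lstm(feats)
        return self.head(h)                     # mask per time-frequency unit

noisy = torch.randn(4, 100, 161)                # toy batch of feature frames
mask = LstmMasker()(noisy)
loss = nn.functional.mse_loss(mask, torch.rand_like(mask))  # vs. an ideal mask
```

Because this LSTM is unidirectional, the mask at each frame depends only on past context, which is what makes the low-latency, no-future-frames result possible.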

  9. Visemic Processing in Audiovisual Discrimination of Natural Speech: A Simultaneous fMRI-EEG Study

    ERIC Educational Resources Information Center

    Dubois, Cyril; Otzenberger, Helene; Gounot, Daniel; Sock, Rudolph; Metz-Lutz, Marie-Noelle

    2012-01-01

    In a noisy environment, visual perception of articulatory movements improves natural speech intelligibility. Parallel to phonemic processing based on auditory signal, visemic processing constitutes a counterpart based on "visemes", the distinctive visual units of speech. Aiming at investigating the neural substrates of visemic processing in a…

  10. Comparative intelligibility investigation of single-channel noise-reduction algorithms for Chinese, Japanese, and English.

    PubMed

    Li, Junfeng; Yang, Lin; Zhang, Jianping; Yan, Yonghong; Hu, Yi; Akagi, Masato; Loizou, Philipos C

    2011-05-01

    A large number of single-channel noise-reduction algorithms have been proposed based largely on mathematical principles. Most of these algorithms, however, have been evaluated with English speech. Given the different perceptual cues used by native listeners of different languages, including tonal languages, it is of interest to examine whether there are any language effects when the same noise-reduction algorithm is used to process noisy speech in different languages. This study undertakes a comparative evaluation of various single-channel noise-reduction algorithms applied to noisy speech in three languages: Chinese, Japanese, and English. Clean speech signals (Chinese words and Japanese words) were first corrupted by three types of noise at two signal-to-noise ratios and then processed by five single-channel noise-reduction algorithms. The processed signals were finally presented to normal-hearing listeners for recognition. Intelligibility evaluation showed that the majority of noise-reduction algorithms did not improve speech intelligibility. Consistent with a previous study with the English language, the Wiener filtering algorithm produced small, but statistically significant, improvements in intelligibility for car and white noise conditions. Significant differences between the performances of noise-reduction algorithms across the three languages were observed.

  11. Speech therapy for children with dysarthria acquired before three years of age.

    PubMed

    Pennington, Lindsay; Parker, Naomi K; Kelly, Helen; Miller, Nick

    2016-07-18

    Children with motor impairments often have the motor speech disorder dysarthria, a condition which affects the tone, strength and co-ordination of any or all of the muscles used for speech. Resulting speech difficulties can range from mild, with slightly slurred articulation and breathy voice, to profound, with an inability to produce any recognisable words. Children with dysarthria are often prescribed communication aids to supplement their natural forms of communication. However, there is variation in practice regarding the provision of therapy focusing on voice and speech production. Descriptive studies have suggested that therapy may improve speech, but its effectiveness has not been evaluated. To assess whether any speech and language therapy intervention aimed at improving the speech of children with dysarthria is more effective in increasing children's speech intelligibility or communicative participation than no intervention at all, and to compare the efficacy of individual types of speech and language therapy in improving the speech intelligibility or communicative participation of children with dysarthria. We searched the Cochrane Central Register of Controlled Trials (CENTRAL; 2015, Issue 7), MEDLINE, EMBASE, CINAHL, LLBA, ERIC, PsycINFO, Web of Science, Scopus, UK National Research Register and Dissertation Abstracts up to July 2015, handsearched relevant journals published between 1980 and July 2015, and searched proceedings of relevant conferences from 1996 to 2015. We placed no restrictions on the language or setting of the studies. A previous version of this review considered studies published up to April 2009. In this update we searched for studies published from April 2009 to July 2015. We considered randomised controlled trials and studies using quasi-experimental designs in which children were allocated to groups using non-random methods. One author (LP) conducted searches of all databases, journals and conference reports.

  12. Speech production in children with Down's syndrome: The effects of reading, naming and imitation.

    PubMed

    Knight, Rachael-Anne; Kurtz, Scilla; Georgiadou, Ioanna

    2015-01-01

    People with DS are known to have difficulties with expressive language, and often have difficulties with intelligibility. They often have stronger visual than verbal short-term memory skills and, therefore, reading has often been suggested as an intervention for speech and language in this population. However, there is as yet no firm evidence that reading can improve speech outcomes. This study aimed to compare reading, picture naming and repetition for the same 10 words, to identify if the speech of eight children with DS (aged 11-14 years) was more accurate, consistent and intelligible when reading. Results show that children were slightly, yet significantly, more accurate and intelligible when they read words compared with when they produced those words in naming or imitation conditions although the reduction in inconsistency was non-significant. The results of this small-scale study provide tentative support for previous claims about the benefits of reading for children with DS. The mechanisms behind a facilitatory effect of reading are considered, and directions are identified for future research.

  13. Children with Comorbid Speech Sound Disorder and Specific Language Impairment Are at Increased Risk for Attention-Deficit/Hyperactivity Disorder

    ERIC Educational Resources Information Center

    McGrath, Lauren M.; Hutaff-Lee, Christa; Scott, Ashley; Boada, Richard; Shriberg, Lawrence D.; Pennington, Bruce F.

    2008-01-01

    This study focuses on the comorbidity between attention-deficit/hyperactivity disorder (ADHD) symptoms and speech sound disorder (SSD). SSD is a developmental disorder characterized by speech production errors that impact intelligibility. Previous research addressing this comorbidity has typically used heterogeneous groups of speech-language…

  14. Australian children with cleft palate achieve age-appropriate speech by 5 years of age.

    PubMed

    Chacon, Antonia; Parkin, Melissa; Broome, Kate; Purcell, Alison

    2017-12-01

    Children with cleft palate demonstrate atypical speech sound development, which can influence their intelligibility, literacy and learning. There is limited documentation regarding how speech sound errors change over time in cleft palate speech and the effect that these errors have upon mono- versus polysyllabic word production. The objective of this study was to examine the phonetic and phonological speech skills of children with cleft palate at ages 3 and 5. A cross-sectional observational design was used. Eligible participants were aged 3 or 5 years with a repaired cleft palate. The Diagnostic Evaluation of Articulation and Phonology (DEAP) Articulation subtest and a non-standardised list of mono- and polysyllabic words were administered once for each child. The Profile of Phonology (PROPH) was used to analyse each child's speech. N = 51 children with cleft palate participated in the study. Three-year-old children with cleft palate produced significantly more speech errors than their typically-developing peers, but no difference was apparent at 5 years. The 5-year-olds demonstrated greater phonetic and phonological accuracy than the 3-year-old children. Polysyllabic words were more affected by errors than monosyllables in the 3-year-old group only. Children with cleft palate are prone to phonetic and phonological speech errors in their preschool years. Most of these speech errors approximate those of typically-developing children by 5 years. At 3 years, word shape has an influence upon phonological speech accuracy. Speech pathology intervention is indicated to support the intelligibility of these children from their earliest stages of development. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Evaluation of selected speech parameters after prosthesis supply in patients with maxillary or mandibular defects.

    PubMed

    Müller, Rainer; Höhlein, Andreas; Wolf, Annette; Markwardt, Jutta; Schulz, Matthias C; Range, Ursula; Reitemeier, Bernd

    2013-01-01

    Ablative surgery of oropharyngeal tumors frequently leads to defects in the speech organs, resulting in impairment of speech up to the point of unintelligibility. The aim of the present study was the assessment of selected parameters of speech with and without resection prostheses. The speech sounds of 22 patients suffering from maxillary and mandibular defects were recorded using a digital audio tape (DAT) recorder with and without resection prostheses. Evaluation of the resonance and the production of the sounds /s/, /sch/, and /ch/ was performed by 2 experienced speech therapists. Additionally, the patients completed a non-standardized questionnaire containing a linguistic self-assessment. After prosthesis supply, the number of patients with rhinophonia aperta decreased from 7 to 2 while the number of patients with intelligible speech increased from 2 to 20. Correct production of the sounds /s/, /sch/, and /ch/ increased from 2 to 13 patients. A significant improvement of the evaluated parameters could be observed only in patients with maxillary defects. The linguistic self-assessment showed a higher satisfaction in patients with maxillary defects. In patients with maxillary defects due to ablative tumor surgery, an increase in speech performance and intelligibility is possible by supplying resection prostheses. © 2013 S. Karger GmbH, Freiburg.

  16. Underwater speech communications with a modulated laser

    NASA Astrophysics Data System (ADS)

    Woodward, B.; Sari, H.

    2008-04-01

    A novel speech communications system using a modulated laser beam has been developed for short-range applications in which high directionality is an exploitable feature. Although it was designed for certain underwater applications, such as speech communications between divers or between a diver and the surface, it may equally be used for air applications. With some modification it could be used for secure diver-to-diver communications in the situation where untethered divers are swimming close together and do not want their conversations monitored by intruders. Unlike underwater acoustic communications, where the transmitted speech may be received at ranges of hundreds of metres omnidirectionally, a laser communication link is very difficult to intercept and also obviates the need for cables that become snagged or broken. Further applications include the transmission of speech and data, including the short message service (SMS), from a fixed installation such as a sea-bed habitat; and data transmission to and from an autonomous underwater vehicle (AUV), particularly during docking manoeuvres. The performance of the system has been assessed subjectively by listening tests, which revealed that the speech was intelligible, although of poor quality due to the speech algorithm used.

  17. Using Zebra-speech to study sequential and simultaneous speech segregation in a cochlear-implant simulation.

    PubMed

    Gaudrain, Etienne; Carlyon, Robert P

    2013-01-01

    Previous studies have suggested that cochlear implant users may have particular difficulties exploiting opportunities to glimpse clear segments of a target speech signal in the presence of a fluctuating masker. Although it has been proposed that this difficulty is associated with a deficit in linking the glimpsed segments across time, the details of this mechanism are yet to be explained. The present study introduces a method called Zebra-speech developed to investigate the relative contribution of simultaneous and sequential segregation mechanisms in concurrent speech perception, using a noise-band vocoder to simulate cochlear implants. One experiment showed that the saliency of the difference between the target and the masker is a key factor for Zebra-speech perception, as it is for sequential segregation. Furthermore, forward masking played little or no role, confirming that intelligibility was not limited by energetic masking but by across-time linkage abilities. In another experiment, a binaural cue was used to distinguish the target and the masker. It showed that the relative contribution of simultaneous and sequential segregation depended on the spectral resolution, with listeners relying more on sequential segregation when the spectral resolution was reduced. The potential of Zebra-speech as a segregation enhancement strategy for cochlear implants is discussed.

  18. Using Zebra-speech to study sequential and simultaneous speech segregation in a cochlear-implant simulation

    PubMed Central

    Gaudrain, Etienne; Carlyon, Robert P.

    2013-01-01

    Previous studies have suggested that cochlear implant users may have particular difficulties exploiting opportunities to glimpse clear segments of a target speech signal in the presence of a fluctuating masker. Although it has been proposed that this difficulty is associated with a deficit in linking the glimpsed segments across time, the details of this mechanism are yet to be explained. The present study introduces a method called Zebra-speech developed to investigate the relative contribution of simultaneous and sequential segregation mechanisms in concurrent speech perception, using a noise-band vocoder to simulate cochlear implants. One experiment showed that the saliency of the difference between the target and the masker is a key factor for Zebra-speech perception, as it is for sequential segregation. Furthermore, forward masking played little or no role, confirming that intelligibility was not limited by energetic masking but by across-time linkage abilities. In another experiment, a binaural cue was used to distinguish target and masker. It showed that the relative contribution of simultaneous and sequential segregation depended on the spectral resolution, with listeners relying more on sequential segregation when the spectral resolution was reduced. The potential of Zebra-speech as a segregation enhancement strategy for cochlear implants is discussed. PMID:23297922

  19. Listener Perception of Monopitch, Naturalness, and Intelligibility for Speakers with Parkinson's Disease

    ERIC Educational Resources Information Center

    Anand, Supraja; Stepp, Cara E.

    2015-01-01

    Purpose: Given the potential significance of speech naturalness to functional and social rehabilitation outcomes, the objective of this study was to examine the effect of listener perceptions of monopitch on speech naturalness and intelligibility in individuals with Parkinson's disease (PD). Method: Two short utterances were extracted from…

  20. Reaction times of normal listeners to laryngeal, alaryngeal, and synthetic speech.

    PubMed

    Evitts, Paul M; Searl, Jeff

    2006-12-01

    The purpose of this study was to compare listener processing demands when decoding alaryngeal compared to laryngeal speech. Fifty-six listeners were presented with single words produced by 1 proficient speaker from 5 different modes of speech: normal, tracheosophageal (TE), esophageal (ES), electrolaryngeal (EL), and synthetic speech (SS). Cognitive processing load was indexed by listener reaction time (RT). To account for significant durational differences among the modes of speech, an RT ratio was calculated (stimulus duration divided by RT). Results indicated that the cognitive processing load was greater for ES and EL relative to normal speech. TE and normal speech did not differ in terms of RT ratio, suggesting fairly comparable cognitive demands placed on the listener. SS required greater cognitive processing load than normal and alaryngeal speech. The results are discussed relative to alaryngeal speech intelligibility and the role of the listener. Potential clinical applications and directions for future research are also presented.
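
    The duration normalisation is a one-line computation: RT ratio = stimulus duration / reaction time, so longer stimuli do not automatically look more demanding. Values below are illustrative, not the study's data:

```python
# RT ratio: duration-normalised index of cognitive processing load.
def rt_ratio(stimulus_s, reaction_s):
    return stimulus_s / reaction_s

print(rt_ratio(0.65, 0.90))   # normal speech: higher ratio, lower load
print(rt_ratio(0.80, 1.40))   # electrolaryngeal token: lower ratio, more load
```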

  1. Speech comprehension aided by multiple modalities: behavioural and neural interactions

    PubMed Central

    McGettigan, Carolyn; Faulkner, Andrew; Altarelli, Irene; Obleser, Jonas; Baverstock, Harriet; Scott, Sophie K.

    2014-01-01

    Speech comprehension is a complex human skill, the performance of which requires the perceiver to combine information from several sources – e.g. voice, face, gesture, linguistic context – to achieve an intelligible and interpretable percept. We describe a functional imaging investigation of how auditory, visual and linguistic information interact to facilitate comprehension. Our specific aims were to investigate the neural responses to these different information sources, alone and in interaction, and further to use behavioural speech comprehension scores to address sites of intelligibility-related activation in multifactorial speech comprehension. In fMRI, participants passively watched videos of spoken sentences, in which we varied Auditory Clarity (with noise-vocoding), Visual Clarity (with Gaussian blurring) and Linguistic Predictability. Main effects of enhanced signal with increased auditory and visual clarity were observed in overlapping regions of posterior STS. Two-way interactions of the factors (auditory × visual, auditory × predictability) in the neural data were observed outside temporal cortex, where positive signal change in response to clearer facial information and greater semantic predictability was greatest at intermediate levels of auditory clarity. Overall changes in stimulus intelligibility by condition (as determined using an independent behavioural experiment) were reflected in the neural data by increased activation predominantly in bilateral dorsolateral temporal cortex, as well as inferior frontal cortex and left fusiform gyrus. Specific investigation of intelligibility changes at intermediate auditory clarity revealed a set of regions, including posterior STS and fusiform gyrus, showing enhanced responses to both visual and linguistic information.

  2. Speech comprehension aided by multiple modalities: behavioural and neural interactions.

    PubMed

    McGettigan, Carolyn; Faulkner, Andrew; Altarelli, Irene; Obleser, Jonas; Baverstock, Harriet; Scott, Sophie K

    2012-04-01

    Speech comprehension is a complex human skill, the performance of which requires the perceiver to combine information from several sources - e.g. voice, face, gesture, linguistic context - to achieve an intelligible and interpretable percept. We describe a functional imaging investigation of how auditory, visual and linguistic information interact to facilitate comprehension. Our specific aims were to investigate the neural responses to these different information sources, alone and in interaction, and further to use behavioural speech comprehension scores to address sites of intelligibility-related activation in multifactorial speech comprehension. In fMRI, participants passively watched videos of spoken sentences, in which we varied Auditory Clarity (with noise-vocoding), Visual Clarity (with Gaussian blurring) and Linguistic Predictability. Main effects of enhanced signal with increased auditory and visual clarity were observed in overlapping regions of posterior STS. Two-way interactions of the factors (auditory × visual, auditory × predictability) in the neural data were observed outside temporal cortex, where positive signal change in response to clearer facial information and greater semantic predictability was greatest at intermediate levels of auditory clarity. Overall changes in stimulus intelligibility by condition (as determined using an independent behavioural experiment) were reflected in the neural data by increased activation predominantly in bilateral dorsolateral temporal cortex, as well as inferior frontal cortex and left fusiform gyrus. Specific investigation of intelligibility changes at intermediate auditory clarity revealed a set of regions, including posterior STS and fusiform gyrus, showing enhanced responses to both visual and linguistic information.

  3. Attitudes toward speech disorders: sampling the views of Cantonese-speaking Americans.

    PubMed

    Bebout, L; Arthur, B

    1997-01-01

    Speech-language pathologists who serve clients from cultural backgrounds that are not familiar to them may encounter culturally influenced attitudinal differences. A questionnaire with statements about 4 speech disorders (dysfluency, cleft palate, speech of the deaf, and misarticulations) was given to a focus group of Chinese Americans and a comparison group of non-Chinese Americans. The focus group was much more likely to believe that persons with speech disorders could improve their own speech by "trying hard," was somewhat more likely to say that people who use deaf speech and people with cleft palates might be "emotionally disturbed," and was generally more likely to view deaf speech as a limitation. The comparison group was more pessimistic about stuttering children's acceptance by their peers than was the focus group. The two subject groups agreed about other items, such as the likelihood that older children with articulation problems are "less intelligent" than their peers.

  4. Effects of Alphabet-Supplemented Speech on Brain Activity of Listeners: An fMRI Study

    ERIC Educational Resources Information Center

    Fercho, Kelene; Baugh, Lee A.; Hanson, Elizabeth K.

    2015-01-01

    Purpose: The purpose of this article was to examine the neural mechanisms associated with increases in speech intelligibility brought about through alphabet supplementation. Method: Neurotypical participants listened to dysarthric speech while watching an accompanying video of a hand pointing to the first letter of each spoken word on an alphabet…

  5. Performance Evaluation of Intelligent Systems at the National Institute of Standards and Technology (NIST)

    DTIC Science & Technology

    2011-03-01

    past few years, including performance evaluation of emergency response robots, sensor systems on unmanned ground vehicles, speech-to-speech translation...emergency response robots; intelligent systems; mixed palletizing, testing, simulation; robotic vehicle perception systems; search and rescue robots...ranging from autonomous vehicles to urban search and rescue robots to speech translation and manufacturing systems. The evaluations have occurred in

  6. Integrated Speech and Language Technology for Intelligence, Surveillance, and Reconnaissance (ISR)

    DTIC Science & Technology

    2017-07-01

    applying submodularity techniques to address computing challenges posed by large datasets in speech and language processing. MT and speech tools were...aforementioned research-oriented activities, the IT system administration team provided necessary support to laboratory computing and network operations...operations of SCREAM Lab computer systems and networks. Other miscellaneous activities in relation to Task Order 29 are presented in an additional fourth

  7. A novel speech prosthesis for mandibular guidance therapy in hemimandibulectomy patient: A clinical report

    PubMed Central

    Adaki, Raghavendra; Shigli, Kamal; Hormuzdi, Dinshaw M.; Gali, Sivaranjani

    2016-01-01

    Treating diverse maxillofacial patients poses a challenge to the maxillofacial prosthodontist. Rehabilitation of hemimandibulectomy patients must aim at restoring mastication and other functions such as intelligible speech, swallowing, and esthetics. Prosthetic methods such as palatal ramp and mandibular guiding flange reposition the deviated mandible. Such prosthesis can also be used to restore speech in case of patients with debilitating speech following surgical resection. This clinical report gives detail of a hemimandibulectomy patient provided with an interim removable dental speech prosthesis with composite resin flange for mandibular guidance therapy. PMID:27041917

  8. Perceptual restoration of degraded speech is preserved with advancing age.

    PubMed

    Saija, Jefta D; Akyürek, Elkan G; Andringa, Tjeerd C; Başkent, Deniz

    2014-02-01

    Cognitive skills, such as processing speed, memory functioning, and the ability to divide attention, are known to diminish with aging. The present study shows that, despite these changes, older adults can successfully compensate for degradations in speech perception. Critically, the older participants of this study were not pre-selected for high performance on cognitive tasks, but only screened for normal hearing. We measured the compensation for speech degradation using phonemic restoration, where intelligibility of degraded speech is enhanced using top-down repair mechanisms. Linguistic knowledge, Gestalt principles of perception, and expectations based on situational and linguistic context are used to effectively fill in the inaudible masked speech portions. A positive compensation effect was previously observed only with young normal hearing people, but not with older hearing-impaired populations, leaving the question whether the lack of compensation was due to aging or due to age-related hearing problems. Older participants in the present study showed poorer intelligibility of degraded speech than the younger group, as expected from previous reports of aging effects. However, in conditions that induce top-down restoration, a robust compensation was observed. Speech perception by the older group was enhanced, and the enhancement effect was similar to that observed with the younger group. This effect was even stronger with slowed-down speech, which gives more time for cognitive processing. Based on previous research, the likely explanations for these observations are that older adults can overcome age-related cognitive deterioration by relying on linguistic skills and vocabulary that they have accumulated over their lifetime. Alternatively, or simultaneously, they may use different cerebral activation patterns or exert more mental effort. This positive finding on top-down restoration skills by the older individuals suggests that new cognitive training methods

  9. Measuring the critical band for speech.

    PubMed

    Healy, Eric W; Bacon, Sid P

    2006-02-01

    The current experiments were designed to measure the frequency resolution employed by listeners during the perception of everyday sentences. Speech bands having nearly vertical filter slopes and narrow bandwidths were sharply partitioned into various numbers of equal log- or ERBN-width subbands. The temporal envelope from each partition was used to amplitude modulate a corresponding band of low-noise noise, and the modulated carriers were combined and presented to normal-hearing listeners. Intelligibility increased and reached asymptote as the number of partitions increased. In the mid- and high-frequency regions of the speech spectrum, the partition bandwidth corresponding to asymptotic performance matched current estimates of psychophysical tuning across a number of conditions. These results indicate that, in these regions, the critical band for speech matches the critical band measured using traditional psychoacoustic methods and nonspeech stimuli. However, in the low-frequency region, partition bandwidths at asymptote were somewhat narrower than would be predicted based upon psychophysical tuning. It is concluded that, overall, current estimates of psychophysical tuning represent reasonably well the ability of listeners to extract spectral detail from running speech.
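
    The ERBN bandwidths mentioned above are conventionally computed with the Glasberg and Moore (1990) formula; the short sketch below evaluates it at a few speech frequencies to show how auditory-filter width grows from the low- to the high-frequency region of the spectrum.

        def erb_n(f_hz):
            # Glasberg & Moore (1990): equivalent rectangular bandwidth (Hz)
            # of the normal auditory filter centred at f_hz.
            return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

        for f in (250, 1000, 4000):
            print(f, 'Hz ->', round(erb_n(f), 1), 'Hz')   # ~51.7, ~132.6, ~456.5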

  10. Comparing Motor Skills in Autism Spectrum Individuals With and Without Speech Delay

    PubMed Central

    Barbeau, Elise B.; Meilleur, Andrée‐Anne S.; Zeffiro, Thomas A.

    2015-01-01

    Movement atypicalities in speed, coordination, posture, and gait have been observed across the autism spectrum (AS) and atypicalities in coordination are more commonly observed in AS individuals without delayed speech (DSM‐IV Asperger) than in those with atypical or delayed speech onset. However, few studies have provided quantitative data to support these mostly clinical observations. Here, we compared perceptual and motor performance between 30 typically developing and AS individuals (21 with speech delay and 18 without speech delay) to examine the associations between limb movement control and atypical speech development. Groups were matched for age, intelligence, and sex. The experimental design included: an inspection time task, which measures visual processing speed; the Purdue Pegboard, which measures finger dexterity, bimanual performance, and hand‐eye coordination; the Annett Peg Moving Task, which measures unimanual goal‐directed arm movement; and a simple reaction time task. We used analysis of covariance to investigate group differences in task performance and linear regression models to explore potential associations between intelligence, language skills, simple reaction time, and visually guided movement performance. AS participants without speech delay performed slower than typical participants in the Purdue Pegboard subtests. AS participants without speech delay showed poorer bimanual coordination than those with speech delay. Visual processing speed was slightly faster in both AS groups than in the typical group. Altogether, these results suggest that AS individuals with and without speech delay differ in visually guided and visually triggered behavior and show that early language skills are associated with slower movement in simple and complex motor tasks. Autism Res 2015, 8: 682–693. © 2015 The Authors Autism Research published by Wiley Periodicals, Inc. on behalf of International Society for Autism Research PMID:25820662

  11. Cochlea-scaled spectral entropy predicts rate-invariant intelligibility of temporally distorted sentences.

    PubMed

    Stilp, Christian E; Kiefte, Michael; Alexander, Joshua M; Kluender, Keith R

    2010-10-01

    Some evidence, mostly drawn from experiments using only a single moderate rate of speech, suggests that low-frequency amplitude modulations may be particularly important for intelligibility. Here, two experiments investigated intelligibility of temporally distorted sentences across a wide range of simulated speaking rates, and two metrics were used to predict results. Sentence intelligibility was assessed when successive segments of fixed duration were temporally reversed (exp. 1), and when sentences were processed through four third-octave-band filters, the outputs of which were desynchronized (exp. 2). For both experiments, intelligibility decreased with increasing distortion. However, in exp. 2, intelligibility recovered modestly with longer desynchronization. Across conditions, performances measured as a function of proportion of utterance distorted converged to a common function. Estimates of intelligibility derived from modulation transfer functions predict a substantial proportion of the variance in listeners' responses in exp. 1, but fail to predict performance in exp. 2. By contrast, a metric of potential information, quantified as relative dissimilarity (change) between successive cochlear-scaled spectra, is introduced. This metric reliably predicts listeners' intelligibility across the full range of speaking rates in both experiments. Results support an information-theoretic approach to speech perception and the significance of spectral change rather than physical units of time.
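
    A minimal sketch of the dissimilarity metric, assuming band levels of cochlea-scaled spectra have already been computed for each analysis frame; the plain Euclidean distance and the toy input are illustrative stand-ins for the authors' exact analysis.

        import numpy as np

        def spectral_change(frames):
            # frames: (n_frames, n_bands) array, one cochlea-scaled spectrum per row.
            diffs = np.diff(frames, axis=0)          # change between successive spectra
            return float(np.linalg.norm(diffs, axis=1).mean())

        rng = np.random.default_rng(0)
        toy = rng.normal(size=(100, 33))             # stand-in spectral slices
        print(spectral_change(toy))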

  12. Acoustic landmarks drive delta-theta oscillations to enable speech comprehension by facilitating perceptual parsing

    PubMed Central

    Doelling, Keith; Arnal, Luc; Ghitza, Oded; Poeppel, David

    2013-01-01

    A growing body of research suggests that intrinsic neuronal slow (< 10 Hz) oscillations in auditory cortex appear to track incoming speech and other spectro-temporally complex auditory signals. Within this framework, several recent studies have identified critical-band temporal envelopes as the specific acoustic feature being reflected by the phase of these oscillations. However, how this alignment between speech acoustics and neural oscillations might underpin intelligibility is unclear. Here we test the hypothesis that the ‘sharpness’ of temporal fluctuations in the critical band envelope acts as a temporal cue to speech syllabic rate, driving delta-theta rhythms to track the stimulus and facilitate intelligibility. We interpret our findings as evidence that sharp events in the stimulus cause cortical rhythms to re-align and parse the stimulus into syllable-sized chunks for further decoding. Using magnetoencephalographic recordings, we show that by removing temporal fluctuations that occur at the syllabic rate, envelope-tracking activity is reduced. By artificially reinstating these temporal fluctuations, envelope-tracking activity is regained. These changes in tracking correlate with intelligibility of the stimulus. Together, the results suggest that the sharpness of fluctuations in the stimulus, as reflected in the cochlear output, drive oscillatory activity to track and entrain to the stimulus, at its syllabic rate. This process likely facilitates parsing of the stimulus into meaningful chunks appropriate for subsequent decoding, enhancing perception and intelligibility. PMID:23791839

  13. Effects of social cognitive impairment on speech disorder in schizophrenia.

    PubMed

    Docherty, Nancy M; McCleery, Amanda; Divilbiss, Marielle; Schumann, Emily B; Moe, Aubrey; Shakeel, Mohammed K

    2013-05-01

    Disordered speech in schizophrenia impairs social functioning because it impedes communication with others. Treatment approaches targeting this symptom have been limited by an incomplete understanding of its causes. This study examined the process underpinnings of speech disorder, assessed in terms of communication failure. Contributions of impairments in 2 social cognitive abilities, emotion perception and theory of mind (ToM), to speech disorder were assessed in 63 patients with schizophrenia or schizoaffective disorder and 21 nonpsychiatric participants, after controlling for the effects of verbal intelligence and impairments in basic language-related neurocognitive abilities. After removal of the effects of the neurocognitive variables, impairments in emotion perception and ToM each explained additional variance in speech disorder in the patients but not the controls. The neurocognitive and social cognitive variables, taken together, explained 51% of the variance in speech disorder in the patients. Schizophrenic disordered speech may be less a concomitant of "positive" psychotic process than of illness-related limitations in neurocognitive and social cognitive functioning.

  14. The effect of varying talker identity and listening conditions on gaze behavior during audiovisual speech perception.

    PubMed

    Buchan, Julie N; Paré, Martin; Munhall, Kevin G

    2008-11-25

    During face-to-face conversation the face provides auditory and visual linguistic information, and also conveys information about the identity of the speaker. This study investigated behavioral strategies involved in gathering visual information while watching talking faces. The effects of varying talker identity and varying the intelligibility of speech (by adding acoustic noise) on gaze behavior were measured with an eyetracker. Varying the intelligibility of the speech by adding noise had a noticeable effect on the location and duration of fixations. When noise was present subjects adopted a vantage point that was more centralized on the face by reducing the frequency of the fixations on the eyes and mouth and lengthening the duration of their gaze fixations on the nose and mouth. Varying talker identity resulted in a more modest change in gaze behavior that was modulated by the intelligibility of the speech. Although subjects generally used similar strategies to extract visual information in both talker variability conditions, when noise was absent there were more fixations on the mouth when viewing a different talker every trial as opposed to the same talker every trial. These findings provide a useful baseline for studies examining gaze behavior during audiovisual speech perception and perception of dynamic faces.

  15. No association between prenatal exposure to psychotropics and intelligence at age five.

    PubMed

    Eriksen, Hanne-Lise Falgreen; Kesmodel, Ulrik Schiøler; Pedersen, Lars Henning; Mortensen, Erik Lykke

    2015-05-01

    To examine associations between prenatal exposure to selective serotonin reuptake inhibitors (SSRIs)/anxiolytics and intelligence assessed with a standard clinical intelligence test at age 5 years. Longitudinal follow-up study. Denmark, 2003-2008. A total of 1780 women and their children sampled from the Danish National Birth Cohort. Self-reported information on use of SSRI and anxiolytics was obtained from the Danish National Birth Cohort at the time of consent and from two prenatal interviews. Intelligence was assessed at age 5 years, and parental education, maternal intelligence quotient (IQ), maternal smoking and alcohol consumption in pregnancy, the child's age at testing, sex, and tester were included in the full model. The IQ of 13 medication-exposed children was compared with the IQ of 19 children whose mothers had untreated depression and 1748 control children. Wechsler Preschool and Primary Scale of Intelligence - Revised. In unadjusted analyses, children of mothers who used antidepressants or anxiolytics during pregnancy had higher verbal IQ; this association, however, was no longer significant after adjustment for potentially confounding maternal and child factors. No consistent associations between IQ and fetal exposure to antidepressants and anxiolytics were observed, but the study had low statistical power, and there is an obvious need to conduct long-term follow-up studies with comprehensive cognitive assessment and sufficiently large samples of adolescent or adult offspring. © 2015 Nordic Federation of Societies of Obstetrics and Gynecology.

  16. Cross-language differences in the brain network subserving intelligible speech.

    PubMed

    Ge, Jianqiao; Peng, Gang; Lyu, Bingjiang; Wang, Yi; Zhuo, Yan; Niu, Zhendong; Tan, Li Hai; Leff, Alexander P; Gao, Jia-Hong

    2015-03-10

    How is language processed in the brain by native speakers of different languages? Is there one brain system for all languages or are different languages subserved by different brain systems? The first view emphasizes commonality, whereas the second emphasizes specificity. We investigated the cortical dynamics involved in processing two very diverse languages: a tonal language (Chinese) and a nontonal language (English). We used functional MRI and dynamic causal modeling analysis to compute and compare brain network models exhaustively with all possible connections among nodes of language regions in temporal and frontal cortex and found that the information flow from the posterior to anterior portions of the temporal cortex was commonly shared by Chinese and English speakers during speech comprehension, whereas the inferior frontal gyrus received neural signals from the left posterior portion of the temporal cortex in English speakers and from the bilateral anterior portion of the temporal cortex in Chinese speakers. Our results revealed that, although speech processing is largely carried out in the common left hemisphere classical language areas (Broca's and Wernicke's areas) and anterior temporal cortex, speech comprehension across different language groups depends on how these brain regions interact with each other. Moreover, the right anterior temporal cortex, which is crucial for tone processing, is equally important as its left homolog, the left anterior temporal cortex, in modulating the cortical dynamics in tone language comprehension. The current study pinpoints the importance of the bilateral anterior temporal cortex in language comprehension that is downplayed or even ignored by popular contemporary models of speech comprehension.

  17. Cross-language differences in the brain network subserving intelligible speech

    PubMed Central

    Ge, Jianqiao; Peng, Gang; Lyu, Bingjiang; Wang, Yi; Zhuo, Yan; Niu, Zhendong; Tan, Li Hai; Leff, Alexander P.; Gao, Jia-Hong

    2015-01-01

    How is language processed in the brain by native speakers of different languages? Is there one brain system for all languages or are different languages subserved by different brain systems? The first view emphasizes commonality, whereas the second emphasizes specificity. We investigated the cortical dynamics involved in processing two very diverse languages: a tonal language (Chinese) and a nontonal language (English). We used functional MRI and dynamic causal modeling analysis to compute and compare brain network models exhaustively with all possible connections among nodes of language regions in temporal and frontal cortex and found that the information flow from the posterior to anterior portions of the temporal cortex was commonly shared by Chinese and English speakers during speech comprehension, whereas the inferior frontal gyrus received neural signals from the left posterior portion of the temporal cortex in English speakers and from the bilateral anterior portion of the temporal cortex in Chinese speakers. Our results revealed that, although speech processing is largely carried out in the common left hemisphere classical language areas (Broca’s and Wernicke’s areas) and anterior temporal cortex, speech comprehension across different language groups depends on how these brain regions interact with each other. Moreover, the right anterior temporal cortex, which is crucial for tone processing, is equally important as its left homolog, the left anterior temporal cortex, in modulating the cortical dynamics in tone language comprehension. The current study pinpoints the importance of the bilateral anterior temporal cortex in language comprehension that is downplayed or even ignored by popular contemporary models of speech comprehension. PMID:25713366

  18. Automatic Speech Recognition from Neural Signals: A Focused Review.

    PubMed

    Herff, Christian; Schultz, Tanja

    2016-01-01

    Speech interfaces have become widely accepted and are nowadays integrated in various real-life applications and devices. They have become a part of our daily life. However, speech interfaces presume the ability to produce intelligible speech, which might be impossible due to loud environments, the need not to disturb bystanders, or an inability to produce speech (e.g., patients suffering from locked-in syndrome). For these reasons, it would be highly desirable not to speak at all, but simply to imagine saying words or sentences. Interfaces based on imagined speech would enable fast and natural communication without the need for audible speech and would give a voice to otherwise mute people. This focused review analyzes the potential of different brain imaging techniques to recognize speech from neural signals by applying Automatic Speech Recognition technology. We argue that modalities based on metabolic processes, such as functional Near Infrared Spectroscopy and functional Magnetic Resonance Imaging, are less suited for Automatic Speech Recognition from neural signals due to low temporal resolution but are very useful for the investigation of the underlying neural mechanisms involved in speech processes. In contrast, electrophysiologic activity is fast enough to capture speech processes and is therefore better suited for ASR. Our experimental results indicate the potential of these signals for speech recognition from neural data with a focus on invasively measured brain activity (electrocorticography). As a first example of Automatic Speech Recognition techniques applied to neural signals, we discuss the Brain-to-text system.

  19. Modelling Errors in Automatic Speech Recognition for Dysarthric Speakers

    NASA Astrophysics Data System (ADS)

    Caballero Morales, Santiago Omar; Cox, Stephen J.

    2009-12-01

    Dysarthria is a motor speech disorder characterized by weakness, paralysis, or poor coordination of the muscles responsible for speech. Although automatic speech recognition (ASR) systems have been developed for disordered speech, factors such as low intelligibility and limited phonemic repertoire decrease speech recognition accuracy, making conventional speaker adaptation algorithms perform poorly on dysarthric speakers. In this work, rather than adapting the acoustic models, we model the errors made by the speaker and attempt to correct them. For this task, two techniques have been developed: (1) a set of "metamodels" that incorporate a model of the speaker's phonetic confusion matrix into the ASR process; (2) a cascade of weighted finite-state transducers at the confusion matrix, word, and language levels. Both techniques attempt to correct the errors made at the phonetic level and make use of a language model to find the best estimate of the correct word sequence. Our experiments show that both techniques outperform standard adaptation techniques.
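
    The metamodel idea can be caricatured as a noisy-channel decoder: a speaker-specific phoneme confusion matrix links intended phones to recognized phones, and a language model weighs the candidate words. The toy lexicon, confusion probabilities, and word priors below are invented purely for illustration.

        lexicon = {'bat': ['b', 'ae', 't'], 'pat': ['p', 'ae', 't']}
        prior = {'bat': 0.5, 'pat': 0.5}     # toy language model
        # confusion[intended][observed] = P(observed phone | intended phone)
        confusion = {'b': {'b': 0.6, 'p': 0.4}, 'p': {'p': 0.9, 'b': 0.1},
                     'ae': {'ae': 1.0}, 't': {'t': 1.0}}

        def decode(observed):
            scores = {}
            for word, phones in lexicon.items():
                if len(phones) != len(observed):     # toy: no insertions/deletions
                    continue
                p = prior[word]
                for intended, obs in zip(phones, observed):
                    p *= confusion[intended].get(obs, 0.0)
                scores[word] = p
            return max(scores, key=scores.get)

        print(decode(['p', 'ae', 't']))   # 'pat'; a speaker whose 'b' is often
                                          # heard as 'p' would shift these odds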

  20. Acceptable range of speech level in noisy sound fields for young adults and elderly persons.

    PubMed

    Sato, Hayato; Morimoto, Masayuki; Ota, Ryo

    2011-09-01

    The acceptable range of speech level as a function of background noise level was investigated on the basis of word intelligibility scores and listening difficulty ratings. In the present study, the acceptable range is defined as the range that maximizes word intelligibility scores and simultaneously does not cause a significant increase in listening difficulty ratings from the minimum ratings. Listening tests with young adult and elderly listeners demonstrated the following. (1) The acceptable range of speech level for elderly listeners overlapped that for young listeners. (2) The lower limit of the acceptable speech level for both young and elderly listeners was 65 dB (A-weighted) for noise levels of 40 and 45 dB (A-weighted), a level with a speech-to-noise ratio of +15 dB for noise levels of 50 and 55 dB, and a level with a speech-to-noise ratio of +10 dB for noise levels from 60 to 70 dB. (3) The upper limit of the acceptable speech level for both young and elderly listeners was 80 dB for noise levels from 40 to 55 dB and 85 dB or above for noise levels from 55 to 70 dB. © 2011 Acoustical Society of America
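
    The reported limits lend themselves to a direct piecewise encoding. The sketch below reproduces the stated rules at the tested 5-dB noise steps; the abstract lists 55 dB noise in both upper-limit ranges, so the higher value is assumed there, and no interpolation between steps is implied.

        def acceptable_speech_range(noise_db):
            # Lower limit of acceptable speech level (A-weighted dB).
            if noise_db in (40, 45):
                lower = 65.0
            elif noise_db in (50, 55):
                lower = noise_db + 15.0
            elif noise_db in (60, 65, 70):
                lower = noise_db + 10.0
            else:
                raise ValueError('outside the tested 40-70 dB range')
            upper = 80.0 if noise_db <= 50 else 85.0
            return lower, upper

        print(acceptable_speech_range(55))   # (70.0, 85.0)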

  1. Enhancing Computer-Based Lessons for Effective Speech Education.

    ERIC Educational Resources Information Center

    Hemphill, Michael R.; Standerfer, Christina C.

    1987-01-01

    Assesses the advantages of computer-based instruction on speech education. Concludes that, while it offers tremendous flexibility to the instructor--especially in dynamic lesson design, feedback, graphics, and artificial intelligence--there is no inherent advantage to the use of computer technology in the classroom, unless the student interacts…

  2. Lip movements entrain the observers’ low-frequency brain oscillations to facilitate speech intelligibility

    PubMed Central

    Park, Hyojin; Kayser, Christoph; Thut, Gregor; Gross, Joachim

    2016-01-01

    During continuous speech, lip movements provide visual temporal signals that facilitate speech processing. Here, using MEG we directly investigated how these visual signals interact with rhythmic brain activity in participants listening to and seeing the speaker. First, we investigated coherence between oscillatory brain activity and speaker’s lip movements and demonstrated significant entrainment in visual cortex. We then used partial coherence to remove contributions of the coherent auditory speech signal from the lip-brain coherence. Comparing this synchronization between different attention conditions revealed that attending visual speech enhances the coherence between activity in visual cortex and the speaker’s lips. Further, we identified a significant partial coherence between left motor cortex and lip movements and this partial coherence directly predicted comprehension accuracy. Our results emphasize the importance of visually entrained and attention-modulated rhythmic brain activity for the enhancement of audiovisual speech processing. DOI: http://dx.doi.org/10.7554/eLife.14521.001 PMID:27146891

  3. Cognitive load during speech perception in noise: the influence of age, hearing loss, and cognition on the pupil response.

    PubMed

    Zekveld, Adriana A; Kramer, Sophia E; Festen, Joost M

    2011-01-01

    The aim of the present study was to evaluate the influence of age, hearing loss, and cognitive ability on the cognitive processing load during listening to speech presented in noise. Cognitive load was assessed by means of pupillometry (i.e., examination of pupil dilation), supplemented with subjective ratings. Two groups of subjects participated: 38 middle-aged participants (mean age = 55 yrs) with normal hearing and 36 middle-aged participants (mean age = 61 yrs) with hearing loss. Using three Speech Reception Threshold (SRT) in stationary noise tests, we estimated the speech-to-noise ratios (SNRs) required for the correct repetition of 50%, 71%, or 84% of the sentences (SRT50%, SRT71%, and SRT84%, respectively). We examined the pupil response during listening: the peak amplitude, the peak latency, the mean dilation, and the pupil response duration. For each condition, participants rated the experienced listening effort and estimated their performance level. Participants also performed the Text Reception Threshold (TRT) test, a test of processing speed, and a word vocabulary test. Data were compared with previously published data from young participants with normal hearing. Hearing loss was related to relatively poor SRTs, and higher speech intelligibility was associated with lower effort and higher performance ratings. For listeners with normal hearing, increasing age was associated with poorer TRTs and slower processing speed but with larger word vocabulary. A multivariate repeated-measures analysis of variance indicated main effects of group and SNR and an interaction effect between these factors on the pupil response. The peak latency was relatively short and the mean dilation was relatively small at low intelligibility levels for the middle-aged groups, whereas the reverse was observed for high intelligibility levels. The decrease in the pupil response as a function of increasing SNR was relatively small for the listeners with hearing loss. Spearman

  4. Prediction and constraint in audiovisual speech perception.

    PubMed

    Peelle, Jonathan E; Sommers, Mitchell S

    2015-07-01

    During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing the precision of prediction. Electrophysiological studies demonstrate that oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to acoustic information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. Thus, audiovisual speech perception is accomplished through multiple stages of integration

  5. Prediction and constraint in audiovisual speech perception

    PubMed Central

    Peelle, Jonathan E.; Sommers, Mitchell S.

    2015-01-01

    During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing precision of prediction. Electrophysiological studies demonstrate oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to auditory information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. Thus, audiovisual speech perception is accomplished through multiple stages of integration, supported

  6. Individual differences in speech-in-noise perception parallel neural speech processing and attention in preschoolers

    PubMed Central

    Thompson, Elaine C.; Carr, Kali Woodruff; White-Schwoch, Travis; Otto-Meyer, Sebastian; Kraus, Nina

    2016-01-01

    From bustling classrooms to unruly lunchrooms, school settings are noisy. To learn effectively in the unwelcome company of numerous distractions, children must clearly perceive speech in noise. In older children and adults, speech-in-noise perception is supported by sensory and cognitive processes, but the correlates underlying this critical listening skill in young children (3–5 year olds) remain undetermined. Employing a longitudinal design (two evaluations separated by ~12 months), we followed a cohort of 59 preschoolers, ages 3.0–4.9, assessing word-in-noise perception, cognitive abilities (intelligence, short-term memory, attention), and neural responses to speech. Results reveal changes in word-in-noise perception parallel changes in processing of the fundamental frequency (F0), an acoustic cue known for playing a role central to speaker identification and auditory scene analysis. Four unique developmental trajectories (speech-in-noise perception groups) confirm this relationship, in that improvements and declines in word-in-noise perception couple with enhancements and diminishments of F0 encoding, respectively. Improvements in word-in-noise perception also pair with gains in attention. Word-in-noise perception does not relate to strength of neural harmonic representation or short-term memory. These findings reinforce previously-reported roles of F0 and attention in hearing speech in noise in older children and adults, and extend this relationship to preschool children. PMID:27864051

  7. High-frequency energy in singing and speech

    NASA Astrophysics Data System (ADS)

    Monson, Brian Bruce

    While human speech and the human voice generate acoustical energy up to (and beyond) 20 kHz, the energy above approximately 5 kHz has been largely neglected. Evidence is accruing that this high-frequency energy contains perceptual information relevant to speech and voice, including percepts of quality, localization, and intelligibility. The present research was an initial step in the long-range goal of characterizing high-frequency energy in singing voice and speech, with particular regard for its perceptual role and its potential for modification during voice and speech production. In this study, a database of high-fidelity recordings of talkers was created and used for a broad acoustical analysis and general characterization of high-frequency energy, as well as specific characterization of phoneme category, voice and speech intensity level, and mode of production (speech versus singing) by high-frequency energy content. Directionality of radiation of high-frequency energy from the mouth was also examined. The recordings were used for perceptual experiments wherein listeners were asked to discriminate between speech and voice samples that differed only in high-frequency energy content. Listeners were also subjected to gender discrimination tasks, mode-of-production discrimination tasks, and transcription tasks with samples of speech and singing that contained only high-frequency content. The combination of these experiments has revealed that (1) human listeners are able to detect very subtle level changes in high-frequency energy, and (2) human listeners are able to extract significant perceptual information from high-frequency energy.
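
    As a simple way to quantify how much of a recording's energy lies in the neglected band, the sketch below computes the proportion of FFT power above a 5-kHz edge; the edge frequency follows the text, and the sampling rate must exceed 10 kHz for the band to exist at all.

        import numpy as np

        def high_frequency_energy_ratio(x, fs, edge_hz=5000.0):
            power = np.abs(np.fft.rfft(x)) ** 2            # FFT power spectrum
            freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
            return float(power[freqs >= edge_hz].sum() / (power.sum() + 1e-12))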

  8. Development of Bone-Conducted Ultrasonic Hearing Aid for the Profoundly Deaf: Assessments of the Modulation Type with Regard to Intelligibility and Sound Quality

    NASA Astrophysics Data System (ADS)

    Nakagawa, Seiji; Fujiyuki, Chika; Kagomiya, Takayuki

    2012-07-01

    Bone-conducted ultrasound (BCU) is perceived even by the profoundly sensorineural deaf. A novel hearing aid using the perception of amplitude-modulated BCU (BCU hearing aid: BCUHA) has been developed; however, further improvements are needed, especially in terms of articulation and sound quality. In this study, the intelligibility and sound quality of BCU speech with several types of amplitude modulation [double-sideband with transmitted carrier (DSB-TC), double-sideband with suppressed carrier (DSB-SC), and transposed modulation] were evaluated. The results showed that DSB-TC and transposed speech were more intelligible than DSB-SC speech, and transposed speech was closer than the other types of BCU speech to air-conducted speech in terms of sound quality. These results provide useful information for further development of the BCUHA.
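
    The two double-sideband schemes compared above differ only in whether the carrier is transmitted; a minimal sketch, assuming a speech signal x sampled at fs and an ultrasonic carrier fc (fs must exceed twice fc, the modulation depth m is an illustrative value, and the transposed scheme is omitted here).

        import numpy as np

        def dsb_tc(x, fs, fc, m=0.8):
            # Double-sideband, transmitted carrier: carrier persists in silence.
            t = np.arange(len(x)) / fs
            xn = x / (np.max(np.abs(x)) + 1e-12)
            return (1.0 + m * xn) * np.cos(2 * np.pi * fc * t)

        def dsb_sc(x, fs, fc):
            # Double-sideband, suppressed carrier: output vanishes where x is zero.
            t = np.arange(len(x)) / fs
            return x * np.cos(2 * np.pi * fc * t)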

  9. A new time-adaptive discrete bionic wavelet transform for enhancing speech from adverse noise environment

    NASA Astrophysics Data System (ADS)

    Palaniswamy, Sumithra; Duraisamy, Prakash; Alam, Mohammad Showkat; Yuan, Xiaohui

    2012-04-01

    Automatic speech processing systems are widely used in everyday life such as mobile communication, speech and speaker recognition, and for assisting the hearing impaired. In speech communication systems, the quality and intelligibility of speech is of utmost importance for ease and accuracy of information exchange. To obtain a speech signal that is intelligible and more pleasant to listen to, noise reduction is essential. In this paper a new Time Adaptive Discrete Bionic Wavelet Thresholding (TADBWT) scheme is proposed. The proposed technique uses the Daubechies mother wavelet to achieve better enhancement of speech from additive non-stationary noises that occur in real life, such as street noise and factory noise. Due to the integration of a human auditory system model into the wavelet transform, the bionic wavelet transform (BWT) has great potential for speech enhancement which may lead to a new path in speech processing. In the proposed technique, at first, discrete BWT is applied to noisy speech to derive TADBWT coefficients. Then the adaptive nature of the BWT is captured by introducing a time varying linear factor which updates the coefficients at each scale over time. This approach has shown better performance than the existing algorithms at lower input SNR due to modified soft level dependent thresholding on time adaptive coefficients. The objective and subjective test results confirmed the competency of the TADBWT technique. The effectiveness of the proposed technique is also evaluated for a speaker recognition task in a noisy environment. The recognition results show that the TADBWT technique yields better performance when compared to alternate methods, specifically at lower input SNR.

  10. Typical versus delayed speech onset influences verbal reporting of autistic interests.

    PubMed

    Chiodo, Liliane; Majerus, Steve; Mottron, Laurent

    2017-01-01

    The distinction between autism and Asperger syndrome has been abandoned in the DSM-5. However, this clinical categorization largely overlaps with the presence or absence of a speech onset delay, which is associated with clinical, cognitive, and neural differences. It is unknown whether these different speech development pathways and associated cognitive differences are involved in the heterogeneity of the restricted interests that characterize autistic adults. This study tested the hypothesis that speech onset delay, or conversely, early mastery of speech, orients the nature and verbal reporting of adult autistic interests. The occurrence of a priori defined descriptors for perceptual and thematic dimensions, as well as the perceived function and benefits, was determined in the responses of autistic people to a semi-structured interview on their intense interests. The number of words, grammatical categories, and proportion of perceptual/thematic descriptors were computed and compared between groups by variance analyses. The participants comprised 40 autistic adults grouped according to the presence (N = 20) or absence (N = 20) of speech onset delay, as well as 20 non-autistic adults, also with intense interests, matched for non-verbal intelligence using Raven's Progressive Matrices. The overall nature, function, and benefit of intense interests were similar across autistic subgroups, and between autistic and non-autistic groups. However, autistic participants with a history of speech onset delay used more perceptual than thematic descriptors when talking about their interests, whereas the opposite was true for autistic individuals without speech onset delay. This finding remained significant after controlling for linguistic differences observed between the two groups. Verbal reporting, but not the nature or positive function, of intense interests differed between adult autistic individuals depending on their speech acquisition history: oral reporting of

  11. The Speech Intelligibility Index and the pure-tone average as predictors of lexical ability in children fit with hearing aids.

    PubMed

    Stiles, Derek J; Bentler, Ruth A; McGregor, Karla K

    2012-06-01

    To determine whether a clinically obtainable measure of audibility, the aided Speech Intelligibility Index (SII; American National Standards Institute, 2007), is more sensitive than the pure-tone average (PTA) at predicting the lexical abilities of children who wear hearing aids (CHA). School-age CHA and age-matched children with normal hearing (CNH) repeated words and nonwords, learned novel words, and completed a standardized receptive vocabulary test. Analyses of covariance allowed comparison of the 2 groups. For CHA, regression analyses determined whether SII held predictive value over and beyond PTA. CHA demonstrated poorer performance than CNH on tests of word and nonword repetition and receptive vocabulary. Groups did not differ on word learning. Aided SII was a stronger predictor of word and nonword repetition and receptive vocabulary than PTA. After accounting for PTA, aided SII remained a significant predictor of nonword repetition and receptive vocabulary. Despite wearing hearing aids, CHA performed more poorly on 3 of 4 lexical measures. Individual differences among CHA were predicted by aided SII. Unlike PTA, aided SII incorporates hearing aid amplification characteristics and speech-frequency weightings and may provide a more valid estimate of the child's access to and ability to learn from auditory input in real-world environments.
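
    At its core, the SII is band audibility weighted by band importance; the sketch below shows that core in simplified form. The clamped-SNR audibility mapping follows the general ANSI S3.5 scheme, but the five-band layout and importance weights here are invented placeholders rather than the standard values.

        import numpy as np

        def simple_sii(speech_db, noise_db, importance):
            snr = np.clip(np.asarray(speech_db) - np.asarray(noise_db), -15.0, 15.0)
            audibility = (snr + 15.0) / 30.0      # map [-15, +15] dB to [0, 1]
            w = np.asarray(importance, dtype=float)
            return float(np.sum(w / w.sum() * audibility))

        # Example: aided speech spectrum vs. noise floor in five bands.
        print(simple_sii([55, 60, 58, 50, 40], [40, 42, 45, 45, 44],
                         [0.15, 0.25, 0.25, 0.2, 0.15]))   # ~0.82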

  12. Neural Spike-Train Analyses of the Speech-Based Envelope Power Spectrum Model

    PubMed Central

    Rallapalli, Varsha H.

    2016-01-01

    Diagnosing and treating hearing impairment is challenging because people with similar degrees of sensorineural hearing loss (SNHL) often have different speech-recognition abilities. The speech-based envelope power spectrum model (sEPSM) has demonstrated that the signal-to-noise ratio (SNRenv) from a modulation filter bank provides a robust speech-intelligibility measure across a wider range of degraded conditions than many long-standing models. In the sEPSM, noise (N) is assumed to: (a) reduce S + N envelope power by filling in dips within clean speech (S) and (b) introduce an envelope noise floor from intrinsic fluctuations in the noise itself. While the promise of SNRenv has been demonstrated for normal-hearing listeners, it has not been thoroughly extended to hearing-impaired listeners because of limited physiological knowledge of how SNHL affects speech-in-noise envelope coding relative to noise alone. Here, envelope coding to speech-in-noise stimuli was quantified from auditory-nerve model spike trains using shuffled correlograms, which were analyzed in the modulation-frequency domain to compute modulation-band estimates of neural SNRenv. Preliminary spike-train analyses show strong similarities to the sEPSM, demonstrating feasibility of neural SNRenv computations. Results suggest that individual differences can occur based on differential degrees of outer- and inner-hair-cell dysfunction in listeners currently diagnosed into the single audiological SNHL category. The predicted acoustic-SNR dependence in individual differences suggests that the SNR-dependent rate of susceptibility could be an important metric in diagnosing individual differences. Future measurements of the neural SNRenv in animal studies with various forms of SNHL will provide valuable insight for understanding individual differences in speech-in-noise intelligibility.
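
    At a single modulation band, the SNRenv computation can be sketched as the excess envelope power of the noisy speech over that of the noise alone, normalized by the noise envelope power. The Hilbert-envelope extraction and the single 4-16 Hz modulation band below are simplifying assumptions, not the full sEPSM front end.

        import numpy as np
        from scipy.signal import butter, sosfiltfilt, hilbert

        def env_power(x, fs, mod_lo, mod_hi):
            env = np.abs(hilbert(x))             # temporal envelope
            env = env - env.mean()               # keep the AC component
            sos = butter(2, [mod_lo, mod_hi], btype='bandpass', fs=fs, output='sos')
            return float(np.mean(sosfiltfilt(sos, env) ** 2))

        def snr_env(speech_plus_noise, noise, fs, mod_lo=4.0, mod_hi=16.0):
            p_sn = env_power(speech_plus_noise, fs, mod_lo, mod_hi)
            p_n = env_power(noise, fs, mod_lo, mod_hi)
            return max(p_sn - p_n, 0.0) / (p_n + 1e-12)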

  13. Objective voice and speech analysis of persons with chronic hoarseness by prosodic analysis of speech samples.

    PubMed

    Haderlein, Tino; Döllinger, Michael; Matoušek, Václav; Nöth, Elmar

    2016-10-01

    Automatic voice assessment is often performed using sustained vowels. In contrast, speech analysis of read-out texts can be applied to voice and speech assessment. Automatic speech recognition and prosodic analysis were used to find regression formulae between automatic and perceptual assessment of four voice and four speech criteria. The regression was trained with 21 men and 62 women (average age 49.2 years) and tested with another set of 24 men and 49 women (48.3 years), all suffering from chronic hoarseness. They read the text 'Der Nordwind und die Sonne' ('The North Wind and the Sun'). Five voice and speech therapists evaluated the data on 5-point Likert scales. Ten prosodic and recognition accuracy measures (features) were identified which describe all the examined criteria. Inter-rater correlation within the expert group was between r = 0.63 for the criterion 'match of breath and sense units' and r = 0.87 for the overall voice quality. Human-machine correlation was between r = 0.40 for the match of breath and sense units and r = 0.82 for intelligibility. The perceptual ratings of different criteria were highly correlated with each other. Likewise, the feature sets modeling the criteria were very similar. The automatic method is suitable for assessing chronic hoarseness in general and for subgroups of functional and organic dysphonia. In its current version, it is almost as reliable as a randomly picked rater from a group of voice and speech therapists.

  14. Speech Segregation based on Binary Classification

    DTIC Science & Technology

    2016-07-15

    including the IBM, the target binary mask (TBM), the IRM, the short-time Fourier transform spectral magnitude (FFT-MAG) and its corresponding mask (FFT...complementary features and a fixed DNN as the discriminative learning machine. For evaluation metrics, besides SNR, we use the Short-Time Objective...target analysis is a recent successful intelligibility test conducted on both normal-hearing (NH) and hearing-impaired (HI) listeners. The speech

  15. Pitch-Based Segregation of Reverberant Speech

    DTIC Science & Technology

    2005-02-01

    speaker recognition in real environments, audio information retrieval and hearing prosthesis. Second, although binaural listening improves the...intelligibility of target speech under anechoic conditions (Bronkhorst, 2000), this binaural advantage is largely eliminated by reverberation (Plomp, 1976...Brown and Cooke, 1994; Wang and Brown, 1999; Hu and Wang, 2004) as well as in binaural separation (e.g., Roman et al., 2003; Palomaki et al., 2004

  16. Measurement of speech levels in the presence of time varying background noise

    NASA Technical Reports Server (NTRS)

    Pearsons, K. S.; Horonjeff, R.

    1982-01-01

    Short-term speech level measurements which could be used to note changes in vocal effort in a time varying noise environment were studied. Knowing the changes in speech level would in turn allow prediction of intelligibility in the presence of aircraft flyover noise. Tests indicated that it is possible to use two-second samples of speech to estimate long-term root mean square speech levels. Other tests were also performed in which people read out loud during aircraft flyover noise. Results of these tests indicate that people do indeed raise their voices during flyovers at a rate of about 3-1/2 dB for each 10 dB increase in background level. This finding is in agreement with other tests of speech levels in the presence of steady state background noise.
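
    Both findings translate directly into code: a level track built from two-second RMS samples, and a Lombard-style prediction of vocal effort using the reported rise of about 3.5 dB per 10 dB of background noise. The reference levels in the example are hypothetical.

        import numpy as np

        def short_term_levels(x, fs, win_s=2.0):
            # dB levels (re full scale) of consecutive two-second RMS samples.
            n = int(win_s * fs)
            frames = x[: len(x) // n * n].reshape(-1, n)
            rms = np.sqrt(np.mean(frames ** 2, axis=1))
            return 20.0 * np.log10(rms + 1e-12)

        def predicted_speech_level(base_speech_db, base_noise_db, noise_db):
            # ~3.5 dB more vocal effort per 10 dB of added background noise.
            return base_speech_db + 0.35 * (noise_db - base_noise_db)

        print(predicted_speech_level(65.0, 50.0, 70.0))   # 72.0 dB during a flyover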

  17. Blind speech separation system for humanoid robot with FastICA for audio filtering and separation

    NASA Astrophysics Data System (ADS)

    Budiharto, Widodo; Santoso Gunawan, Alexander Agung

    2016-07-01

    Nowadays, there are many developments in building intelligent humanoid robots, mainly in order to handle voice and image. In this research, we propose a blind speech separation system using FastICA for audio filtering and separation that can be used in education or entertainment. Our main problem is to separate multiple speech sources and also to filter out irrelevant noises. After the speech separation step, the results will be integrated with our previous speech and face recognition system, which is based on the Bioloid GP robot and a Raspberry Pi 2 as controller. The experimental results show the accuracy of our blind speech separation system is about 88% in command and query recognition cases.
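
    A minimal sketch of the separation step with scikit-learn's FastICA, using a synthetic two-by-two mixing matrix in place of the robot's real microphone recordings; as is inherent to ICA, the recovered sources come back in arbitrary order and scale.

        import numpy as np
        from sklearn.decomposition import FastICA

        fs = 16000
        t = np.arange(fs * 2) / fs
        s1 = np.sin(2 * np.pi * 5 * t)              # stand-ins for two talkers
        s2 = np.sign(np.sin(2 * np.pi * 3 * t))
        S = np.c_[s1, s2]
        A = np.array([[1.0, 0.5], [0.4, 1.0]])      # simulated mixing matrix
        X = S @ A.T                                 # two "microphone" channels

        ica = FastICA(n_components=2, random_state=0)
        estimated = ica.fit_transform(X)            # source estimates, up to
                                                    # permutation and scaling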

  18. Measuring the Effects of Reverberation and Noise on Sentence Intelligibility for Hearing-Impaired Listeners

    ERIC Educational Resources Information Center

    George, Erwin L. J.; Goverts, S. Theo; Festen, Joost M.; Houtgast, Tammo

    2010-01-01

    Purpose: The Speech Transmission Index (STI; Houtgast, Steeneken, & Plomp, 1980; Steeneken & Houtgast, 1980) is commonly used to quantify the adverse effects of reverberation and stationary noise on speech intelligibility for normal-hearing listeners. Duquesnoy and Plomp (1980) showed that the STI can be applied for presbycusic listeners, relating…

  19. Speech Impairment in Down Syndrome: A Review

    PubMed Central

    Kent, Ray D.; Vorperian, Houri K.

    2012-01-01

    Purpose This review summarizes research on disorders of speech production in Down Syndrome (DS) for the purposes of informing clinical services and guiding future research. Method Review of the literature was based on searches using Medline, Google Scholar, Psychinfo, and HighWire Press, as well as consideration of reference lists in retrieved documents (including online sources). Search terms emphasized functions related to voice, articulation, phonology, prosody, fluency and intelligibility. Conclusions The following conclusions pertain to four major areas of review: (a) Voice. Although a number of studies have been reported on vocal abnormalities in DS, major questions remain about the nature and frequency of the phonatory disorder. Results of perceptual and acoustic studies have been mixed, making it difficult to draw firm conclusions or even to identify sensitive measures for future study. (b) Speech sounds. Articulatory and phonological studies show that speech patterns in DS are a combination of delayed development and errors not seen in typical development. Delayed (i.e., developmental) and disordered (i.e., nondevelopmental) patterns are evident by the age of about 3 years, although DS-related abnormalities possibly appear earlier, even in infant babbling. (c) Fluency and prosody. Stuttering and/or cluttering occur in DS at rates of 10 to 45%, compared to about 1% in the general population. Research also points to significant disturbances in prosody. (d) Intelligibility. Studies consistently show marked limitations in this area but it is only recently that research goes beyond simple rating scales. PMID:23275397

  20. The Performance of Preschoolers with Speech/Language Disorders on the McCarthy Scales of Children's Abilities.

    ERIC Educational Resources Information Center

    Morgan, Robert L.; And Others

    1992-01-01

    Administered McCarthy Scales of Children's Abilities to preschool children of normal intelligence with (n=25) and without (n=25) speech/language disorders. Speech/language disorders group had significantly lower scores on all scales except Motor; showed difficulty in short-term auditory memory skills but not in visual memory skills; and had…

  1. A generalized time-frequency subtraction method for robust speech enhancement based on wavelet filter banks modeling of human auditory system.

    PubMed

    Shao, Yu; Chang, Chip-Hong

    2007-08-01

    We present a new speech enhancement scheme for a single-microphone system to meet the demand for quality noise reduction algorithms capable of operating at a very low signal-to-noise ratio. A psychoacoustic model is incorporated into the generalized perceptual wavelet denoising method to reduce the residual noise and improve the intelligibility of speech. The proposed method is a generalized time-frequency subtraction algorithm, which advantageously exploits the wavelet multirate signal representation to preserve the critical transient information. Simultaneous masking and temporal masking of the human auditory system are modeled by the perceptual wavelet packet transform via the frequency and temporal localization of speech components. The wavelet coefficients are used to calculate the Bark spreading energy and temporal spreading energy, from which a time-frequency masking threshold is deduced to adaptively adjust the subtraction parameters of the proposed method. An unvoiced speech enhancement algorithm is also integrated into the system to improve the intelligibility of speech. Through rigorous objective and subjective evaluations, it is shown that the proposed speech enhancement system is capable of reducing noise with little speech degradation in adverse noise environments and the overall performance is superior to several competitive methods.
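
    The authors' method operates in a perceptual wavelet-packet domain with masking-derived thresholds. As a much simpler stand-in from the same family of techniques, the sketch below performs classic magnitude spectral subtraction in the STFT domain, estimating the noise from the first few frames and applying a fixed over-subtraction factor and spectral floor; all parameter values are illustrative.

        import numpy as np
        from scipy.signal import stft, istft

        def spectral_subtract(noisy, fs, noise_frames=10, alpha=2.0, floor=0.05):
            f, t, Z = stft(noisy, fs=fs, nperseg=512)
            mag, phase = np.abs(Z), np.angle(Z)
            noise_mag = mag[:, :noise_frames].mean(axis=1, keepdims=True)
            clean = np.maximum(mag - alpha * noise_mag, floor * mag)
            _, x = istft(clean * np.exp(1j * phase), fs=fs, nperseg=512)
            return x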

  2. The Relationship Between Apraxia of Speech and Oral Apraxia: Association or Dissociation?

    PubMed

    Whiteside, Sandra P; Dyson, Lucy; Cowell, Patricia E; Varley, Rosemary A

    2015-11-01

    Acquired apraxia of speech (AOS) is a motor speech disorder that affects the implementation of articulatory gestures and the fluency and intelligibility of speech. Oral apraxia (OA) is an impairment of nonspeech volitional movement. Although many speakers with AOS also display difficulties with volitional nonspeech oral movements, the relationship between the 2 conditions is unclear. This study explored the relationship between speech and volitional nonspeech oral movement impairment in a sample of 50 participants with AOS. We examined levels of association and dissociation between speech and OA using a battery of nonspeech oromotor, speech, and auditory/aphasia tasks. There was evidence of a moderate positive association between the 2 impairments across participants. However, individual profiles revealed patterns of dissociation between the 2 in a few cases, with evidence of double dissociation of speech and oral apraxic impairment. We discuss the implications of these relationships for models of oral motor and speech control. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  3. Perceptions of University Instructors When Listening to International Student Speech

    ERIC Educational Resources Information Center

    Sheppard, Beth; Elliott, Nancy; Baese-Berk, Melissa

    2017-01-01

    Intensive English Program (IEP) Instructors and content faculty both listen to international students at the university. For these two groups of instructors, this study compared perceptions of international student speech by collecting comprehensibility ratings and transcription samples for intelligibility scores. No significant differences were…

  4. Individual differences in speech-in-noise perception parallel neural speech processing and attention in preschoolers.

    PubMed

    Thompson, Elaine C; Woodruff Carr, Kali; White-Schwoch, Travis; Otto-Meyer, Sebastian; Kraus, Nina

    2017-02-01

    From bustling classrooms to unruly lunchrooms, school settings are noisy. To learn effectively in the unwelcome company of numerous distractions, children must clearly perceive speech in noise. In older children and adults, speech-in-noise perception is supported by sensory and cognitive processes, but the correlates underlying this critical listening skill in young children (3- to 5-year-olds) remain undetermined. Employing a longitudinal design (two evaluations separated by ∼12 months), we followed a cohort of 59 preschoolers, ages 3.0-4.9, assessing word-in-noise perception, cognitive abilities (intelligence, short-term memory, attention), and neural responses to speech. Results reveal that changes in word-in-noise perception parallel changes in processing of the fundamental frequency (F0), an acoustic cue known to play a central role in speaker identification and auditory scene analysis. Four unique developmental trajectories (speech-in-noise perception groups) confirm this relationship, in that improvements and declines in word-in-noise perception couple with enhancements and diminishments of F0 encoding, respectively. Improvements in word-in-noise perception also pair with gains in attention. Word-in-noise perception does not relate to strength of neural harmonic representation or short-term memory. These findings reinforce previously reported roles of F0 and attention in hearing speech in noise in older children and adults, and extend this relationship to preschool children. Copyright © 2016 Elsevier B.V. All rights reserved.
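
    The neural F0 measure in this study is derived from responses to speech and is not reproduced here. Purely to illustrate the acoustic cue itself, here is a minimal autocorrelation-based F0 estimator; the sampling rate, search range, and synthetic test frame are assumptions for the example.

        import numpy as np

        def estimate_f0(frame, sr, fmin=75.0, fmax=400.0):
            """Estimate F0 of a voiced frame from the autocorrelation peak."""
            frame = frame - frame.mean()
            ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
            lo, hi = int(sr / fmax), int(sr / fmin)   # lag range to search
            lag = lo + np.argmax(ac[lo:hi])
            return sr / lag

        # A 200 Hz harmonic complex sampled at 16 kHz, 40 ms long.
        sr = 16000
        t = np.arange(int(0.04 * sr)) / sr
        frame = sum(np.sin(2 * np.pi * 200 * k * t) / k for k in range(1, 6))
        print(round(estimate_f0(frame, sr)))   # ~200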

  5. Auditory and Non-Auditory Contributions for Unaided Speech Recognition in Noise as a Function of Hearing Aid Use.

    PubMed

    Gieseler, Anja; Tahden, Maike A S; Thiel, Christiane M; Wagener, Kirsten C; Meis, Markus; Colonius, Hans

    2017-01-01

    Differences in understanding speech in noise among hearing-impaired individuals cannot be explained entirely by hearing thresholds alone, suggesting the contribution of other factors beyond standard auditory ones as derived from the audiogram. This paper reports two analyses addressing individual differences in the explanation of unaided speech-in-noise performance among n = 438 elderly hearing-impaired listeners (mean = 71.1 ± 5.8 years). The main analysis was designed to identify clinically relevant auditory and non-auditory measures for speech-in-noise prediction using auditory (audiogram, categorical loudness scaling) and cognitive tests (verbal-intelligence test, screening test of dementia), as well as questionnaires assessing various self-reported measures (health status, socio-economic status, and subjective hearing problems). Using stepwise linear regression analysis, 62% of the variance in unaided speech-in-noise performance was explained, with the measures Pure-tone average (PTA), Age, and Verbal intelligence emerging as the three most important predictors. In the complementary analysis, those individuals with the same hearing loss profile were separated into hearing aid users (HAU) and non-users (NU), and were then compared regarding potential differences in the test measures and in explaining unaided speech-in-noise recognition. The groupwise comparisons revealed significant differences in auditory measures and self-reported subjective hearing problems, while no differences in the cognitive domain were found. Furthermore, groupwise regression analyses revealed that Verbal intelligence had a predictive value in both groups, whereas Age and PTA emerged as significant only in the NU group.

  6. Age-related Effects on Word Recognition: Reliance on Cognitive Control Systems with Structural Declines in Speech-responsive Cortex

    PubMed Central

    Walczak, Adam; Ahlstrom, Jayne; Denslow, Stewart; Horwitz, Amy; Dubno, Judy R.

    2008-01-01

    Speech recognition can be difficult and effortful for older adults, even for those with normal hearing. Declining frontal lobe cognitive control has been hypothesized to cause age-related speech recognition problems. This study examined age-related changes in frontal lobe function for 15 clinically normal hearing adults (21–75 years) when they performed a word recognition task that was made challenging by decreasing word intelligibility. Although there were no age-related changes in word recognition, there were age-related changes in the degree of activity within left middle frontal gyrus (MFG) and anterior cingulate (ACC) regions during word recognition. Older adults engaged left MFG and ACC regions when words were most intelligible compared to younger adults who engaged these regions when words were least intelligible. Declining gray matter volume within temporal lobe regions responsive to word intelligibility significantly predicted left MFG activity, even after controlling for total gray matter volume, suggesting that declining structural integrity of brain regions responsive to speech leads to the recruitment of frontal regions when words are easily understood. Electronic supplementary material The online version of this article (doi:10.1007/s10162-008-0113-3) contains supplementary material, which is available to authorized users. PMID:18274825

  7. Effect of Three Classroom Listening Conditions on Speech Intelligibility

    ERIC Educational Resources Information Center

    Ross, Mark; Giolas, Thomas G.

    1971-01-01

    Speech discrimination scores for 13 deaf children were obtained in a classroom under: usual listening condition (hearing aid or not), binaural listening situation using auditory trainer/FM receiver with wireless microphone transmitter turned off, and binaural condition with inputs from auditory trainer/FM receiver and wireless microphone/FM…

  8. Can you hear me yet? An intracranial investigation of speech and non-speech audiovisual interactions in human cortex.

    PubMed

    Rhone, Ariane E; Nourski, Kirill V; Oya, Hiroyuki; Kawasaki, Hiroto; Howard, Matthew A; McMurray, Bob

    In everyday conversation, viewing a talker's face can provide information about the timing and content of an upcoming speech signal, resulting in improved intelligibility. Using electrocorticography, we tested whether human auditory cortex in Heschl's gyrus (HG) and on superior temporal gyrus (STG) and motor cortex on precentral gyrus (PreC) were responsive to visual/gestural information prior to the onset of sound and whether early stages of auditory processing were sensitive to the visual content (speech syllable versus non-speech motion). Event-related band power (ERBP) in the high gamma band was content-specific prior to acoustic onset on STG and PreC, and ERBP in the beta band differed in all three areas. Following sound onset, we found no evidence for content-specificity in HG, evidence for visual specificity in PreC, and specificity for both modalities in STG. These results support models of audio-visual processing in which sensory information is integrated in non-primary cortical areas.

  9. Sequential Organization and Room Reverberation for Speech Segregation

    DTIC Science & Technology

    2012-02-28

    We have proposed two algorithms for sequential organization: an unsupervised clustering algorithm applicable to monaural recordings and a binaural algorithm that integrates monaural and binaural analyses. In addition, we have conducted speech intelligibility tests that firmly establish the ... comprehensive version is currently under review for journal publication. A binaural approach in room reverberation: most existing approaches to binaural or ...

  10. Cortical activation patterns correlate with speech understanding after cochlear implantation

    PubMed Central

    Olds, Cristen; Pollonini, Luca; Abaya, Homer; Larky, Jannine; Loy, Megan; Bortfeld, Heather; Beauchamp, Michael S.; Oghalai, John S.

    2015-01-01

    Objectives Cochlear implants are a standard therapy for deafness, yet the ability of implanted patients to understand speech varies widely. To better understand this variability in outcomes, we used functional near-infrared spectroscopy (fNIRS) to image activity within regions of the auditory cortex and compare the results to behavioral measures of speech perception. Design We studied 32 deaf adults hearing through cochlear implants and 35 normal-hearing controls. We used fNIRS to measure responses within the lateral temporal lobe and the superior temporal gyrus to speech stimuli of varying intelligibility. The speech stimuli included normal speech, channelized speech (vocoded into 20 frequency bands), and scrambled speech (the 20 frequency bands were shuffled in random order). We also used environmental sounds as a control stimulus. Behavioral measures consisted of the Speech Reception Threshold, CNC words, and AzBio Sentence tests measured in quiet. Results Both control and implanted participants with good speech perception exhibited greater cortical activations to natural speech than to unintelligible speech. In contrast, implanted participants with poor speech perception had large, indistinguishable cortical activations to all stimuli. The ratio of cortical activation elicited by normal speech to that elicited by scrambled speech correlated directly with the CNC Words and AzBio Sentences scores. This pattern of cortical activation was not correlated with auditory threshold, age, side of implantation, or time after implantation. Turning off the implant reduced cortical activations in all implanted participants. Conclusions Together, these data indicate that the responses we measured within the lateral temporal lobe and the superior temporal gyrus correlate with behavioral measures of speech perception, demonstrating a neural basis for the variability in speech understanding outcomes after cochlear implantation. PMID:26709749

  11. Effects of Within-Talker Variability on Speech Intelligibility in Mandarin-Speaking Adult and Pediatric Cochlear Implant Patients

    PubMed Central

    Su, Qiaotong; Galvin, John J.; Zhang, Guoping; Li, Yongxin

    2016-01-01

    Cochlear implant (CI) speech performance is typically evaluated using well-enunciated speech produced at a normal rate by a single talker. CI users often have greater difficulty with variations in speech production encountered in everyday listening. Within a single talker, speaking rate, amplitude, duration, and voice pitch information may be quite variable, depending on the production context. The coarse spectral resolution afforded by the CI limits perception of voice pitch, which is an important cue for speech prosody and for tonal languages such as Mandarin Chinese. In this study, sentence recognition from the Mandarin speech perception database was measured in adult and pediatric Mandarin-speaking CI listeners for a variety of speaking styles: voiced speech produced at slow, normal, and fast speaking rates; whispered speech; voiced emotional speech; and voiced shouted speech. Recognition of Mandarin Hearing in Noise Test sentences was also measured. Results showed that performance was significantly poorer with whispered speech relative to the other speaking styles and that performance was significantly better with slow speech than with fast or emotional speech. Results also showed that adult and pediatric performance was significantly poorer with Mandarin Hearing in Noise Test than with Mandarin speech perception sentences at the normal rate. The results suggest that adult and pediatric Mandarin-speaking CI patients are highly susceptible to whispered speech, due to the lack of lexically important voice pitch cues and perhaps other qualities associated with whispered speech. The results also suggest that test materials may contribute to differences in performance observed between adult and pediatric CI users. PMID:27363714

  12. Nonlinear Frequency Compression in Hearing Aids: Impact on Speech and Language Development

    PubMed Central

    Bentler, Ruth; Walker, Elizabeth; McCreery, Ryan; Arenas, Richard M.; Roush, Patricia

    2015-01-01

    Objectives The research questions of this study were: (1) Are children using nonlinear frequency compression (NLFC) in their hearing aids getting better access to the speech signal than children using conventional processing schemes? The authors hypothesized that children whose hearing aids provided wider input bandwidth would have more access to the speech signal, as measured by an adaptation of the Speech Intelligibility Index, and (2) are speech and language skills different for children who have been fit with the two different technologies; if so, in what areas? The authors hypothesized that if the children were getting increased access to the speech signal as a result of their NLFC hearing aids (question 1), it would be possible to see improved performance in areas of speech production, morphosyntax, and speech perception compared with the group with conventional processing. Design Participants included 66 children with hearing loss recruited as part of a larger multisite National Institutes of Health–funded study, Outcomes for Children with Hearing Loss, designed to explore the developmental outcomes of children with mild to severe hearing loss. For the larger study, data on communication, academic and psychosocial skills were gathered in an accelerated longitudinal design, with entry into the study between 6 months and 7 years of age. Subjects in this report consisted of 3-, 4-, and 5-year-old children recruited at the North Carolina test site. All had at least 6 months of current hearing aid usage with their NLFC or conventional amplification. Demographic characteristics were compared at the three age levels as well as audibility and speech/language outcomes; speech-perception scores were compared for the 5-year-old groups. Results Results indicate that the audibility provided did not differ between the technology options. As a result, there was no difference between groups on speech or language outcome measures at 4 or 5 years of age, and no

  13. Nonlinear frequency compression in hearing aids: impact on speech and language development.

    PubMed

    Bentler, Ruth; Walker, Elizabeth; McCreery, Ryan; Arenas, Richard M; Roush, Patricia

    2014-01-01

    The research questions of this study were: (1) Are children using nonlinear frequency compression (NLFC) in their hearing aids getting better access to the speech signal than children using conventional processing schemes? The authors hypothesized that children whose hearing aids provided wider input bandwidth would have more access to the speech signal, as measured by an adaptation of the Speech Intelligibility Index, and (2) are speech and language skills different for children who have been fit with the two different technologies; if so, in what areas? The authors hypothesized that if the children were getting increased access to the speech signal as a result of their NLFC hearing aids (question 1), it would be possible to see improved performance in areas of speech production, morphosyntax, and speech perception compared with the group with conventional processing. Participants included 66 children with hearing loss recruited as part of a larger multisite National Institutes of Health-funded study, Outcomes for Children with Hearing Loss, designed to explore the developmental outcomes of children with mild to severe hearing loss. For the larger study, data on communication, academic and psychosocial skills were gathered in an accelerated longitudinal design, with entry into the study between 6 months and 7 years of age. Subjects in this report consisted of 3-, 4-, and 5-year-old children recruited at the North Carolina test site. All had at least 6 months of current hearing aid usage with their NLFC or conventional amplification. Demographic characteristics were compared at the three age levels as well as audibility and speech/language outcomes; speech-perception scores were compared for the 5-year-old groups. Results indicate that the audibility provided did not differ between the technology options. As a result, there was no difference between groups on speech or language outcome measures at 4 or 5 years of age, and no impact on speech perception

  14. Dysarthria and broader motor speech deficits in Dravet syndrome.

    PubMed

    Turner, Samantha J; Brown, Amy; Arpone, Marta; Anderson, Vicki; Morgan, Angela T; Scheffer, Ingrid E

    2017-02-21

    To analyze the oral motor, speech, and language phenotype in 20 children and adults with Dravet syndrome (DS) associated with mutations in SCN1A. Fifteen verbal and 5 minimally verbal DS patients with SCN1A mutations (aged 15 months to 28 years) underwent a tailored assessment battery. Speech was characterized by imprecise articulation; abnormal nasal resonance, voice, and pitch; and prosody errors. Half of the verbal patients had moderately to severely impaired conversational speech intelligibility. Oral motor impairment, motor planning/programming difficulties, and poor postural control were typical. Nonverbal individuals had intentional communication. Cognitive skills varied markedly, with intellectual functioning ranging from the low average range to severe intellectual disability. Language impairment was congruent with cognition. We describe a distinctive speech, language, and oral motor phenotype in children and adults with DS associated with mutations in SCN1A. Recognizing this phenotype will guide therapeutic intervention in patients with DS. © 2017 American Academy of Neurology.

  15. Dysarthria and broader motor speech deficits in Dravet syndrome

    PubMed Central

    Turner, Samantha J.; Brown, Amy; Arpone, Marta; Anderson, Vicki; Morgan, Angela T.

    2017-01-01

    Objective: To analyze the oral motor, speech, and language phenotype in 20 children and adults with Dravet syndrome (DS) associated with mutations in SCN1A. Methods: Fifteen verbal and 5 minimally verbal DS patients with SCN1A mutations (aged 15 months to 28 years) underwent a tailored assessment battery. Results: Speech was characterized by imprecise articulation; abnormal nasal resonance, voice, and pitch; and prosody errors. Half of the verbal patients had moderately to severely impaired conversational speech intelligibility. Oral motor impairment, motor planning/programming difficulties, and poor postural control were typical. Nonverbal individuals had intentional communication. Cognitive skills varied markedly, with intellectual functioning ranging from the low average range to severe intellectual disability. Language impairment was congruent with cognition. Conclusions: We describe a distinctive speech, language, and oral motor phenotype in children and adults with DS associated with mutations in SCN1A. Recognizing this phenotype will guide therapeutic intervention in patients with DS. PMID:28148630

  16. Improving Intelligibility: Guided Reflective Journals in Action

    ERIC Educational Resources Information Center

    Lear, Emmaline L.

    2014-01-01

    This study explores the effectiveness of guided reflective journals to improve intelligibility in a Japanese higher educational context. Based on qualitative and quantitative methods, the paper evaluates changes in speech over the duration of one semester. In particular, this study focuses on changes in prosodic features such as stress, intonation…

  17. Phonological processes in the speech of school-age children with hearing loss: Comparisons with children with normal hearing.

    PubMed

    Asad, Areej Nimer; Purdy, Suzanne C; Ballard, Elaine; Fairgray, Liz; Bowen, Caroline

    2018-04-27

    In this descriptive study, phonological processes were examined in the speech of children aged 5;0-7;6 (years; months) with mild to profound hearing loss using hearing aids (HAs) and cochlear implants (CIs), in comparison to their peers. A second aim was to compare phonological processes of HA and CI users. Children with hearing loss (CWHL, N = 25) were compared to children with normal hearing (CWNH, N = 30) with similar age, gender, linguistic, and socioeconomic backgrounds. Speech samples obtained from a list of 88 words, derived from three standardized speech tests, were analyzed using the CASALA (Computer Aided Speech and Language Analysis) program to evaluate participants' phonological systems, based on lax (a process appeared at least twice in the speech of at least two children) and strict (a process appeared at least five times in the speech of at least two children) counting criteria. Developmental phonological processes were eliminated in the speech of younger and older CWNH, while eleven developmental phonological processes persisted in the speech of both age groups of CWHL. CWHL showed a trend in age of elimination similar to that of CWNH, but at a slower rate. Children with HAs and CIs produced similar phonological processes. Final consonant deletion, weak syllable deletion, backing, and glottal replacement were present in the speech of HA users, affecting their overall speech intelligibility. Developmental and non-developmental phonological processes persist in the speech of children with mild to profound hearing loss compared to their peers with typical hearing. The findings indicate that it is important for clinicians to consider phonological assessment in pre-school CWHL and the use of evidence-based speech therapy in order to reduce non-developmental and non-age-appropriate developmental processes, thereby enhancing their speech intelligibility. Copyright © 2018 Elsevier Inc. All rights reserved.
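
    The lax and strict counting criteria above are straightforward to state programmatically. A minimal sketch, assuming a hypothetical tally of process occurrences per child (the process names, counts, and data layout are invented; CASALA's actual output format is not reproduced):

        # process -> {child_id: number of occurrences in that child's sample}
        tallies = {
            "final consonant deletion": {"c01": 6, "c02": 7, "c03": 1},
            "gliding":                  {"c01": 2, "c02": 3},
            "backing":                  {"c01": 1, "c03": 1},
        }

        def meets_criterion(per_child, min_count, min_children=2):
            """A process counts if at least `min_children` children each
            produce it at least `min_count` times."""
            return sum(c >= min_count for c in per_child.values()) >= min_children

        lax    = [p for p, t in tallies.items() if meets_criterion(t, 2)]  # at least twice
        strict = [p for p, t in tallies.items() if meets_criterion(t, 5)]  # at least five times
        print(lax)     # ['final consonant deletion', 'gliding']
        print(strict)  # ['final consonant deletion']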

  18. Influence of musical training on understanding voiced and whispered speech in noise.

    PubMed

    Ruggles, Dorea R; Freyman, Richard L; Oxenham, Andrew J

    2014-01-01

    This study tested the hypothesis that the previously reported advantage of musicians over non-musicians in understanding speech in noise arises from more efficient or robust coding of periodic voiced speech, particularly in fluctuating backgrounds. Speech intelligibility was measured in listeners with extensive musical training, and in those with very little musical training or experience, using normal (voiced) or whispered (unvoiced) grammatically correct nonsense sentences in noise that was spectrally shaped to match the long-term spectrum of the speech, and was either continuous or gated with a 16-Hz square wave. Performance was also measured in clinical speech-in-noise tests and in pitch discrimination. Musicians exhibited enhanced pitch discrimination, as expected. However, no systematic or statistically significant advantage for musicians over non-musicians was found in understanding either voiced or whispered sentences in either continuous or gated noise. Musicians also showed no statistically significant advantage in the clinical speech-in-noise tests. Overall, the results provide no evidence for a significant difference between young adult musicians and non-musicians in their ability to understand speech in noise.

  19. Audibility-based predictions of speech recognition for children and adults with normal hearing.

    PubMed

    McCreery, Ryan W; Stelmachowicz, Patricia G

    2011-12-01

    This study investigated the relationship between audibility and predictions of speech recognition for children and adults with normal hearing. The Speech Intelligibility Index (SII) is used to quantify the audibility of speech signals and can be applied to transfer functions to predict speech recognition scores. Although the SII is used clinically with children, relatively few studies have evaluated SII predictions of children's speech recognition directly. Children have required more audibility than adults to reach maximum levels of speech understanding in previous studies. Furthermore, children may require greater bandwidth than adults for optimal speech understanding, which could influence frequency-importance functions used to calculate the SII. Speech recognition was measured for 116 children and 19 adults with normal hearing. Stimulus bandwidth and background noise level were varied systematically in order to evaluate speech recognition as predicted by the SII and derive frequency-importance functions for children and adults. Results suggested that children required greater audibility to reach the same level of speech understanding as adults. However, differences in performance between adults and children did not vary across frequency bands. © 2011 Acoustical Society of America
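
    At its core, the SII is an importance-weighted sum of band audibilities: SII = Σ I_i · A_i, where I_i is the importance weight of band i and A_i is the audible fraction of the speech dynamic range in that band. The sketch below illustrates only this arithmetic; the five bands, weights, and levels are invented for the example and are not the standardized ANSI S3.5 values or procedure.

        import numpy as np

        bands_hz   = [250, 500, 1000, 2000, 4000]
        importance = np.array([0.10, 0.15, 0.25, 0.30, 0.20])  # weights sum to 1
        speech_db  = np.array([55, 58, 52, 48, 42])  # band speech peak levels (dB SPL)
        masker_db  = np.array([45, 50, 50, 45, 40])  # noise or threshold, whichever is higher

        def band_audibility(speech, masker, dyn_range=30.0):
            # Audible fraction of the assumed 30 dB speech dynamic range, in [0, 1].
            return np.clip((speech - masker) / dyn_range, 0.0, 1.0)

        sii = float(np.sum(importance * band_audibility(speech_db, masker_db)))
        print(f"SII = {sii:.2f}")   # higher values = more of the speech is audible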

  20. Perceptual Speech Assessment After Anterior Maxillary Distraction in Patients With Cleft Maxillary Hypoplasia.

    PubMed

    Richardson, Sunil; Seelan, Nikkie S; Selvaraj, Dhivakar; Khandeparker, Rakshit V; Gnanamony, Sangeetha

    2016-06-01

    To assess speech outcomes after anterior maxillary distraction (AMD) in patients with cleft-related maxillary hypoplasia. Fifty-eight patients at least 10 years old with cleft-related maxillary hypoplasia were included in this study irrespective of gender, type of cleft lip and palate, and amount of required advancement. AMD was carried out in all patients using a tooth-borne palatal distractor by a single oral and maxillofacial surgeon. Perceptual speech assessment was performed by 2 speech language pathologists preoperatively, before placement of the distractor device, and 6 months postoperatively using the scoring system of Perkins et al (Plast Reconstr Surg 116:72, 2005); the system evaluates velopharyngeal insufficiency (VPI), resonance, nasal air emission, articulation errors, and intelligibility. The data obtained were tabulated and subjected to statistical analysis using Wilcoxon signed rank test. A P value less than .05 was considered significant. Eight patients were lost to follow-up. At 6-month follow-up, improvements of 62% (n = 31), 64% (n = 32), 50% (n = 25), 68% (n = 34), and 70% (n = 35) in VPI, resonance, nasal air emission, articulation, and intelligibility, respectively, were observed, with worsening of all parameters in 1 patient (2%). The results for all tested parameters were highly significant (P ≤ .001). AMD offers a substantial improvement in speech for all 5 parameters of perceptual speech assessment. Copyright © 2016 The American Association of Oral and Maxillofacial Surgeons. Published by Elsevier Inc. All rights reserved.

  1. Human neuromagnetic steady-state responses to amplitude-modulated tones, speech, and music.

    PubMed

    Lamminmäki, Satu; Parkkonen, Lauri; Hari, Riitta

    2014-01-01

    Auditory steady-state responses that can be elicited by various periodic sounds inform about subcortical and early cortical auditory processing. Steady-state responses to amplitude-modulated pure tones have been used to scrutinize binaural interaction by frequency-tagging the two ears' inputs at different frequencies. Unlike pure tones, speech and music are physically very complex, as they include many frequency components, pauses, and large temporal variations. To examine the utility of magnetoencephalographic (MEG) steady-state fields (SSFs) in the study of early cortical processing of complex natural sounds, the authors tested the extent to which amplitude-modulated speech and music can elicit reliable SSFs. MEG responses were recorded to 90-s-long binaural tones, speech, and music, amplitude-modulated at 41.1 Hz at four different depths (25, 50, 75, and 100%). The subjects were 11 healthy, normal-hearing adults. MEG signals were averaged in phase with the modulation frequency, and the sources of the resulting SSFs were modeled by current dipoles. After the MEG recording, intelligibility of the speech, musical quality of the music stimuli, naturalness of music and speech stimuli, and the perceived deterioration caused by the modulation were evaluated on visual analog scales. The perceived quality of the stimuli decreased as a function of increasing modulation depth, more strongly for music than speech; yet, all subjects considered the speech intelligible even at the 100% modulation. SSFs were the strongest to tones and the weakest to speech stimuli; the amplitudes increased with increasing modulation depth for all stimuli. SSFs to tones were reliably detectable at all modulation depths (in all subjects in the right hemisphere, in 9 subjects in the left hemisphere) and to music stimuli at 50 to 100% depths, whereas speech usually elicited clear SSFs only at 100% depth. The hemispheric balance of SSFs was toward the right hemisphere for tones and speech, whereas

  2. Evaluation of speech outcomes using English version of the Speech Handicap Index in a cohort of head and neck cancer patients.

    PubMed

    Dwivedi, Raghav C; St Rose, Suzanne; Chisholm, Edward J; Bisase, Brian; Amen, Furrat; Nutting, Christopher M; Clarke, Peter M; Kerawala, Cyrus J; Rhys-Evans, Peter H; Harrington, Kevin J; Kazi, Rehan

    2012-06-01

    The aim of this study was to explore post-treatment speech impairments using the English version of the Speech Handicap Index (SHI), the first speech-specific questionnaire, in a cohort of oral cavity (OC) and oropharyngeal (OP) cancer patients. Sixty-three consecutive OC and OP cancer patients in follow-up participated in this study. Descriptive analyses have been presented as percentages, while the Mann-Whitney U-test and Kruskal-Wallis test have been used for the quantitative variables. Statistical Package for the Social Sciences version 15 statistical software (SPSS Inc., Chicago, IL) was used for the statistical analyses. Over a third (36.1%) of patients reported their speech as either average or bad. Speech intelligibility and articulation were the main speech concerns for 58.8% and 52.9% of OC and 31.6% and 34.2% of OP cancer patients, respectively, while feeling incompetent and being less outgoing were the speech-related psychosocial concerns for 64.7% and 23.5% of OC and 15.8% and 18.4% of OP cancer patients, respectively. Worse speech outcomes were noted for oral tongue and base of tongue cancers vs. tonsillar cancers, with mean (SD) values of 56.7 (31.3) and 52.0 (38.4) vs. 10.9 (14.8) (P<0.001), and for late vs. early T stage cancers, 65.0 (29.9) vs. 29.3 (32.7) (P<0.005). The English version of the SHI is a reliable, valid and useful tool for the evaluation of speech in HNC patients. Over one-third of OC and OP cancer patients reported speech problems in their day-to-day life. Advanced T-stage tumors affecting the oral tongue or base of tongue are particularly associated with poor speech outcomes. Copyright © 2012 Elsevier Ltd. All rights reserved.

  3. Artificial intelligence, expert systems, computer vision, and natural language processing

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1984-01-01

    An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.

  4. Validation of the Intelligibility in Context Scale for Jamaican Creole-Speaking Preschoolers.

    PubMed

    Washington, Karla N; McDonald, Megan M; McLeod, Sharynne; Crowe, Kathryn; Devonish, Hubert

    2017-08-15

    To describe validation of the Intelligibility in Context Scale (ICS; McLeod, Harrison, & McCormack, 2012a) and ICS-Jamaican Creole (ICS-JC; McLeod, Harrison, & McCormack, 2012b) in a sample of typically developing 3- to 6-year-old Jamaicans. One hundred forty-five preschooler-parent dyads participated in the study. Parents completed the 7-item ICS (n = 145) and ICS-JC (n = 98) to rate children's speech intelligibility (5-point scale) across communication partners (parents, immediate family, extended family, friends, acquaintances, strangers). Preschoolers completed the Diagnostic Evaluation of Articulation and Phonology (DEAP; Dodd, Hua, Crosbie, Holm, & Ozanne, 2006) in English and Jamaican Creole to establish speech-sound competency. For this sample, we examined validity and reliability (interrater, test-retest, internal consistency) evidence using measures of speech-sound production: (a) percentage of consonants correct, (b) percentage of vowels correct, and (c) percentage of phonemes correct. ICS and ICS-JC ratings showed preschoolers were always (5) to usually (4) understood across communication partners (ICS, M = 4.43; ICS-JC, M = 4.50). Both tools demonstrated excellent internal consistency (α = .91) and high interrater and test-retest reliability. Significant correlations between the two tools and between each measure and language-specific percentage of consonants correct, percentage of vowels correct, and percentage of phonemes correct provided criterion-validity evidence. A positive correlation between the ICS and age further strengthened validity evidence for that measure. Both tools show promising evidence of reliability and validity in describing functional speech intelligibility for this group of typically developing Jamaican preschoolers.
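
    Percentage of consonants correct (PCC) and its companion measures are simple proportions over aligned target-production transcriptions. A minimal sketch under simplifying assumptions (a toy consonant inventory and a hand-aligned example; real scoring handles insertions and dialectal variants):

        def pcc(aligned_pairs, consonants):
            """Percentage of target consonants produced correctly."""
            scored = [(t, p) for t, p in aligned_pairs if t in consonants]
            correct = sum(t == p for t, p in scored)
            return 100.0 * correct / len(scored)

        CONSONANTS = set("p b t d k g m n s z f v".split())
        # (target, produced) pairs for "stop" realized as "top": /s/ deleted.
        aligned = [("s", ""), ("t", "t"), ("o", "o"), ("p", "p")]
        print(round(pcc(aligned, CONSONANTS), 1))   # 66.7 -> 2 of 3 consonants correct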

  5. An overview of artificial intelligence and robotics. Volume 1: Artificial intelligence. Part B: Applications

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1983-01-01

    Artificial Intelligence (AI) is an emerging technology that has recently attracted considerable attention. Many applications are now under development. This report, Part B of a three part report on AI, presents overviews of the key application areas: Expert Systems, Computer Vision, Natural Language Processing, Speech Interfaces, and Problem Solving and Planning. The basic approaches to such systems, the state-of-the-art, existing systems and future trends and expectations are covered.

  6. Long-Term Follow-Up Study of Young Adults Treated for Unilateral Complete Cleft Lip, Alveolus, and Palate by a Treatment Protocol Including Two-Stage Palatoplasty: Speech Outcomes

    PubMed Central

    Bittermann, Dirk; Janssen, Laura; Bittermann, Gerhard Koendert Pieter; Boonacker, Chantal; Haverkamp, Sarah; de Wilde, Hester; Van Der Heul, Marise; Specken, Tom FJMC; Koole, Ron; Kon, Moshe; Breugem, Corstiaan Cornelis; Mink van der Molen, Aebele Barber

    2017-01-01

    Background No consensus exists on the optimal treatment protocol for orofacial clefts or the optimal timing of cleft palate closure. This study investigated factors influencing speech outcomes after two-stage palate repair in adults with a non-syndromal complete unilateral cleft lip and palate (UCLP). Methods This was a retrospective analysis of adult patients with a UCLP who underwent two-stage palate closure and were treated at our tertiary cleft centre. Patients ≥17 years of age were invited for a final speech assessment. Their medical history was obtained from their medical files, and speech outcomes were assessed by a speech pathologist during the follow-up consultation. Results Forty-eight patients were included in the analysis, with a mean age of 21 years (standard deviation, 3.4 years). Their mean age at the time of hard and soft palate closure was 3 years and 8.0 months, respectively. In 40% of the patients, a pharyngoplasty was performed. On a 5-point intelligibility scale, 84.4% received a score of 1 or 2; meaning that their speech was intelligible. We observed a significant correlation between intelligibility scores and the incidence of articulation errors (P<0.001). In total, 36% showed mild to moderate hypernasality during the speech assessment, and 11%–17% of the patients exhibited increased nasalance scores, assessed through nasometry. Conclusions The present study describes long-term speech outcomes after two-stage palatoplasty with hard palate closure at a mean age of 3 years old. We observed moderate long-term intelligibility scores, a relatively high incidence of persistent hypernasality, and a high pharyngoplasty incidence. PMID:28573094

  7. A method for determining internal noise criteria based on practical speech communication applied to helicopters

    NASA Technical Reports Server (NTRS)

    Sternfeld, H., Jr.; Doyle, L. B.

    1978-01-01

    The relationship between the internal noise environment of helicopters and the ability of personnel to understand commands and instructions was studied. A test program was conducted to relate speech intelligibility to a standard measurement called the Articulation Index. An acoustical simulator was used to provide noise environments typical of Army helicopters. Speech materials (command sentences and phonetically balanced word lists) were presented at several voice levels in each helicopter environment. Recommended helicopter internal noise criteria, based on speech communication, were derived, and the effectiveness of hearing protection devices was evaluated.

  8. Perception of speech in reverberant conditions using AM-FM cochlear implant simulation.

    PubMed

    Drgas, Szymon; Blaszak, Magdalena A

    2010-10-01

    This study assessed the effects of speech misidentification and cognitive processing errors in normal-hearing adults listening to degraded auditory input signals simulating cochlear implants in reverberation conditions. Three variables were controlled: number of vocoder channels (six and twelve), instantaneous frequency change rate (none, 50, 400 Hz), and enclosures (different reverberation conditions). The analyses were made on the basis of: (a) nonsense word recognition scores for eight young normal-hearing listeners, (b) 'ease of listening' based on the time of response, and (c) the subjective measure of difficulty. The maximum speech intelligibility score in cochlear implant simulation was 70% for non-reverberant conditions with a 12-channel vocoder and changes of instantaneous frequency limited to 400 Hz. In the presence of reflections, word misidentification was about 10-20 percentage points higher. There was little difference between the 50 and 400 Hz frequency modulation cut-offs for the 12-channel vocoder; however, in the case of six channels this difference was more significant. The results of the experiment suggest that the information other than F0 that is carried by FM can be sufficient to improve speech intelligibility in real-world conditions.
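
    For readers unfamiliar with vocoder-based cochlear implant simulation, the sketch below implements the classic noise-excited envelope vocoder that AM-FM simulations extend; the instantaneous-frequency (FM) component manipulated in this study is omitted, and the channel count, band edges, and filter order are assumptions.

        import numpy as np
        from scipy.signal import butter, filtfilt, hilbert

        def vocode(x, sr, n_channels=6, lo=100.0, hi=7000.0):
            """Noise-excited channel vocoder: per-band envelope x band-limited noise."""
            edges = np.geomspace(lo, hi, n_channels + 1)   # log-spaced band edges
            rng = np.random.default_rng(0)
            out = np.zeros_like(x)
            for f1, f2 in zip(edges[:-1], edges[1:]):
                b, a = butter(3, [f1, f2], btype="band", fs=sr)
                band = filtfilt(b, a, x)
                env = np.abs(hilbert(band))                # temporal envelope
                carrier = filtfilt(b, a, rng.standard_normal(len(x)))
                out += env * carrier
            return out / (np.max(np.abs(out)) + 1e-12)     # normalize peak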

  9. Auditory and Non-Auditory Contributions for Unaided Speech Recognition in Noise as a Function of Hearing Aid Use

    PubMed Central

    Gieseler, Anja; Tahden, Maike A. S.; Thiel, Christiane M.; Wagener, Kirsten C.; Meis, Markus; Colonius, Hans

    2017-01-01

    Differences in understanding speech in noise among hearing-impaired individuals cannot be explained entirely by hearing thresholds alone, suggesting the contribution of other factors beyond standard auditory ones as derived from the audiogram. This paper reports two analyses addressing individual differences in the explanation of unaided speech-in-noise performance among n = 438 elderly hearing-impaired listeners (mean = 71.1 ± 5.8 years). The main analysis was designed to identify clinically relevant auditory and non-auditory measures for speech-in-noise prediction using auditory (audiogram, categorical loudness scaling) and cognitive tests (verbal-intelligence test, screening test of dementia), as well as questionnaires assessing various self-reported measures (health status, socio-economic status, and subjective hearing problems). Using stepwise linear regression analysis, 62% of the variance in unaided speech-in-noise performance was explained, with the measures Pure-tone average (PTA), Age, and Verbal intelligence emerging as the three most important predictors. In the complementary analysis, those individuals with the same hearing loss profile were separated into hearing aid users (HAU) and non-users (NU), and were then compared regarding potential differences in the test measures and in explaining unaided speech-in-noise recognition. The groupwise comparisons revealed significant differences in auditory measures and self-reported subjective hearing problems, while no differences in the cognitive domain were found. Furthermore, groupwise regression analyses revealed that Verbal intelligence had a predictive value in both groups, whereas Age and PTA emerged as significant only in the NU group. PMID:28270784

  10. Intelligence and Schooling. Fueling the Education Explosion: Proceedings of Conference 2 (Cleveland, Ohio, November 17-18, 1983).

    ERIC Educational Resources Information Center

    Gardner, Mary, Ed.; Reed-Mundell, Charlene, Ed.

    These proceedings contain presentations from a conference whose major topics were real-world intelligence, artificial intelligence, and linkage between the education and corporate sectors. "People, Perspectives...Potential and Possibilities" (Elyse S. Fleming), which was the conference's closing speech, briefly summarizes the information…

  11. The Cleft Care UK study. Part 4: perceptual speech outcomes

    PubMed Central

    Sell, D; Mildinhall, S; Albery, L; Wills, A K; Sandy, J R; Ness, A R

    2015-01-01

    Structured Abstract Objectives To describe the perceptual speech outcomes from the Cleft Care UK (CCUK) study and compare them to the 1998 Clinical Standards Advisory Group (CSAG) audit. Setting and sample population A cross-sectional study of 248 children born with complete unilateral cleft lip and palate, between 1 April 2005 and 31 March 2007 who underwent speech assessment. Materials and methods Centre-based specialist speech and language therapists (SLT) took speech audio–video recordings according to nationally agreed guidelines. Two independent listeners undertook the perceptual analysis using the CAPS-A Audit tool. Intra- and inter-rater reliability were tested. Results For each speech parameter of intelligibility/distinctiveness, hypernasality, palatal/palatalization, backed to velar/uvular, glottal, weak and nasalized consonants, and nasal realizations, there was strong evidence that speech outcomes were better in the CCUK children compared to CSAG children. The parameters which did not show improvement were nasal emission, nasal turbulence, hyponasality and lateral/lateralization. Conclusion These results suggest that centralization of cleft care into high volume centres has resulted in improvements in UK speech outcomes in five-year-olds with unilateral cleft lip and palate. This may be associated with the development of a specialized workforce. Nevertheless, there still remains a group of children with significant difficulties at school entry. PMID:26567854

  12. The Cleft Care UK study. Part 4: perceptual speech outcomes.

    PubMed

    Sell, D; Mildinhall, S; Albery, L; Wills, A K; Sandy, J R; Ness, A R

    2015-11-01

    To describe the perceptual speech outcomes from the Cleft Care UK (CCUK) study and compare them to the 1998 Clinical Standards Advisory Group (CSAG) audit. A cross-sectional study of 248 children born with complete unilateral cleft lip and palate, between 1 April 2005 and 31 March 2007 who underwent speech assessment. Centre-based specialist speech and language therapists (SLT) took speech audio-video recordings according to nationally agreed guidelines. Two independent listeners undertook the perceptual analysis using the CAPS-A Audit tool. Intra- and inter-rater reliability were tested. For each speech parameter of intelligibility/distinctiveness, hypernasality, palatal/palatalization, backed to velar/uvular, glottal, weak and nasalized consonants, and nasal realizations, there was strong evidence that speech outcomes were better in the CCUK children compared to CSAG children. The parameters which did not show improvement were nasal emission, nasal turbulence, hyponasality and lateral/lateralization. These results suggest that centralization of cleft care into high volume centres has resulted in improvements in UK speech outcomes in five-year-olds with unilateral cleft lip and palate. This may be associated with the development of a specialized workforce. Nevertheless, there still remains a group of children with significant difficulties at school entry. © The Authors. Orthodontics & Craniofacial Research Published by John Wiley & Sons Ltd.

  13. Clear Speech Variants: An Acoustic Study in Parkinson's Disease.

    PubMed

    Lam, Jennifer; Tjaden, Kris

    2016-08-01

    The authors investigated how different variants of clear speech affect segmental and suprasegmental acoustic measures of speech in speakers with Parkinson's disease and a healthy control group. A total of 14 participants with Parkinson's disease and 14 control participants served as speakers. Each speaker produced 18 different sentences selected from the Sentence Intelligibility Test (Yorkston & Beukelman, 1996). All speakers produced stimuli in 4 speaking conditions (habitual, clear, overenunciate, and hearing impaired). Segmental acoustic measures included vowel space area and first moment (M1) coefficient difference measures for consonant pairs. Second formant slope of diphthongs and measures of vowel and fricative durations were also obtained. Suprasegmental measures included fundamental frequency, sound pressure level, and articulation rate. For the majority of adjustments, all variants of clear speech instruction differed from the habitual condition. The overenunciate condition elicited the greatest magnitude of change for segmental measures (vowel space area, vowel durations) and the slowest articulation rates. The hearing impaired condition elicited the greatest fricative durations and suprasegmental adjustments (fundamental frequency, sound pressure level). Findings have implications for a model of speech production for healthy speakers as well as for speakers with dysarthria. Findings also suggest that particular clear speech instructions may target distinct speech subsystems.

  14. The effect of instantaneous input dynamic range setting on the speech perception of children with the nucleus 24 implant.

    PubMed

    Davidson, Lisa S; Skinner, Margaret W; Holstad, Beth A; Fears, Beverly T; Richter, Marie K; Matusofsky, Margaret; Brenner, Christine; Holden, Timothy; Birath, Amy; Kettel, Jerrica L; Scollie, Susan

    2009-06-01

    The purpose of this study was to examine the effects of a wider instantaneous input dynamic range (IIDR) setting on speech perception and comfort in quiet and noise for children wearing the Nucleus 24 implant system and the Freedom speech processor. In addition, children's ability to understand soft and conversational-level speech in relation to aided sound-field thresholds was examined. Thirty children (age, 7 to 17 years) with the Nucleus 24 cochlear implant system and the Freedom speech processor with two different IIDR settings (30 versus 40 dB) were tested on the Consonant-Nucleus-Consonant (CNC) word test at 50 and 60 dB SPL, the Bamford-Kowal-Bench Speech in Noise Test, and a loudness rating task for four-talker speech noise. Aided thresholds for frequency-modulated tones, narrowband noise, and recorded Ling sounds were obtained with the two IIDRs and examined in relation to CNC scores at 50 dB SPL. Speech Intelligibility Indices were calculated using the long-term average speech spectrum of the CNC words at 50 dB SPL measured at each test site and aided thresholds. Group mean CNC scores at 50 dB SPL with the 40 IIDR were significantly higher (p < 0.001) than with the 30 IIDR. Group mean CNC scores at 60 dB SPL, loudness ratings, and the signal-to-noise ratios for 50% correct (SNR-50) on the Bamford-Kowal-Bench Speech in Noise Test were not significantly different for the two IIDRs. Significantly improved aided thresholds at 250 to 6000 Hz as well as higher Speech Intelligibility Indices afforded improved audibility for speech presented at soft levels (50 dB SPL). These results indicate that an increased IIDR provides improved word recognition for soft levels of speech without compromising comfort at higher levels of speech sounds or sentence recognition in noise.

  15. Speech recognition: Acoustic-phonetic knowledge acquisition and representation

    NASA Astrophysics Data System (ADS)

    Zue, Victor W.

    1988-09-01

    The long-term research goal is to develop and implement speaker-independent continuous speech recognition systems. It is believed that the proper utilization of speech-specific knowledge is essential for such advanced systems. This research is thus directed toward the acquisition, quantification, and representation of acoustic-phonetic and lexical knowledge, and the application of this knowledge to speech recognition algorithms. In addition, we are exploring new speech recognition alternatives based on artificial intelligence and connectionist techniques. We developed a statistical model for predicting the acoustic realization of stop consonants in various positions in the syllable template. A unification-based grammatical formalism was developed for incorporating this model into the lexical access algorithm. We provided an information-theoretic justification for the hierarchical structure of the syllable template. We analyzed segmental durations for vowels and fricatives in continuous speech. Based on contextual information, we developed durational models for vowels and fricatives that account for over 70 percent of the variance, using data from multiple, unknown speakers. We rigorously evaluated the ability of human spectrogram readers to identify stop consonants spoken by many talkers and in a variety of phonetic contexts. Incorporating the declarative knowledge used by the readers, we developed a knowledge-based system for stop identification. We achieved system performance comparable to that of the readers.

  16. A variable rate speech compressor for mobile applications

    NASA Technical Reports Server (NTRS)

    Yeldener, S.; Kondoz, A. M.; Evans, B. G.

    1990-01-01

    One of the most promising speech coders at bit rates of 9.6 to 4.8 kbit/s is CELP. Code Excited Linear Prediction (CELP) has dominated the 9.6 to 4.8 kbit/s region during the past 3 to 4 years. Its setback, however, is its expensive implementation. As an alternative to CELP, Base-Band CELP (CELP-BB) was developed, which, as reported previously, produces good-quality speech comparable to CELP at a complexity implementable on a single chip. Its robustness was also improved to tolerate error rates up to 1.0% and maintain intelligibility up to 5.0% and beyond. Although CELP-BB produces good-quality speech at around 4.8 kbit/s, it has a fundamental problem when updating the pitch filter memory. A sub-optimal solution is proposed for this problem. Below 4.8 kbit/s, however, CELP-BB suffers from noticeable quantization noise as a result of the large vector dimensions used. Efficient representation of speech below 4.8 kbit/s is reported by introducing Sinusoidal Transform Coding (STC) to represent the LPC excitation, an approach called Sine Wave Excited LPC (SWELP). In this case, natural-sounding, good-quality synthetic speech is obtained at around 2.4 kbit/s.
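
    CELP and its descendants are built on linear predictive coding. As a minimal illustration of that LPC core only (the codebook excitation search that defines CELP is omitted), here is an autocorrelation-method Levinson-Durbin analysis/resynthesis sketch; the frame, model order, and test signal are invented for the example.

        import numpy as np
        from scipy.signal import lfilter

        def lpc(frame, order=10):
            """LPC coefficients a (a[0] = 1) via the Levinson-Durbin recursion,
            so that lfilter(a, 1, frame) yields the prediction residual."""
            r = np.correlate(frame, frame, mode="full")[len(frame) - 1:][: order + 1]
            a = np.zeros(order + 1)
            a[0] = 1.0
            err = r[0]
            for i in range(1, order + 1):
                k = -(r[i] + np.dot(a[1:i], r[1:i][::-1])) / err  # reflection coefficient
                a[1:i] = a[1:i] + k * a[1:i][::-1]
                a[i] = k
                err *= 1.0 - k * k
            return a

        # A noisy two-harmonic test frame (8 kHz, 30 ms).
        sr = 8000
        t = np.arange(240) / sr
        frame = np.sin(2 * np.pi * 150 * t) + 0.5 * np.sin(2 * np.pi * 300 * t)
        frame += 0.01 * np.random.default_rng(0).standard_normal(len(t))

        a = lpc(frame, order=10)
        residual = lfilter(a, [1.0], frame)      # inverse (analysis) filter
        resynth  = lfilter([1.0], a, residual)   # all-pole synthesis filter
        print(np.max(np.abs(frame - resynth)) < 1e-8)   # True: exact round trip

    A coder like CELP transmits the filter plus a compact code for the residual; the quality/bit-rate trade-offs described above come from how coarsely that excitation is represented.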

  17. The effect of sensorineural hearing loss and tinnitus on speech recognition over air and bone conduction military communications headsets.

    PubMed

    Manning, Candice; Mermagen, Timothy; Scharine, Angelique

    2017-06-01

    Military personnel are at risk for hearing loss due to noise exposure during deployment (USACHPPM, 2008). Despite mandated use of hearing protection, hearing loss and tinnitus are prevalent because of reluctance to use it. Bone conduction headsets can offer good speech intelligibility for normal-hearing (NH) listeners while allowing the ears to remain open in quiet environments and the use of hearing protection when needed. Those who suffer from tinnitus, the experience of perceiving a sound not produced by an external source, often show degraded speech recognition; however, it is unclear whether this is a result of decreased hearing sensitivity or increased distractibility (Moon et al., 2015). It has been suggested that the vibratory stimulation of a bone conduction headset might ameliorate the effects of tinnitus on speech perception; however, there is currently no research to support or refute this claim (Hoare et al., 2014). Speech recognition of words presented over air conduction and bone conduction headsets was measured for three groups of listeners: NH, sensorineural hearing-impaired, and/or tinnitus sufferers. Three speech-to-noise ratios (SNR = 0, -6, and -12 dB) were created by embedding speech items in pink noise. Better speech recognition performance was observed with the bone conduction headset regardless of hearing profile, and speech intelligibility was a function of SNR. Discussion will include study limitations and the implications of these findings for those serving in the military. Published by Elsevier B.V.
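
    Embedding speech in noise at a fixed SNR, as in the 0, -6, and -12 dB conditions above, amounts to scaling the noise relative to the speech RMS. A minimal sketch (the array names are placeholders, and pink-noise generation itself is omitted):

        import numpy as np

        def mix_at_snr(speech, noise, snr_db):
            """Return speech plus noise scaled so the mixture has the target SNR."""
            rms = lambda x: np.sqrt(np.mean(x ** 2))
            gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20.0))
            return speech + gain * noise

        # At -6 dB the scaled noise RMS is about twice the speech RMS,
        # since 20 * log10(1/2) is approximately -6 dB.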

  18. An Intelligibility Assessment of Toddlers with Cleft Lip and Palate Who Received and Did Not Receive Presurgical Infant Orthopedic Treatment.

    ERIC Educational Resources Information Center

    Konst, Emmy M.; Weersink-Braks, Hanny; Rietveld, Toni; Peters, Herman

    2000-01-01

    The influence of presurgical infant orthopedic treatment (PIO) on speech intelligibility was evaluated with 10 toddlers who used PIO during the first year of life and 10 who did not. Treated children were rated as exhibiting greater intelligibility; however, transcription data indicated there were no group differences in actual intelligibility.…

  19. Accent, Intelligibility, and the Role of the Listener: Perceptions of English-Accented German by Native German Speakers

    ERIC Educational Resources Information Center

    Hayes-Harb, Rachel; Watzinger-Tharp, Johanna

    2012-01-01

    We explore the relationship between accentedness and intelligibility, and investigate how listeners' beliefs about nonnative speech interact with their accentedness and intelligibility judgments. Native German speakers and native English learners of German produced German sentences, which were presented to 12 native German speakers in accentedness…

  20. Advancements in text-to-speech technology and implications for AAC applications

    NASA Astrophysics Data System (ADS)

    Syrdal, Ann K.

    2003-10-01

    Intelligibility was the initial focus in text-to-speech (TTS) research, since it is clearly a necessary condition for the application of the technology. Sufficiently high intelligibility (approximating human speech) has been achieved in the last decade by the better formant-based and concatenative TTS systems. This led to commercially available TTS systems for highly motivated users, particularly the blind and vocally impaired. Some unnatural qualities of TTS were exploited by these users, such as very fast speaking rates and altered pitch ranges for flagging relevant information. Recently, the focus in TTS research has turned to improving naturalness, so that synthetic speech sounds more human and less robotic. Unit selection approaches to concatenative synthesis have dramatically improved TTS quality, although at the cost of larger and more complex systems. This advancement in naturalness has made TTS technology more acceptable to the general public. The vocally impaired appreciate a more natural voice with which to represent themselves when communicating with others. Unit selection TTS does not achieve such high speaking rates as the earlier TTS systems, however, which is a disadvantage to some AAC device users. An important new research emphasis is to improve and increase the range of emotional expressiveness of TTS.

  1. Spectral and temporal changes to speech produced in the presence of energetic and informational maskers.

    PubMed

    Cooke, Martin; Lu, Youyi

    2010-10-01

    Talkers change the way they speak in noisy conditions. For energetic maskers, speech production changes are relatively well-understood, but less is known about how informational maskers such as competing speech affect speech production. The current study examines the effect of energetic and informational maskers on speech production by talkers speaking alone or in pairs. Talkers produced speech in quiet and in backgrounds of speech-shaped noise, speech-modulated noise, and competing speech. Relative to quiet, speech output level and fundamental frequency increased and spectral tilt flattened in proportion to the energetic masking capacity of the background. In response to modulated backgrounds, talkers were able to reduce substantially the degree of temporal overlap with the noise, with greater reduction for the competing speech background. Reduction in foreground-background overlap can be expected to lead to a release from both energetic and informational masking for listeners. Passive changes in speech rate, mean pause length or pause distribution cannot explain the overlap reduction, which appears instead to result from a purposeful process of listening while speaking. Talkers appear to monitor the background and exploit upcoming pauses, a strategy which is particularly effective for backgrounds containing intelligible speech.
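
    The paper's overlap measure is not reproduced here, but a simple frame-energy version conveys the idea: count the share of energetic speech frames that coincide with energetic background frames. A sketch under assumed framing and threshold parameters:

        import numpy as np

        def overlap_fraction(speech, background, sr, frame_ms=20, thresh_db=-40.0):
            """Fraction of energetic speech frames that overlap energetic
            background frames (a frame is 'energetic' if within 40 dB of
            that signal's own peak frame energy)."""
            n = int(sr * frame_ms / 1000)
            def active(x):
                f = x[: len(x) // n * n].reshape(-1, n)
                e = 10 * np.log10(np.mean(f ** 2, axis=1) + 1e-12)
                return e > e.max() + thresh_db
            s, b = active(speech), active(background)
            m = min(len(s), len(b))
            s, b = s[:m], b[:m]
            return float(np.mean(s & b) / max(float(np.mean(s)), 1e-12))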

  2. Use of listening strategies for the speech of individuals with dysarthria and cerebral palsy.

    PubMed

    Hustad, Katherine C; Dardis, Caitlin M; Kramper, Amy J

    2011-03-01

    This study examined listeners' endorsement of cognitive, linguistic, segmental, and suprasegmental strategies employed when listening to speakers with dysarthria. The study also examined whether strategy endorsement differed between listeners who earned the highest and lowest intelligibility scores. Speakers were eight individuals with dysarthria and cerebral palsy. Listeners were 80 individuals who transcribed speech stimuli and rated their use of each of 24 listening strategies on a 4-point scale. Results showed that cognitive and linguistic strategies were most highly endorsed. Use of listening strategies did not differ between listeners with the highest and lowest intelligibility scores. Results suggest that there may be a core of strategies common to listeners of speakers with dysarthria that may be supplemented by additional strategies, based on characteristics of the speaker and speech signal.

  3. Author's Rebuttal to Smits et al. (2018), "Comment on 'Sensitivity of the Speech Intelligibility Index to the Assumed Dynamic Range' by Jin et al. (2017)".

    PubMed

    Jin, In-Ki; Kates, James M; Arehart, Kathryn H

    2018-01-22

The purpose of this letter is to refute the comments of Smits, Goverts, and Versfeld (2018). Refutations are presented for each issue raised, including the fixed mathematical relationship between dynamic range (DR) and a fitting constant (Q value), deviating results for small DRs, and the determination of Speech Intelligibility Index (SII) model parameters. Although Smits et al. (2018) correctly identified several issues, their comments do not diminish the results of the original article (Jin, Kates, & Arehart, 2017) in providing new insights into the SII. Jin et al. (2017) clearly demonstrated the impact of language and DR on the SII, which was the main result of the study.
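    The disagreement turns on how the SII maps per-band signal-to-noise ratio onto audibility across an assumed dynamic range. The sketch below shows only that DR-dependent audibility step, omitting the standard's transfer functions and level corrections; the band SNRs and importance weights in the example are illustrative.

    ```python
    import numpy as np

    def sii(snr_db, band_importance, dyn_range=30.0):
        """Band-importance-weighted audibility: per-band SNR is mapped
        linearly onto [0, 1] across the assumed dynamic range
        (30 dB, i.e., -15 to +15 dB SNR, in the conventional formulation)."""
        snr_db = np.asarray(snr_db, dtype=float)
        audibility = np.clip((snr_db + dyn_range / 2.0) / dyn_range, 0.0, 1.0)
        return float(np.sum(np.asarray(band_importance) * audibility))

    # Example: six bands with equal importance and mixed SNRs.
    # Changing dyn_range shifts the result, which is the sensitivity at issue.
    print(sii([-20, -5, 0, 5, 10, 20], [1 / 6] * 6))
    ```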

  4. Overall intelligibility, articulation, resonance, voice and language in a child with Nager syndrome.

    PubMed

    Van Lierde, Kristiane M; Luyten, Anke; Mortier, Geert; Tijskens, Anouk; Bettens, Kim; Vermeersch, Hubert

    2011-02-01

The purpose of this study was to provide a description of the language and speech (intelligibility, voice, resonance, articulation) of a 7-year-old Dutch-speaking boy with Nager syndrome. To place these features in context, comparison was made with an age- and gender-matched child with a similar palatal or hearing problem. Language was tested with an age-appropriate instrument, namely the Dutch version of the Clinical Evaluation of Language Fundamentals. Regarding articulation, a phonetic inventory, phonetic analysis, and phonological process analysis were performed. A nominal scale with four categories was used to judge overall speech intelligibility. The voice and resonance assessment included videolaryngostroboscopy, perceptual evaluation, acoustic analysis, and nasometry. The most striking communication problems in this child were expressive and receptive language delay, moderately impaired speech intelligibility, the presence of phonetic and phonological disorders, resonance disorders, and a high-pitched voice. The explanation for this pattern of communication is not completely straightforward. The language and phonological impairments, present only in the child with Nager syndrome, are not part of a more general developmental delay. The resonance disorders can be related to the cleft palate but were not present in the child with the isolated cleft palate. One might assume that the cul-de-sac resonance, the much decreased mandibular movement, and the restricted tongue lifting are caused by the restricted jaw mobility and micrognathia. To what extent the suggested mandibular distraction osteogenesis in early childhood allows increased mandibular movement and better speech outcome with increased oral resonance is a subject for further research. According to the results of this study, speech and language management must focus on receptive and expressive language skills and linguistic conceptualization, correct phonetic placement, and the modification of…

  5. The Interaction of Temporal and Spectral Acoustic Information with Word Predictability on Speech Intelligibility

    NASA Astrophysics Data System (ADS)

    Shahsavarani, Somayeh Bahar

High-level, top-down information such as linguistic knowledge is a salient cortical resource that influences speech perception under most listening conditions. But are all listeners able to exploit these resources for speech facilitation to the same extent? Children with cochlear implants have been found to show different patterns of benefit from contextual information in speech perception compared with their normal-hearing peers. Previous studies have invoked non-acoustic factors such as linguistic and cognitive capabilities to account for this discrepancy. Given that the amount of acoustic information encoded and processed by the auditory nerves of listeners with cochlear implants differs from that of normal-hearing listeners, and even varies across individuals with cochlear implants, it is important to study the interaction of specific acoustic properties of the speech signal with contextual cues. This relationship has been largely neglected in previous research. In this dissertation, we aimed to explore how different acoustic dimensions interact to affect listeners' ability to combine top-down information with bottom-up information in speech perception, beyond the known effects of linguistic and cognitive capacities shown previously. Specifically, the present study investigated whether there were distinct context effects based on the resolution of spectral versus slowly varying temporal information in the perception of spectrally impoverished speech. To that end, two experiments were conducted. In both experiments, a noise-vocoding technique was adopted to generate spectrally degraded speech approximating the acoustic cues delivered to listeners with cochlear implants. Frequency resolution was manipulated by varying the number of frequency channels; temporal resolution was manipulated by low-pass filtering the amplitude envelope with varying cutoff frequencies. The stimuli were presented to normal-hearing native speakers of American English…
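    A minimal sketch of the noise-vocoding manipulation described above, assuming logarithmically spaced analysis bands and fourth-order Butterworth filters (the dissertation's exact band edges and filter settings are not reproduced here). Varying n_channels degrades spectral resolution; lowering env_cutoff degrades temporal-envelope resolution.

    ```python
    import numpy as np
    from scipy.signal import butter, sosfilt, hilbert

    def noise_vocode(x, fs, n_channels=8, env_cutoff=50.0):
        """Split speech into bands, low-pass each band's amplitude envelope,
        and re-impose the envelopes on band-limited noise carriers."""
        edges = np.logspace(np.log10(100.0), np.log10(0.45 * fs), n_channels + 1)
        env_sos = butter(4, env_cutoff, btype='low', fs=fs, output='sos')
        noise = np.random.randn(len(x))
        out = np.zeros(len(x))
        for lo, hi in zip(edges[:-1], edges[1:]):
            band_sos = butter(4, [lo, hi], btype='band', fs=fs, output='sos')
            band = sosfilt(band_sos, np.asarray(x, float))
            env = np.maximum(sosfilt(env_sos, np.abs(hilbert(band))), 0.0)
            out += env * sosfilt(band_sos, noise)  # envelope-modulated noise
        return out / (np.max(np.abs(out)) + 1e-12)
    ```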

  6. Five-year speech and language outcomes in children with cleft lip-palate.

    PubMed

    Prathanee, Benjamas; Pumnum, Tawitree; Seepuaham, Cholada; Jaiyong, Pechcharat

    2016-10-01

To investigate 5-year speech and language outcomes in children with cleft lip/palate (CLP). Thirty-eight children aged 4 years to 7 years 8 months were recruited for this study. Speech abilities, including articulation, resonance, voice, and intelligibility, were assessed based on the Thai Universal Parameters of Speech Outcomes. Language ability was assessed with the Language Screening Test. The findings revealed that children with clefts had speech and language delay, abnormal understandability, resonance abnormality, voice disturbance, and articulation defects at rates of 8.33% (1.75, 22.47), 50.00% (32.92, 67.08), 36.11% (20.82, 53.78), 30.56% (16.35, 48.11), and 94.44% (81.34, 99.32), respectively. Articulation errors were the most common speech and language defect in children with clefts, followed by abnormal understandability, resonance abnormality, and voice disturbance. These results should be of critical concern. Protocol review and early intervention programs are needed to improve speech outcomes. Copyright © 2016 European Association for Cranio-Maxillo-Facial Surgery. Published by Elsevier Ltd. All rights reserved.

  7. A Visual Cortical Network for Deriving Phonological Information from Intelligible Lip Movements.

    PubMed

    Hauswald, Anne; Lithari, Chrysa; Collignon, Olivier; Leonardelli, Elisa; Weisz, Nathan

    2018-05-07

Successful lip-reading requires a mapping from visual to phonological information [1]. Recently, visual and motor cortices have been implicated in tracking lip movements (e.g., [2]). It remains unclear, however, whether visuo-phonological mapping occurs already at the level of the visual cortex, that is, whether this structure tracks the acoustic signal in a functionally relevant manner. To elucidate this, we investigated how the cortex tracks (i.e., entrains to) absent acoustic speech signals carried by silent lip movements. Crucially, we contrasted the entrainment to unheard forward (intelligible) and backward (unintelligible) acoustic speech. We observed that the visual cortex exhibited stronger entrainment to the unheard forward acoustic speech envelope than to the unheard backward acoustic speech envelope. Supporting the notion of a visuo-phonological mapping process, this forward-backward difference in occipital entrainment was not present for actually observed lip movements. Importantly, the respective occipital region received more top-down input, especially from left premotor, primary motor, and somatosensory regions and, to a lesser extent, also from posterior temporal cortex. Strikingly, across participants, the extent of top-down modulation of the visual cortex stemming from these regions partially correlated with the strength of entrainment to the absent acoustic forward speech envelope, but not to present forward lip movements. Our findings demonstrate that a distributed cortical network, including key dorsal stream auditory regions [3-5], influences how the visual cortex shows sensitivity to the intelligibility of speech while tracking silent lip movements. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.

  8. Implementation of the Intelligent Voice System for Kazakh

    NASA Astrophysics Data System (ADS)

    Yessenbayev, Zh; Saparkhojayev, N.; Tibeyev, T.

    2014-04-01

Modern speech technologies are highly advanced and widely used in day-to-day applications. However, this mostly concerns the languages of well-developed countries, such as English, German, Japanese, and Russian. For Kazakh, the situation is less advanced, and research in this field is only starting to evolve. In this research- and application-oriented project, we introduce an intelligent voice system for the fast deployment of call centers and information desks supporting Kazakh speech. The demand for such a system is obvious given the country's large size and small population: landline and cell phones are often the only means of communication for distant villages and suburbs. The system features Kazakh speech recognition and synthesis modules as well as a web GUI for efficient dialog management. For speech recognition we use the CMU Sphinx engine, and for speech synthesis, MaryTTS. The web GUI is implemented in Java, enabling operators to quickly create and manage dialogs in a user-friendly graphical environment. The call routines are handled by Asterisk PBX and JBoss Application Server. The system supports technologies and protocols such as VoIP, VoiceXML, FastAGI, Java Speech API, and J2EE. For the speech recognition experiments we compiled and used the first Kazakh speech corpus, with utterances from 169 native speakers. The performance of the speech recognizer is 4.1% WER on isolated word recognition and 6.9% WER on clean continuous speech recognition tasks. The speech synthesis experiments include the training of male and female voices.
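    As a concrete illustration of one component, the snippet below requests synthesis from a running MaryTTS server through its standard HTTP /process endpoint. The endpoint and parameter names follow MaryTTS defaults, but this is not the project's actual integration code, and the Kazakh locale and voice assumed here would have to be built and installed separately.

    ```python
    import requests

    def mary_tts(text, locale="kk", voice=None,
                 url="http://localhost:59125/process"):
        """Return WAV bytes synthesized by a MaryTTS server."""
        params = {
            "INPUT_TEXT": text,
            "INPUT_TYPE": "TEXT",
            "OUTPUT_TYPE": "AUDIO",
            "AUDIO": "WAVE_FILE",
            "LOCALE": locale,  # hypothetical Kazakh locale; requires a custom-built voice
        }
        if voice is not None:
            params["VOICE"] = voice
        response = requests.get(url, params=params, timeout=10)
        response.raise_for_status()
        return response.content
    ```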

  9. EEG oscillations entrain their phase to high-level features of speech sound.

    PubMed

    Zoefel, Benedikt; VanRullen, Rufin

    2016-01-01

Phase entrainment of neural oscillations, the brain's adjustment to rhythmic stimulation, is a central component in recent theories of speech comprehension: the alignment between brain oscillations and speech sound improves speech intelligibility. However, phase entrainment to everyday speech sound could also be explained by oscillations passively following the low-level periodicities (e.g., in sound amplitude and spectral content) of auditory stimulation, and not by an adjustment to the speech rhythm per se. Recently, using novel speech/noise mixture stimuli, we have shown that behavioral performance can entrain to speech sound even when high-level features (including phonetic information) are not accompanied by fluctuations in sound amplitude and spectral content. In the present study, we report that neural phase entrainment might underlie our behavioral findings. We observed phase-locking between electroencephalogram (EEG) and speech sound in response not only to original (unprocessed) speech but also to our constructed "high-level" speech/noise mixture stimuli. Phase entrainment to original speech and speech/noise sound did not differ in the degree of entrainment, but rather in the actual phase difference between EEG signal and sound. Phase entrainment was not abolished when speech/noise stimuli were presented in reverse (which disrupts semantic processing), indicating that acoustic (rather than linguistic) high-level features play a major role in the observed neural entrainment. Our results provide further evidence for phase entrainment as a potential mechanism underlying speech processing and segmentation, and for the involvement of high-level processes in the adjustment to the rhythm of speech. Copyright © 2015 Elsevier Inc. All rights reserved.
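    Analyses of this kind typically quantify phase-locking between band-limited EEG and the speech stimulus. The sketch below computes a phase-locking value via the Hilbert transform; it is an illustrative recipe rather than the authors' pipeline, and the delta/theta analysis band is an assumption.

    ```python
    import numpy as np
    from scipy.signal import butter, sosfiltfilt, hilbert

    def phase_locking_value(eeg, stimulus_envelope, fs, band=(2.0, 8.0)):
        """Mean resultant length of the phase difference between two signals,
        band-pass filtered to the (assumed) entrainment frequency range."""
        sos = butter(4, band, btype='band', fs=fs, output='sos')
        phase_eeg = np.angle(hilbert(sosfiltfilt(sos, eeg)))
        phase_env = np.angle(hilbert(sosfiltfilt(sos, stimulus_envelope)))
        return float(np.abs(np.mean(np.exp(1j * (phase_eeg - phase_env)))))
    ```

    A value near 1 indicates a consistent phase relationship; the phase difference itself (the angle of the mean) is what the abstract reports as differing between original speech and the speech/noise mixtures.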

  10. Developmental Variables and Speech-Language in a Special Education Intervention Model.

    ERIC Educational Resources Information Center

    Cruz, Maria del C.; Ayala, Myrna

    Case studies of eight children with speech and language impairments are presented in a review of the intervention efforts at the Demonstration Center for Preschool Special Education (DCPSE) in Puerto Rico. Five components of the intervention model are examined: social medical history, intelligence, motor development, socio-emotional development,…

  11. Native Reactions to Non-Native Speech: A Review of Empirical Research.

    ERIC Educational Resources Information Center

    Eisenstein, Miriam

    1983-01-01

    Recent research on native speakers' reactions to nonnative speech that views listeners, speakers, and language from a variety of perspectives using both objective and subjective research paradigms is reviewed. Studies of error gravity, relative intelligibility of language samples, the role of accent, speakers' characteristics, and context in which…

  12. Everyday listeners' impressions of speech produced by individuals with adductor spasmodic dysphonia.

    PubMed

    Nagle, Kathleen F; Eadie, Tanya L; Yorkston, Kathryn M

    2015-01-01

    Individuals with adductor spasmodic dysphonia (ADSD) have reported that unfamiliar communication partners appear to judge them as sneaky, nervous or not intelligent, apparently based on the quality of their speech; however, there is minimal research into the actual everyday perspective of listening to ADSD speech. The purpose of this study was to investigate the impressions of listeners hearing ADSD speech for the first time using a mixed-methods design. Everyday listeners were interviewed following sessions in which they made ratings of ADSD speech. A semi-structured interview approach was used and data were analyzed using thematic content analysis. Three major themes emerged: (1) everyday listeners make judgments about speakers with ADSD; (2) ADSD speech does not sound normal to everyday listeners; and (3) rating overall severity is difficult for everyday listeners. Participants described ADSD speech similarly to existing literature; however, some listeners inaccurately extrapolated speaker attributes based solely on speech samples. Listeners may draw erroneous conclusions about individuals with ADSD and these biases may affect the communicative success of these individuals. Results have implications for counseling individuals with ADSD, as well as the need for education and awareness about ADSD. Copyright © 2015 Elsevier Inc. All rights reserved.

  13. Rapid tuning shifts in human auditory cortex enhance speech intelligibility

    PubMed Central

    Holdgraf, Christopher R.; de Heer, Wendy; Pasley, Brian; Rieger, Jochem; Crone, Nathan; Lin, Jack J.; Knight, Robert T.; Theunissen, Frédéric E.

    2016-01-01

Experience shapes our perception of the world on a moment-to-moment basis. This robust perceptual effect of experience parallels a change in the neural representation of stimulus features, though the nature of this representation and its plasticity are not well understood. Spectrotemporal receptive field (STRF) mapping describes the neural response to acoustic features, and has been used to study contextual effects on auditory receptive fields in animal models. We performed an STRF plasticity analysis on electrophysiological data from recordings obtained directly from the human auditory cortex. Here, we report rapid, automatic plasticity of the spectrotemporal response of recorded neural ensembles, driven by previous experience with acoustic and linguistic information, and with a neurophysiological effect in the sub-second range. This plasticity reflects increased sensitivity to spectrotemporal features, enhancing the extraction of more speech-like features from a degraded stimulus and providing the physiological basis for the observed 'perceptual enhancement' in understanding speech. PMID:27996965
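    STRF mapping fits a linear filter that predicts neural activity from time-lagged spectrogram features. A minimal ridge-regression sketch of the idea (the study's actual estimation and plasticity analysis are more elaborate; the lag count and regularization strength here are assumptions):

    ```python
    import numpy as np

    def estimate_strf(spectrogram, response, n_lags=20, alpha=1.0):
        """Ridge regression of a neural response onto lagged spectrogram
        features. spectrogram: (n_times, n_freqs); response: (n_times,).
        Returns an (n_lags, n_freqs) spectrotemporal filter."""
        n_times, n_freqs = spectrogram.shape
        X = np.zeros((n_times, n_lags * n_freqs))
        for lag in range(n_lags):
            X[lag:, lag * n_freqs:(lag + 1) * n_freqs] = spectrogram[:n_times - lag]
        w = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]),
                            X.T @ response)
        return w.reshape(n_lags, n_freqs)
    ```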

  14. Suprasegmental Characteristics of Spontaneous Speech Produced in Good and Challenging Communicative Conditions by Talkers Aged 9-14 Years.

    PubMed

    Hazan, Valerie; Tuomainen, Outi; Pettinato, Michèle

    2016-12-01

    This study investigated the acoustic characteristics of spontaneous speech by talkers aged 9-14 years and their ability to adapt these characteristics to maintain effective communication when intelligibility was artificially degraded for their interlocutor. Recordings were made for 96 children (50 female participants, 46 male participants) engaged in a problem-solving task with a same-sex friend; recordings for 20 adults were used as reference. The task was carried out in good listening conditions (normal transmission) and in degraded transmission conditions. Articulation rate, median fundamental frequency (f0), f0 range, and relative energy in the 1- to 3-kHz range were analyzed. With increasing age, children significantly reduced their median f0 and f0 range, became faster talkers, and reduced their mid-frequency energy in spontaneous speech. Children produced similar clear speech adaptations (in degraded transmission conditions) as adults, but only children aged 11-14 years increased their f0 range, an unhelpful strategy not transmitted via the vocoder. Changes made by children were consistent with a general increase in vocal effort. Further developments in speech production take place during later childhood. Children use clear speech strategies to benefit an interlocutor facing intelligibility problems but may not be able to attune these strategies to the same degree as adults.
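    Of the measures analyzed above, the relative energy in the 1- to 3-kHz range is straightforward to compute from a recording. A minimal sketch using Welch's power spectral density estimate (the window length is an assumption; the study's exact analysis settings are not reproduced here):

    ```python
    import numpy as np
    from scipy.signal import welch

    def mid_freq_energy_ratio(x, fs, lo=1000.0, hi=3000.0):
        """Fraction of spectral power between lo and hi Hz: a rough index
        of vocal effort, which boosts mid-frequency energy."""
        f, psd = welch(x, fs=fs, nperseg=2048)
        band = (f >= lo) & (f <= hi)
        return float(psd[band].sum() / psd.sum())
    ```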

  15. 78 FR 49693 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-08-15

    ...] Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...: This is a summary of the Commission's Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech...), Internet Protocol Relay (IP Relay), and IP captioned telephone service (IP CTS) as compensable forms of TRS...

  16. Speech and Language Development in 2 Year Old Children with Cerebral Palsy

    PubMed Central

    Hustad, Katherine C.; Allison, Kristen; McFadd, Emily; Riehle, Katherine

    2013-01-01

Objective: We examined early speech and language development in children who had cerebral palsy. Questions addressed whether children could be classified into early profile groups on the basis of speech and language skills and whether there were differences on selected speech and language measures among groups. Methods: Speech and language assessments were completed on 27 children with CP who were between the ages of 24-30 months (mean age 27.1 months; SD 1.8). We examined several measures of expressive and receptive language, along with speech intelligibility. Results: Two-step cluster analysis was used to identify homogeneous groups of children based on their performance on the 7 dependent variables characterizing speech and language performance. Three groups of children identified were those not yet talking (44% of the sample); those whose talking abilities appeared to be emerging (41% of the sample); and those who were established talkers (15% of the sample). Group differences were evident on all variables except receptive language skills. Conclusion: 85% of 2-year-old children with CP in this study had clinical speech and/or language delays relative to age expectations. Findings suggest that children with CP should receive speech and language assessment and treatment to identify and treat those with delays at or before 2 years of age. PMID:23627373

  17. Speech and language development in 2-year-old children with cerebral palsy.

    PubMed

    Hustad, Katherine C; Allison, Kristen; McFadd, Emily; Riehle, Katherine

    2014-06-01

    We examined early speech and language development in children who had cerebral palsy. Questions addressed whether children could be classified into early profile groups on the basis of speech and language skills and whether there were differences on selected speech and language measures among groups. Speech and language assessments were completed on 27 children with CP who were between the ages of 24 and 30 months (mean age 27.1 months; SD 1.8). We examined several measures of expressive and receptive language, along with speech intelligibility. Two-step cluster analysis was used to identify homogeneous groups of children based on their performance on the seven dependent variables characterizing speech and language performance. Three groups of children identified were those not yet talking (44% of the sample); those whose talking abilities appeared to be emerging (41% of the sample); and those who were established talkers (15% of the sample). Group differences were evident on all variables except receptive language skills. 85% of 2-year-old children with CP in this study had clinical speech and/or language delays relative to age expectations. Findings suggest that children with CP should receive speech and language assessment and treatment at or before 2 years of age.
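    The grouping step can be approximated as follows. The original analysis used a two-step cluster procedure; the sketch below substitutes k-means on standardized scores purely to illustrate the idea, with a hypothetical input file and the three-cluster solution taken from the reported results.

    ```python
    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

    # Hypothetical array: one row per child, one column per speech/language measure.
    scores = np.load("speech_language_measures.npy")  # shape (27, 7)

    z = StandardScaler().fit_transform(scores)        # put measures on a common scale
    groups = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(z)
    ```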

  18. Intelligence development of pre-lingual deaf children with unilateral cochlear implantation.

    PubMed

    Chen, Mo; Wang, Zhaoyan; Zhang, Zhiwen; Li, Xun; Wu, Weijing; Xie, Dinghua; Xiao, Zi-An

    2016-11-01

The present study aims to test whether deaf children with unilateral cochlear implantation (CI) have higher intelligence quotients (IQ). We also try to identify predictive factors of intelligence development in deaf children with CI. In total, 186 children were enrolled in this study and divided into 3 groups: a CI group (N = 66), a hearing loss (HL) group (N = 54), and a normal hearing (NH) group (N = 66). All children took the Hiskey-Nebraska Test of Learning Aptitude to assess IQ. We then used a deafness gene chip, Categories of Auditory Performance (CAP), and Speech Intelligibility Rating (SIR) to evaluate genotype, auditory performance, and speech performance, respectively. At baseline, the average IQs of the HL, CI, and NH groups were 98.3 ± 9.23, 100.03 ± 12.13, and 109.89 ± 10.56, with the NH group scoring significantly higher than the HL and CI groups (p < 0.05). After 12 months, the average IQs of the HL, CI, and NH groups were 99.54 ± 9.38, 111.85 ± 15.38, and 112.08 ± 8.51, respectively. No significant difference between the IQs of the CI and NH groups was found (p > 0.05). The growth of SIR was positively correlated with the growth of IQ (r = 0.247, p = 0.046), while no significant correlations were found between IQ growth and other possible factors, i.e., gender, age at CI, use of hearing aid, genotype, implant device type, inner ear malformation, and CAP growth (p > 0.05). Our study suggests that CI potentially improves intelligence development in deaf children. Speech performance growth is significantly correlated with the IQ growth of CI children. Deaf children who receive CI before 6 years of age can achieve satisfying and undifferentiated short-term (12-month) development of intelligence. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  19. Exploring expressivity and emotion with artificial voice and speech technologies.

    PubMed

    Pauletto, Sandra; Balentine, Bruce; Pidcock, Chris; Jones, Kevin; Bottaci, Leonardo; Aretoulaki, Maria; Wells, Jez; Mundy, Darren P; Balentine, James

    2013-10-01

    Emotion in audio-voice signals, as synthesized by text-to-speech (TTS) technologies, was investigated to formulate a theory of expression for user interface design. Emotional parameters were specified with markup tags, and the resulting audio was further modulated with post-processing techniques. Software was then developed to link a selected TTS synthesizer with an automatic speech recognition (ASR) engine, producing a chatbot that could speak and listen. Using these two artificial voice subsystems, investigators explored both artistic and psychological implications of artificial speech emotion. Goals of the investigation were interdisciplinary, with interest in musical composition, augmentative and alternative communication (AAC), commercial voice announcement applications, human-computer interaction (HCI), and artificial intelligence (AI). The work-in-progress points towards an emerging interdisciplinary ontology for artificial voices. As one study output, HCI tools are proposed for future collaboration.

  20. Emotional recognition from the speech signal for a virtual education agent

    NASA Astrophysics Data System (ADS)

    Tickle, A.; Raghu, S.; Elshaw, M.

    2013-06-01

This paper explores the extraction of features from the speech wave to perform intelligent emotion recognition. A feature extraction tool (openSMILE) was used to obtain a baseline set of 998 acoustic features from a set of emotional speech recordings made with a microphone. The initial features were reduced to the most important ones so that recognition of emotions using a supervised neural network could be performed. Given that the future use of virtual education agents lies with making the agents more interactive, developing agents with the capability to recognise and adapt to the emotional state of humans is an important step.
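    A minimal sketch of the pipeline described above: a large openSMILE feature vector is reduced to its most discriminative components and fed to a small supervised neural network. Feature extraction itself happens in the openSMILE tool; the file names, label set, and selection size below are assumptions.

    ```python
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.neural_network import MLPClassifier

    X = np.load("opensmile_features.npy")  # hypothetical: (n_clips, 998) features
    y = np.load("emotion_labels.npy")      # hypothetical: integer emotion codes

    model = make_pipeline(
        StandardScaler(),
        SelectKBest(f_classif, k=50),      # keep the most discriminative features
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
    )
    model.fit(X, y)
    ```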