Science.gov

Sample records for Danish speech intelligibility

  1. Speech Intelligibility

    NASA Astrophysics Data System (ADS)

    Brand, Thomas

    Speech intelligibility (SI) is important for different fields of research, engineering, and diagnostics in order to quantify very different phenomena, such as the quality of recording, communication, and playback devices; the reverberation of auditoria; characteristics of hearing impairment; the benefit of using hearing aids; or combinations of these.

  2. Improving Alaryngeal Speech Intelligibility.

    ERIC Educational Resources Information Center

    Christensen, John M.; Dwyer, Patricia E.

    1990-01-01

    Laryngectomized patients using esophageal speech or an electronic artificial larynx have difficulty producing correct voicing contrasts between homorganic consonants. This paper describes a therapy technique that emphasizes "pushing harder" on voiceless consonants to improve alaryngeal speech intelligibility and proposes focusing on the production…

  3. Expectations and speech intelligibility.

    PubMed

    Babel, Molly; Russell, Jamie

    2015-05-01

    Socio-indexical cues and paralinguistic information are often beneficial to speech processing as this information assists listeners in parsing the speech stream. Associations that particular populations speak in a certain speech style can, however, make it such that socio-indexical cues have a cost. In this study, native speakers of Canadian English who identify as Chinese Canadian and White Canadian read sentences that were presented to listeners in noise. Half of the sentences were presented with a visual-prime in the form of a photo of the speaker and half were presented in control trials with fixation crosses. Sentences produced by Chinese Canadians showed an intelligibility cost in the face-prime condition, whereas sentences produced by White Canadians did not. In an accentedness rating task, listeners rated White Canadians as less accented in the face-prime trials, but Chinese Canadians showed no such change in perceived accentedness. These results suggest a misalignment between an expected and an observed speech signal for the face-prime trials, which indicates that social information about a speaker can trigger linguistic associations that come with processing benefits and costs. PMID:25994710

  4. Predicting the intelligibility of vocoded speech

    PubMed Central

    Chen, Fei; Loizou, Philipos C.

    2010-01-01

    Objectives The purpose of this study is to evaluate the performance of a number of speech intelligibility indices in terms of predicting the intelligibility of vocoded speech. Design Noise-corrupted sentences were vocoded in a total of 80 conditions, involving three different SNR levels (-5, 0 and 5 dB) and two types of maskers (steady-state noise and two-talker). Tone-vocoder simulations were used as well as simulations of combined electric-acoustic stimulation (EAS). The vocoded sentences were presented to normal-hearing listeners for identification, and the resulting intelligibility scores were used to assess the correlation of various speech intelligibility measures. These included measures designed to assess speech intelligibility, including the speech-transmission index (STI) and articulation index (AI) based measures, as well as measures designed to assess distortions in hearing aids (e.g., coherence-based measures). These measures employed primarily either the temporal-envelope or the spectral-envelope information in the prediction model. The underlying hypothesis in the present study is that measures that assess temporal envelope distortions, such as those based on the speech-transmission index, should correlate highly with the intelligibility of vocoded speech. This is based on the fact that vocoder simulations preserve primarily envelope information, similar to the processing implemented in current cochlear implant speech processors. Similarly, it is hypothesized that measures such as the coherence-based index that assess the distortions present in the spectral envelope could also be used to model the intelligibility of vocoded speech. Results Of all the intelligibility measures considered, the coherence-based and the STI-based measures performed the best. High correlations (r=0.9-0.96) were maintained with the coherence-based measures in all noisy conditions. The highest correlation obtained with the STI-based measure was 0.92, and that was obtained when high modulation rates (100…
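
    The envelope-correlation idea behind the STI- and coherence-style predictors evaluated above can be illustrated with a short sketch. This is a toy NCM/STI-flavoured index under assumed parameters (band edges, band count, envelope cutoff), not the exact measures from the study.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def band_envelope(x, fs, lo, hi, env_cut=32.0):
    """Band-pass the signal, then low-pass its Hilbert envelope."""
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    env = np.abs(hilbert(sosfiltfilt(sos, x)))
    sos_lp = butter(2, env_cut, btype="lowpass", fs=fs, output="sos")
    return sosfiltfilt(sos_lp, env)

def envelope_correlation_index(clean, processed, fs, n_bands=8):
    """Mean squared correlation between clean and processed band envelopes:
    a toy NCM/STI-flavoured predictor, not an exact published measure."""
    edges = np.geomspace(150.0, 0.45 * fs, n_bands + 1)
    r2 = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        e_c = band_envelope(clean, fs, lo, hi)
        e_p = band_envelope(processed, fs, lo, hi)
        r2.append(np.corrcoef(e_c, e_p)[0, 1] ** 2)
    return float(np.mean(r2))
```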

  5. Speech Intelligibility in Severe Adductor Spasmodic Dysphonia

    ERIC Educational Resources Information Center

    Bender, Brenda K.; Cannito, Michael P.; Murry, Thomas; Woodson, Gayle E.

    2004-01-01

    This study compared speech intelligibility in nondisabled speakers and speakers with adductor spasmodic dysphonia (ADSD) before and after botulinum toxin (Botox) injection. Standard speech samples were obtained from 10 speakers diagnosed with severe ADSD prior to and 1 month following Botox injection, as well as from 10 age- and gender-matched…

  6. Relationship between Speech Intelligibility and Speech Comprehension in Babble Noise

    ERIC Educational Resources Information Center

    Fontan, Lionel; Tardieu, Julien; Gaillard, Pascal; Woisard, Virginie; Ruiz, Robert

    2015-01-01

    Purpose: The authors investigated the relationship between the intelligibility and comprehension of speech presented in babble noise. Method: Forty participants listened to French imperative sentences (commands for moving objects) in a multitalker babble background for which intensity was experimentally controlled. Participants were instructed to…

  7. Intelligibility of the Speech of Deaf Children.

    ERIC Educational Resources Information Center

    Amcoff, Sven

    To develop a simple, inexpensive technique to quantify speech comprehension of pupils (aged 7 to 13) in special schools for the deaf, the verbal responses to pictures by 111 pupils were judged for intelligibility by untrained listeners. Pupils were asked to identify 30 pictures; their taped replies were judged by listeners who wrote down what they…

  8. Improving the speech intelligibility in classrooms

    NASA Astrophysics Data System (ADS)

    Lam, Choi Ling Coriolanus

    One of the major acoustical concerns in classrooms is the establishment of effective verbal communication between teachers and students. Non-optimal acoustical conditions, resulting in reduced verbal communication, can cause two main problems. First, they can lead to reduced learning efficiency. Second, they can cause fatigue, stress, vocal strain and health problems, such as headaches and sore throats, among teachers who are forced to compensate for poor acoustical conditions by raising their voices. In addition, inadequate acoustical conditions can encourage the use of public address systems, and improper use of such amplifiers or loudspeakers can damage students' hearing. The social costs of poor classroom acoustics, in terms of impaired learning, are large. This invisible problem has far-reaching implications for learning, but is easily solved. Much research has been carried out, and its findings on classroom acoustics have been summarized accurately and concisely; however, a number of challenging questions remain unanswered. Most objective indices for speech intelligibility are essentially based on studies of Western languages; although several studies of tonal languages such as Mandarin have been conducted, there is much less work on Cantonese. In this research, measurements were made in unoccupied rooms to investigate the acoustical parameters and characteristics of the classrooms. Speech intelligibility tests based on English, Mandarin and Cantonese, together with a survey, were carried out on students aged from 5 to 22 years. The aim is to investigate the differences in intelligibility between English, Mandarin and Cantonese in Hong Kong classrooms. The relationship between the speech transmission index (STI) and Phonetically Balanced (PB) word scores is further developed, together with an empirical relationship between speech intelligibility in classrooms and the variations…

  9. The Modulation Transfer Function for Speech Intelligibility

    PubMed Central

    Elliott, Taffeta M.; Theunissen, Frédéric E.

    2009-01-01

    We systematically determined which spectrotemporal modulations in speech are necessary for comprehension by human listeners. Speech comprehension has been shown to be robust to spectral and temporal degradations, but the specific relevance of particular degradations is arguable due to the complexity of the joint spectral and temporal information in the speech signal. We applied a novel modulation filtering technique to recorded sentences to restrict acoustic information quantitatively and to obtain a joint spectrotemporal modulation transfer function for speech comprehension, the speech MTF. For American English, the speech MTF showed the criticality of low modulation frequencies in both time and frequency. Comprehension was significantly impaired when temporal modulations <12 Hz or spectral modulations <4 cycles/kHz were removed. More specifically, the MTF was bandpass in temporal modulations and low-pass in spectral modulations: temporal modulations from 1 to 7 Hz and spectral modulations <1 cycles/kHz were the most important. We evaluated the importance of spectrotemporal modulations for vocal gender identification and found a different region of interest: removing spectral modulations between 3 and 7 cycles/kHz significantly increases gender misidentifications of female speakers. The determination of the speech MTF furnishes an additional method for producing speech signals with reduced bandwidth but high intelligibility. Such compression could be used for audio applications such as file compression or noise removal and for clinical applications such as signal processing for cochlear implants. PMID:19266016
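
    The core manipulation, filtering in the joint spectrotemporal modulation domain, can be sketched as a low-pass operation on the 2-D Fourier transform of a magnitude spectrogram, resynthesized with the original phase. The STFT settings and cutoffs below are placeholders, and the study's actual notch-filtering pipeline differed in detail.

```python
import numpy as np
from scipy.signal import stft, istft

def modulation_lowpass(x, fs, t_mod_cut=7.0, s_mod_cut=1e-3):
    """Keep temporal modulations below t_mod_cut (Hz) and spectral
    modulations below s_mod_cut (cycles/Hz; 1e-3 cycles/Hz = 1 cycle/kHz)."""
    f, t, Z = stft(x, fs=fs, nperseg=512, noverlap=384)
    mag, phase = np.abs(Z), np.angle(Z)
    M = np.fft.fft2(mag)                                # joint modulation spectrum
    wf = np.fft.fftfreq(mag.shape[0], d=f[1] - f[0])    # spectral mods, cycles/Hz
    wt = np.fft.fftfreq(mag.shape[1], d=t[1] - t[0])    # temporal mods, Hz
    keep = (np.abs(wf)[:, None] <= s_mod_cut) & (np.abs(wt)[None, :] <= t_mod_cut)
    mag_f = np.real(np.fft.ifft2(M * keep)).clip(min=0.0)
    _, y = istft(mag_f * np.exp(1j * phase), fs=fs, nperseg=512, noverlap=384)
    return y
```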

  10. Effects of interior aircraft noise on speech intelligibility and annoyance

    NASA Technical Reports Server (NTRS)

    Pearsons, K. S.; Bennett, R. L.

    1977-01-01

    Recordings of the aircraft ambiance from ten different types of aircraft were used in conjunction with four distinct speech interference tests as stimuli to determine the effects of interior aircraft background levels and speech intelligibility on perceived annoyance in 36 subjects. Both speech intelligibility and background level significantly affected judged annoyance. However, the interaction between the two variables showed that above an 85 dB background level the speech intelligibility results had a minimal effect on annoyance ratings. Below this level, people rated the background as less annoying if there was adequate speech intelligibility.

  11. Variability and Diagnostic Accuracy of Speech Intelligibility Scores in Children

    ERIC Educational Resources Information Center

    Hustad, Katherine C.; Oakes, Ashley; Allison, Kristen

    2015-01-01

    Purpose: We examined variability of speech intelligibility scores and how well intelligibility scores predicted group membership among 5-year-old children with speech motor impairment (SMI) secondary to cerebral palsy and an age-matched group of typically developing (TD) children. Method: Speech samples varying in length from 1-4 words were…

  12. Correlation study of predictive and descriptive metrics of speech intelligibility

    NASA Astrophysics Data System (ADS)

    Stefaniw, Abigail; Shimizu, Yasushi; Smith, Dana

    2002-11-01

    There exists a wide range of speech-intelligibility metrics, each of which is designed to encapsulate a different aspect of room acoustics that relates to speech intelligibility. This study reviews the different definitions of and correlations between various proposed speech intelligibility measures. Speech intelligibility metrics can be grouped by two main uses: prediction of designed rooms and description of existing rooms. Two descriptive metrics still under investigation are Ease of Hearing and Acoustical Comfort. These are measured by a simple questionnaire, and their relationships with each other and with significant speech intelligibility metrics are explored. A variety of rooms are modeled and auralized in cooperation with a larger study, including classrooms, lecture halls, and offices. Auralized rooms are used to conveniently provide calculated metrics and cross-talk canceled auralizations for diagnostic and descriptive intelligibility tests. Rooms are modeled in CATT-Acoustic and auralized with a multi-channel speaker array in a hemi-anechoic chamber.

  13. Predicting Speech Intelligibility with a Multiple Speech Subsystems Approach in Children with Cerebral Palsy

    ERIC Educational Resources Information Center

    Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

    2014-01-01

    Purpose: Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method: Nine acoustic variables reflecting different subsystems, and…

  14. Speech Technology-Based Assessment of Phoneme Intelligibility in Dysarthria

    ERIC Educational Resources Information Center

    Van Nuffelen, Gwen; Middag, Catherine; De Bodt, Marc; Martens, Jean-Pierre

    2009-01-01

    Background: Currently, clinicians mainly rely on perceptual judgements to assess intelligibility of dysarthric speech. Although often highly reliable, this procedure is subjective with a lot of intrinsic variables. Therefore, certain benefits can be expected from a speech technology-based intelligibility assessment. Previous attempts to develop an…

  15. Evaluating airborne sound insulation in terms of speech intelligibility.

    PubMed

    Park, H K; Bradley, J S; Gover, B N

    2008-03-01

    This paper reports on an evaluation of ratings of the sound insulation of simulated walls in terms of the intelligibility of speech transmitted through the walls. Subjects listened to speech modified to simulate transmission through 20 different walls with a wide range of sound insulation ratings, with constant ambient noise. The subjects' mean speech intelligibility scores were compared with various physical measures to test the success of the measures as sound insulation ratings. The standard Sound Transmission Class (STC) and Weighted Sound Reduction Index ratings were only moderately successful predictors of intelligibility scores, and eliminating the 8 dB rule from STC led to very modest improvements. Various previously established speech intelligibility measures (e.g., Articulation Index or Speech Intelligibility Index) and measures derived from them, such as the Articulation Class, were all relatively strongly related to speech intelligibility scores. In general, measures that involved arithmetic averages or summations of decibel values over frequency bands important for speech were most strongly related to intelligibility scores. The two most accurate predictors of the intelligibility of transmitted speech were an arithmetic average transmission loss over the frequencies from 200 Hz to 2.5 kHz and the addition of a new spectrum weighting term to Rw that included frequencies from 400 Hz to 2.5 kHz. PMID:18345835
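
    The best single predictor reported, an arithmetic average of transmission loss over speech-relevant bands, is simple to compute. A minimal sketch, assuming third-octave transmission-loss values are already measured (the example wall values are invented):

```python
import numpy as np

# Third-octave band centres (Hz) spanning 200 Hz to 2.5 kHz.
BANDS = [200, 250, 315, 400, 500, 630, 800, 1000, 1250, 1600, 2000, 2500]

def avg_transmission_loss(tl_db, lo=200, hi=2500):
    """Arithmetic mean of transmission loss (dB) over bands in [lo, hi];
    tl_db maps band centre frequency (Hz) -> measured TL (dB)."""
    return float(np.mean([tl for f, tl in tl_db.items() if lo <= f <= hi]))

# Hypothetical wall whose TL rises with frequency.
wall = {f: 20.0 + 10.0 * np.log10(f / 200.0) for f in BANDS}
print(round(avg_transmission_loss(wall), 1))
```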

  16. Objectivization of evaluation of speech intelligibility during flights

    NASA Astrophysics Data System (ADS)

    Frolov, M. V.; Cherkasov, O. A.; Tarasenko, G. I.; Petlenko, I. A.

    1983-11-01

    Results of monitoring speech by an auditor and checking parameters of the speaker's articulation are analyzed to find a reliable and effective means of evaluating speech intelligibility. The effect of a mask on syllable intelligibility was also tested.

  17. Intelligibility of Dysarthric Speech: Perceptions of Speakers and Listeners

    ERIC Educational Resources Information Center

    Walshe, Margaret; Miller, Nick; Leahy, Margaret; Murray, Aisling

    2008-01-01

    Background: Many factors influence listener perception of dysarthric speech. Final consensus on the role of gender and listener experience is still to be reached. The speaker's perception of his/her speech has largely been ignored. Aims: (1) To compare speaker and listener perception of the intelligibility of dysarthric speech; (2) to explore the…

  18. Monaural speech intelligibility and detection in maskers with varying amounts of spectro-temporal speech features.

    PubMed

    Schubotz, Wiebke; Brand, Thomas; Kollmeier, Birger; Ewert, Stephan D

    2016-07-01

    Speech intelligibility is strongly affected by the presence of maskers. Depending on the spectro-temporal structure of the masker and its similarity to the target speech, different masking aspects can occur, typically referred to as energetic, amplitude modulation, and informational masking. In this study, speech intelligibility and speech detection were measured in maskers that vary systematically in the time-frequency domain from steady-state noise to a single interfering talker. Male and female target speech was used in combination with maskers based on speech of the same or different gender. Observed data were compared to predictions of the speech intelligibility index, the extended speech intelligibility index, the multi-resolution speech-based envelope-power-spectrum model, and the short-time objective intelligibility measure. The different models served as analysis tools to help distinguish between the different masking aspects. The comparison shows that overall masking can to a large extent be explained by short-term energetic masking. However, the other masking aspects (amplitude modulation and informational masking) influence speech intelligibility as well. In addition, all models showed considerable deviations from the data; the current study therefore provides a benchmark for further evaluation of speech intelligibility prediction models. PMID:27475175

  19. Experimental comparison between speech transmission index, rapid speech transmission index, and speech intelligibility index.

    PubMed

    Larm, Petra; Hongisto, Valtteri

    2006-02-01

    During the acoustical design of, e.g., auditoria or open-plan offices, it is important to know how speech can be perceived in various parts of the room. Different objective methods have been developed to measure and predict speech intelligibility, and these have been extensively used in various spaces. In this study, two such methods were compared, the speech transmission index (STI) and the speech intelligibility index (SII). A simplification of the STI, the room acoustics speech transmission index (RASTI), was also considered. These quantities are all based on determining an apparent speech-to-noise ratio in selected frequency bands and summing them using a specific weighting. For comparison, data were needed on the possible differences between these methods arising from the calculation scheme and from the measuring equipment. Their prediction accuracy was also of interest. Measurements were made in a laboratory having adjustable noise level and absorption, and in a real auditorium. It was found that the measurement equipment, especially the selection of the loudspeaker, can greatly affect the accuracy of the results. The prediction accuracy of the RASTI was found acceptable, if the input values for the prediction are accurately known, even though the studied space was not ideally diffuse. PMID:16521772
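
    All three quantities share one skeleton: a per-band apparent speech-to-noise ratio is clipped to +/-15 dB, rescaled to [0, 1], and combined with band-importance weights. A sketch of that skeleton follows; the octave bands and weights are illustrative placeholders, not the standardized STI or SII tables.

```python
import numpy as np

OCTAVES = [125, 250, 500, 1000, 2000, 4000, 8000]      # Hz, for reference
WEIGHTS = [0.05, 0.10, 0.15, 0.25, 0.25, 0.15, 0.05]   # illustrative, sums to 1

def band_snr_index(speech_db, noise_db, weights=WEIGHTS):
    """Weighted sum of per-band apparent SNRs clipped to [-15, +15] dB and
    rescaled to [0, 1]: the common core of STI/SII-type indices."""
    snr = np.clip(np.asarray(speech_db, float) - np.asarray(noise_db, float),
                  -15.0, 15.0)
    return float(np.sum(np.asarray(weights) * (snr + 15.0) / 30.0))

print(band_snr_index([60] * 7, [45, 50, 52, 55, 58, 60, 62]))
```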

  20. Implementing Speech Supplementation Strategies: Effects on Intelligibility and Speech Rate of Individuals with Chronic Severe Dysarthria.

    ERIC Educational Resources Information Center

    Hustad, Katherine C.; Jones, Tabitha; Dailey, Suzanne

    2003-01-01

    A study compared intelligibility and speech rate differences following speaker implementation of 3 strategies (topic, alphabet, and combined topic and alphabet supplementation) and a habitual speech control condition for 5 speakers with severe dysarthria. Combined cues and alphabet cues yielded significantly higher intelligibility scores and…

  21. Optimizing acoustical conditions for speech intelligibility in classrooms

    NASA Astrophysics Data System (ADS)

    Yang, Wonyoung

    High speech intelligibility is imperative in classrooms where verbal communication is critical. However, the optimal acoustical conditions to achieve a high degree of speech intelligibility have previously been investigated with inconsistent results, and practical room-acoustical solutions to optimize the acoustical conditions for speech intelligibility have not been developed. This experimental study validated auralization for speech-intelligibility testing, investigated the optimal reverberation for speech intelligibility for both normal and hearing-impaired listeners using more realistic room-acoustical models, and proposed an optimal sound-control design for speech intelligibility based on the findings. The auralization technique was used to perform subjective speech-intelligibility tests. The validation study, comparing auralization results with those of real classroom speech-intelligibility tests, found that if the room to be auralized is not very absorptive or noisy, speech-intelligibility tests using auralization are valid. The speech-intelligibility tests were done in two different auralized sound fields (approximately diffuse and non-diffuse) using the Modified Rhyme Test and both normal and hearing-impaired listeners. A hybrid room-acoustical prediction program was used throughout the work, and it and a 1/8 scale-model classroom were used to evaluate the effects of ceiling barriers and reflectors. For both subject groups, in approximately diffuse sound fields, when the speech source was closer to the listener than the noise source, the optimal reverberation time was zero. When the noise source was closer to the listener than the speech source, the optimal reverberation time was 0.4 s (with another peak at 0.0 s) with relative output power levels of the speech and noise sources SNS = 5 dB, and 0.8 s with SNS = 0 dB. In non-diffuse sound fields, when the noise source was between the speaker and the listener, the optimal reverberation time was 0.6 s with…

  22. Predicting Speech Intelligibility with a Multiple Speech Subsystems Approach in Children with Cerebral Palsy

    PubMed Central

    Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

    2014-01-01

    Purpose Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystem approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method Nine acoustic variables reflecting different subsystems, and speech intelligibility, were measured in 22 children with CP. These children included 13 with a clinical diagnosis of dysarthria (SMI), and nine judged to be free of dysarthria (NSMI). Data from children with CP were compared to data from age-matched typically developing children (TD). Results Multiple acoustic variables reflecting the articulatory subsystem were different in the SMI group, compared to the NSMI and TD groups. A significant speech intelligibility prediction model was obtained with all variables entered into the model (Adjusted R-squared = .801). The articulatory subsystem showed the most substantial independent contribution (58%) to speech intelligibility. Incremental R-squared analyses revealed that any single variable explained less than 9% of speech intelligibility variability. Conclusions Children in the SMI group have articulatory subsystem problems as indexed by acoustic measures. As in the adult literature, the articulatory subsystem makes the primary contribution to speech intelligibility variance in dysarthria, with minimal or no contribution from other systems. PMID:24824584

  23. Speech Intelligibility Advantages using an Acoustic Beamformer Display

    NASA Technical Reports Server (NTRS)

    Begault, Durand R.; Sunder, Kaushik; Godfroy, Martine; Otto, Peter

    2015-01-01

    A speech intelligibility test conforming to the Modified Rhyme Test of ANSI S3.2 "Method for Measuring the Intelligibility of Speech Over Communication Systems" was conducted using a prototype 12-channel acoustic beamformer system. The target speech material (signal) was identified against speech babble (noise), with calculated signal-noise ratios of 0, 5 and 10 dB. The signal was delivered at a fixed beam orientation of 135 deg (re 90 deg as the frontal direction of the array) and the noise at 135 deg (co-located) and 0 deg (separated). A significant improvement in intelligibility from 57% to 73% was found for spatial separation for the same signal-noise ratio (0 dB). Significant effects for improved intelligibility due to spatial separation were also found for higher signal-noise ratios (5 and 10 dB).

  24. An OFDM-Based Speech Encryption System without Residual Intelligibility

    NASA Astrophysics Data System (ADS)

    Tseng, Der-Chang; Chiu, Jung-Hui

    An FFT-based speech encryption system retains considerable residual intelligibility in the encrypted speech, such as talk spurts and the original intonation, which makes it easy for eavesdroppers to deduce the information content. In this letter, we propose a new technique based on the combination of an orthogonal frequency division multiplexing (OFDM) scheme and an appropriate QAM mapping method to remove the residual intelligibility from the encrypted speech by permuting several frequency components. In addition, the proposed OFDM-based speech encryption system needs only two FFT operations instead of the four required by the FFT-based system. Simulation results are presented to show the effectiveness of the proposed technique.
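
    For context, the baseline FFT scrambler that this letter improves on permutes frequency components under a keyed permutation; the residual intelligibility it leaves is what motivates the OFDM/QAM design. A sketch of that baseline (not of the proposed OFDM system; framing and key handling are simplified):

```python
import numpy as np

def scramble(frame, key):
    """Permute the interior FFT bins of a real frame with a keyed
    permutation; DC and Nyquist stay put so the result remains a valid
    half-spectrum of a real signal."""
    X = np.fft.rfft(frame)
    perm = np.random.default_rng(key).permutation(len(X) - 2) + 1
    Y = X.copy()
    Y[1:-1] = X[perm]
    return np.fft.irfft(Y, n=len(frame)), perm

def descramble(frame, perm):
    """Invert the permutation at the receiver."""
    X = np.fft.rfft(frame)
    Y = X.copy()
    Y[perm] = X[1:-1]
    return np.fft.irfft(Y, n=len(frame))

x = np.random.default_rng(0).standard_normal(512)   # stand-in speech frame
enc, perm = scramble(x, key=1234)
print(np.allclose(descramble(enc, perm), x))        # True
```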

  25. Influence of Visual Information on the Intelligibility of Dysarthric Speech

    ERIC Educational Resources Information Center

    Keintz, Connie K.; Bunton, Kate; Hoit, Jeannette D.

    2007-01-01

    Purpose: To examine the influence of visual information on speech intelligibility for a group of speakers with dysarthria associated with Parkinson's disease. Method: Eight speakers with Parkinson's disease and dysarthria were recorded while they read sentences. Speakers performed a concurrent manual task to facilitate typical speech production.…

  26. The Intelligibility of Time-Compressed Speech. Final Report.

    ERIC Educational Resources Information Center

    Carroll, John B.; Cramer, H. Leslie

    Time-compressed speech is now being used to present recorded lectures to groups at word rates up to two and one-half times that at which they were originally spoken. This process is particularly helpful to the blind. This study investigated the intelligibility of speech processed with seven different discard intervals and at seven rates from two…

  27. Intelligibility for Binaural Speech with Discarded Low-SNR Speech Components.

    PubMed

    Schoenmaker, Esther; van de Par, Steven

    2016-01-01

    Speech intelligibility in multitalker settings improves when the target speaker is spatially separated from the interfering speakers. A factor that may contribute to this improvement is the improved detectability of target-speech components due to binaural interaction in analogy to the Binaural Masking Level Difference (BMLD). This would allow listeners to hear target speech components within specific time-frequency intervals that have a negative SNR, similar to the improvement in the detectability of a tone in noise when these contain disparate interaural difference cues. To investigate whether these negative-SNR target-speech components indeed contribute to speech intelligibility, a stimulus manipulation was performed where all target components were removed when local SNRs were smaller than a certain criterion value. It can be expected that for sufficiently high criterion values target speech components will be removed that do contribute to speech intelligibility. For spatially separated speakers, assuming that a BMLD-like detection advantage contributes to intelligibility, degradation in intelligibility is expected already at criterion values below 0 dB SNR. However, for collocated speakers it is expected that higher criterion values can be applied without impairing speech intelligibility. Results show that degradation of intelligibility for separated speakers is only seen for criterion values of 0 dB and above, indicating a negligible contribution of a BMLD-like detection advantage in multitalker settings. These results show that the spatial benefit is related to a spatial separation of speech components at positive local SNRs rather than to a BMLD-like detection improvement for speech components at negative local SNRs. PMID:27080648

  28. Speech Intelligibility and Prosody Production in Children with Cochlear Implants

    ERIC Educational Resources Information Center

    Chin, Steven B.; Bergeson, Tonya R.; Phan, Jennifer

    2012-01-01

    Objectives: The purpose of the current study was to examine the relation between speech intelligibility and prosody production in children who use cochlear implants. Methods: The Beginner's Intelligibility Test (BIT) and Prosodic Utterance Production (PUP) task were administered to 15 children who use cochlear implants and 10 children with normal…

  29. Talker Versus Dialect Effects on Speech Intelligibility: A Symmetrical Study.

    PubMed

    McCloy, Daniel R; Wright, Richard A; Souza, Pamela E

    2015-09-01

    This study investigates the relative effects of talker-specific variation and dialect-based variation on speech intelligibility. Listeners from two dialects of American English performed speech-in-noise tasks with sentences spoken by talkers of each dialect. An initial statistical model showed no significant effects for either talker or listener dialect group, and no interaction. However, a mixed-effects regression model including several acoustic measures of the talker's speech revealed a subtle effect of talker dialect once the various acoustic dimensions were accounted for. Results are discussed in relation to other recent studies of cross-dialect intelligibility. PMID:26529902

  30. Exploring the Role of Brain Oscillations in Speech Perception in Noise: Intelligibility of Isochronously Retimed Speech

    PubMed Central

    Aubanel, Vincent; Davis, Chris; Kim, Jeesun

    2016-01-01

    A growing body of evidence shows that brain oscillations track speech. This mechanism is thought to maximize processing efficiency by allocating resources to important speech information, effectively parsing speech into units of appropriate granularity for further decoding. However, some aspects of this mechanism remain unclear. First, while periodicity is an intrinsic property of this physiological mechanism, speech is only quasi-periodic, so it is not clear whether periodicity would present an advantage in processing. Second, it is still a matter of debate which aspect of speech triggers or maintains cortical entrainment, from bottom-up cues such as fluctuations of the amplitude envelope of speech to higher level linguistic cues such as syntactic structure. We present data from a behavioral experiment assessing the effect of isochronous retiming of speech on speech perception in noise. Two types of anchor points were defined for retiming speech, namely syllable onsets and amplitude envelope peaks. For each anchor point type, retiming was implemented at two hierarchical levels, a slow time scale around 2.5 Hz and a fast time scale around 4 Hz. Results show that while any temporal distortion resulted in reduced speech intelligibility, isochronous speech anchored to P-centers (approximated by stressed syllable vowel onsets) was significantly more intelligible than a matched anisochronous retiming, suggesting a facilitative role of periodicity defined on linguistically motivated units in processing speech in noise.

  31. Predicting Speech Intelligibility Decline in Amyotrophic Lateral Sclerosis Based on the Deterioration of Individual Speech Subsystems

    PubMed Central

    Yunusova, Yana; Wang, Jun; Zinman, Lorne; Pattee, Gary L.; Berry, James D.; Perry, Bridget; Green, Jordan R.

    2016-01-01

    Purpose To determine the mechanisms of speech intelligibility impairment due to neurologic impairments, intelligibility decline was modeled as a function of co-occurring changes in the articulatory, resonatory, phonatory, and respiratory subsystems. Method Sixty-six individuals diagnosed with amyotrophic lateral sclerosis (ALS) were studied longitudinally. The disease-related changes in articulatory, resonatory, phonatory, and respiratory subsystems were quantified using multiple instrumental measures, which were subjected to a principal component analysis and mixed effects models to derive a set of speech subsystem predictors. A stepwise approach was used to select the best set of subsystem predictors to model the overall decline in intelligibility. Results Intelligibility was modeled as a function of five predictors that corresponded to velocities of lip and jaw movements (articulatory), number of syllable repetitions in the alternating motion rate task (articulatory), nasal airflow (resonatory), maximum fundamental frequency (phonatory), and speech pauses (respiratory). The model accounted for 95.6% of the variance in intelligibility, among which the articulatory predictors showed the most substantial independent contribution (57.7%). Conclusion Articulatory impairments characterized by reduced velocities of lip and jaw movements and resonatory impairments characterized by increased nasal airflow served as the subsystem predictors of the longitudinal decline of speech intelligibility in ALS. Declines in maximum performance tasks such as the alternating motion rate preceded declines in intelligibility, thus serving as early predictors of bulbar dysfunction. Following the rapid decline in speech intelligibility, a precipitous decline in maximum performance tasks subsequently occurred. PMID:27148967
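
    The incremental-R-squared logic used to apportion intelligibility variance across subsystem predictors can be sketched with ordinary least squares. The predictor names and data below are invented, and the paper's principal component analysis and mixed-effects steps are omitted.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def r2(X, y):
    return LinearRegression().fit(X, y).score(X, y)

def incremental_r2(X, y, names):
    """R-squared lost when each predictor is dropped from the full model:
    a crude stand-in for the paper's stepwise selection."""
    full = r2(X, y)
    return full, {n: full - r2(np.delete(X, j, axis=1), y)
                  for j, n in enumerate(names)}

rng = np.random.default_rng(0)
X = rng.standard_normal((66, 5))                 # 66 speakers, 5 predictors
y = 2.0 * X[:, 0] + 1.5 * X[:, 1] + 0.3 * X[:, 2] + 0.1 * rng.standard_normal(66)
names = ["lip/jaw velocity", "AMR syllable rate", "nasal airflow",
         "max F0", "pause time"]
full, inc = incremental_r2(X, y, names)
print(round(full, 3), {k: round(v, 3) for k, v in inc.items()})
```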

  32. Aircraft noise and speech intelligibility in an outdoor living space.

    PubMed

    Alvarsson, Jesper J; Nordström, Henrik; Lundén, Peter; Nilsson, Mats E

    2014-06-01

    Studies of effects on speech intelligibility from aircraft noise in outdoor places are currently lacking. To explore these effects, first-order ambisonic recordings of aircraft noise were reproduced outdoors in a pergola. The average background level was 47 dB LAeq. Lists of phonetically balanced words (LASmax,word = 54 dB) were reproduced simultaneously with aircraft passage noise (LASmax,noise = 72-84 dB). Twenty individually tested listeners wrote down each presented word while seated in the pergola. The main results were (i) aircraft noise negatively affects speech intelligibility at sound pressure levels that exceed those of the speech sound (signal-to-noise ratio, S/N < 0), and (ii) the simple A-weighted S/N ratio was nearly as good an indicator of speech intelligibility as were two more advanced models, the Speech Intelligibility Index and Glasberg and Moore's [J. Audio Eng. Soc. 53, 906-918 (2005)] partial loudness model. This suggests that any of these indicators is applicable for predicting effects of aircraft noise on speech intelligibility outdoors. PMID:24907809
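
    The A-weighted S/N indicator that performed nearly as well as the advanced models can be reproduced from octave-band levels alone. A minimal sketch using the standard IEC 61672 A-weighting curve (the band levels below are invented):

```python
import numpy as np

def a_weight_db(f):
    """IEC 61672 A-weighting (dB) at frequency f (Hz)."""
    f = np.asarray(f, dtype=float)
    ra = (12194.0**2 * f**4) / ((f**2 + 20.6**2)
         * np.sqrt((f**2 + 107.7**2) * (f**2 + 737.9**2))
         * (f**2 + 12194.0**2))
    return 20.0 * np.log10(ra) + 2.0

def a_weighted_level(band_db, band_f):
    """Combine octave-band levels into a single dB(A) figure."""
    w = np.asarray(band_db, float) + a_weight_db(band_f)
    return 10.0 * np.log10(np.sum(10.0 ** (w / 10.0)))

bands = [125, 250, 500, 1000, 2000, 4000]
speech = [55, 58, 60, 57, 52, 45]   # invented octave-band levels, dB SPL
noise = [60, 62, 63, 60, 55, 48]
snr_a = a_weighted_level(speech, bands) - a_weighted_level(noise, bands)
print(f"A-weighted S/N = {snr_a:.1f} dB")
```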

  33. Correlation of subjective and objective measures of speech intelligibility

    NASA Astrophysics Data System (ADS)

    Bowden, Erica E.; Wang, Lily M.; Palahanska, Milena S.

    2003-10-01

    Currently there are a number of objective evaluation methods used to quantify the speech intelligibility in a built environment, including the Speech Transmission Index (STI), Rapid Speech Transmission Index (RASTI), Articulation Index (AI), and the Percentage Articulation Loss of Consonants (%ALcons). Many of these have been used for years; however, questions remain about their accuracy in predicting the acoustics of a space. Current widely used software programs can quickly evaluate STI, RASTI, and %ALcons from a measured impulse response. This project compares subjective human performance on modified rhyme and phonetically balanced word tests with objective results calculated from impulse response measurements in four different spaces. The results of these tests aid in understanding the performance of various methods of speech intelligibility evaluation. [Work supported by the Univ. of Nebraska Center for Building Integration.]

  34. Enhancing Speech Intelligibility: Interactions among Context, Modality, Speech Style, and Masker

    ERIC Educational Resources Information Center

    Van Engen, Kristin J.; Phelps, Jasmine E. B.; Smiljanic, Rajka; Chandrasekaran, Bharath

    2014-01-01

    Purpose: The authors sought to investigate interactions among intelligibility-enhancing speech cues (i.e., semantic context, clearly produced speech, and visual information) across a range of masking conditions. Method: Sentence recognition in noise was assessed for 29 normal-hearing listeners. Testing included semantically normal and anomalous…

  35. The Mutual Intelligibility of L2 Speech

    ERIC Educational Resources Information Center

    Munro, Murray J.; Derwing, Tracey M.; Morton, Susan L.

    2006-01-01

    When understanding or evaluating foreign-accented speech, listeners are affected not only by properties of the speech itself but by their own linguistic backgrounds and their experience with different speech varieties. Given the latter influence, it is not known to what degree a diverse group of listeners might share a response to second language…

  36. Influence of auditory fatigue on masked speech intelligibility

    NASA Technical Reports Server (NTRS)

    Parker, D. E.; Martens, W. L.; Johnston, P. A.

    1980-01-01

    Intelligibility of PB word lists embedded in simultaneous masking noise was evaluated before and after fatiguing-noise exposure; intelligibility was determined by observing the number of words correctly repeated during a shadowing task. Both the speech signal and the masking noise were filtered to a 2825-3185-Hz band. Masking-noise levels were varied from 0- to 90-dB SL. Fatigue was produced by a 1500-3000-Hz octave band of noise at 115 dB (re 20 µPa) presented continuously for 5 min. The results of three experiments indicated that speech intelligibility was reduced when the speech was presented against a background of silence, but that the fatiguing-noise exposure had no effect on intelligibility when the speech was made more intense and embedded in masking noise of 40-90-dB SL. These observations are interpreted by considering the recruitment produced by fatigue and masking noise.

  37. Achieving speech intelligibility at Paddington Station, London, UK

    NASA Astrophysics Data System (ADS)

    Goddard, Helen M.

    2002-11-01

    Paddington Station in London, UK is a large rail terminus for long-distance electric and diesel powered trains. This magnificent train shed has four arched spans and is one of the remaining structural testaments to the engineer Brunel. Given the current British and European legislative requirements for intelligible speech in public buildings, AMS Acoustics were engaged to design an electroacoustic solution. In this paper we outline how the significant problems of lively natural acoustics, high operational noise levels and strict aesthetic constraints were addressed. The resultant design is radical, using the most recent DSP-controlled line-array loudspeakers. The paper details the acoustic modeling undertaken to predict both even direct sound pressure level coverage and STI. Further, it presents the speech intelligibility measured upon handover of the new system. The design has proved successful and, given the nature of the space, outstanding speech intelligibility is achieved.

  38. Variability and Intelligibility of Clarified Speech to Different Listener Groups

    NASA Astrophysics Data System (ADS)

    Silber, Ronnie F.

    Two studies examined the modifications that adult speakers make in speech to disadvantaged listeners. Previous research focusing on speech to deaf individuals and to young children has shown that adults clarify speech when addressing these two populations. Acoustic measurements suggest that the signal undergoes similar changes for both populations. Perceptual tests corroborate these results for the deaf population, but are nonsystematic in developmental studies. The differences in the findings for these populations and the nonsystematic results in the developmental literature may be due to methodological factors, and the present experiments addressed these methodological questions. Studies of speech to hearing-impaired listeners have used read nonsense sentences, for which speakers received explicit clarification instructions and feedback, while in the child literature excerpts of real-time conversations were used; therefore, linguistic samples were not precisely matched. In this study, experiments used various linguistic materials: experiment 1 used a children's story; experiment 2, nonsense sentences. Four mothers read both types of material in four ways: (1) in "normal" adult speech, (2) in "babytalk," (3) under the clarification instructions used in the hearing-impaired studies (instructed clear speech) and (4) in (spontaneous) clear speech without instruction. No extra practice or feedback was given. Sentences were presented to 40 normal-hearing college students with and without simultaneous masking noise. Results were separately tabulated for content and function words, and analyzed using standard statistical tests. The major finding in the study was individual variation in speaker intelligibility: "real world" speakers vary in their baseline intelligibility, and the four speakers also showed unique patterns of intelligibility as a function of each independent variable. Results were as follows. Nonsense sentences were less intelligible than story…

  39. The Effects of Auditory Contrast Tuning upon Speech Intelligibility

    PubMed Central

    Killian, Nathan J.; Watkins, Paul V.; Davidson, Lisa S.; Barbour, Dennis L.

    2016-01-01

    We have previously identified neurons tuned to spectral contrast of wideband sounds in auditory cortex of awake marmoset monkeys. Because additive noise alters the spectral contrast of speech, contrast-tuned neurons, if present in human auditory cortex, may aid in extracting speech from noise. Given that this cortical function may be underdeveloped in individuals with sensorineural hearing loss, incorporating biologically-inspired algorithms into external signal processing devices could provide speech enhancement benefits to cochlear implantees. In this study we first constructed a computational signal processing algorithm to mimic auditory cortex contrast tuning. We then manipulated the shape of contrast channels and evaluated the intelligibility of reconstructed noisy speech using a metric to predict cochlear implant user perception. Candidate speech enhancement strategies were then tested in cochlear implantees with a hearing-in-noise test. Accentuation of intermediate contrast values or all contrast values improved computed intelligibility. Cochlear implant subjects showed significant improvement in noisy speech intelligibility with a contrast shaping procedure. PMID:27555826

  40. Comparison of a short-time speech-based intelligibility metric to the speech transmission index and intelligibility data.

    PubMed

    Payton, Karen L; Shrestha, Mona

    2013-11-01

    Several algorithms have been shown to generate a metric corresponding to the Speech Transmission Index (STI) using speech as a probe stimulus [e.g., Goldsworthy and Greenberg, J. Acoust. Soc. Am. 116, 3679-3689 (2004)]. The time-domain approaches work well on long speech segments and have the added potential to be used for short-time analysis. This study investigates the performance of the Envelope Regression (ER) time-domain STI method as a function of window length, in acoustically degraded environments with multiple talkers and speaking styles. The ER method is compared with a short-time Theoretical STI, derived from octave-band signal-to-noise ratios and reverberation times. For windows as short as 0.3 s, the ER method tracks short-time Theoretical STI changes in stationary speech-shaped noise, fluctuating restaurant babble and stationary noise plus reverberation. The metric is also compared to intelligibility scores on conversational speech and speech articulated clearly but at normal speaking rates (Clear/Norm) in stationary noise. Correlation between the metric and intelligibility scores is high and, consistent with the subject scores, the metrics are higher for Clear/Norm speech than for conversational speech and higher for the first word in a sentence than for the last word. PMID:24180791
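
    The short-time, speech-based approach can be sketched by sliding a window over clean and degraded band envelopes, mapping an envelope-similarity statistic to an apparent SNR, clipping to +/-15 dB, and rescaling. The mapping below uses a normalized-covariance form rather than the paper's exact envelope-regression equations, and the single band and window settings are placeholders.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def envelope(x, fs, lo, hi):
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return np.abs(hilbert(sosfiltfilt(sos, x)))

def short_time_sti(clean, degraded, fs, win_s=0.3):
    """Windowed speech-based STI-style track for one octave band (~500 Hz):
    envelope correlation -> apparent SNR -> transmission index."""
    ec, ed = envelope(clean, fs, 354, 707), envelope(degraded, fs, 354, 707)
    n = int(win_s * fs)
    tis = []
    for i in range(0, min(len(ec), len(ed)) - n + 1, n):
        rho = np.corrcoef(ec[i:i + n], ed[i:i + n])[0, 1]
        rho = np.clip(rho, 1e-6, 1.0 - 1e-6)
        snr = 10.0 * np.log10(rho**2 / (1.0 - rho**2))   # apparent SNR, dB
        tis.append((np.clip(snr, -15.0, 15.0) + 15.0) / 30.0)
    return float(np.mean(tis)), tis   # overall value and its time track
```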

  41. Chinese speech intelligibility and its relationship with the speech transmission index for children in elementary school classrooms.

    PubMed

    Peng, Jianxin; Yan, Nanjie; Wang, Dan

    2015-01-01

    The present study investigated Chinese speech intelligibility in 28 classrooms from nine different elementary schools in Guangzhou, China. The subjective Chinese speech intelligibility in the classrooms was evaluated with children in grades 2, 4, and 6 (7 to 12 years old). Acoustical measurements were also performed in these classrooms. Subjective Chinese speech intelligibility scores and objective speech intelligibility parameters, such as speech transmission index (STI), were obtained at each listening position for all tests. The relationship between subjective Chinese speech intelligibility scores and STI was revealed and analyzed. The effects of age on Chinese speech intelligibility scores were compared. Results indicate high correlations between subjective Chinese speech intelligibility scores and STI for grades 2, 4, and 6 children. Chinese speech intelligibility scores increase with increase of age under the same STI condition. The differences in scores among different age groups decrease as STI increases. To achieve 95% Chinese speech intelligibility scores, the STIs required for grades 2, 4, and 6 children are 0.75, 0.69, and 0.63, respectively. PMID:25618041

  42. Intelligibility of reverberant noisy speech with ideal binary masking.

    PubMed

    Roman, Nicoleta; Woodruff, John

    2011-10-01

    For a mixture of target speech and noise in anechoic conditions, the ideal binary mask is defined as follows: It selects the time-frequency units where target energy exceeds noise energy by a certain local threshold and cancels the other units. In this study, the definition of the ideal binary mask is extended to reverberant conditions. Given the division between early and late reflections in terms of speech intelligibility, three ideal binary masks can be defined: an ideal binary mask that uses the direct path of the target as the desired signal, an ideal binary mask that uses the direct path and early reflections of the target as the desired signal, and an ideal binary mask that uses the reverberant target as the desired signal. The effects of these ideal binary mask definitions on speech intelligibility are compared across two types of interference: speech shaped noise and concurrent female speech. As suggested by psychoacoustical studies, the ideal binary mask based on the direct path and early reflections of target speech outperforms the other masks as reverberation time increases and produces substantial reductions in terms of speech reception threshold for normal hearing listeners. PMID:21973369
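
    The three mask definitions differ only in which rendering of the target is taken as the desired signal when local SNR is computed. A minimal STFT sketch, with the local threshold and window settings as placeholders:

```python
import numpy as np
from scipy.signal import stft, istft

def ideal_binary_mask(desired, interference, mixture, fs, lc_db=0.0):
    """Keep time-frequency units where desired energy exceeds interference
    energy by lc_db. Pass the direct path, direct path + early reflections,
    or the full reverberant target as `desired` to realize the three
    definitions discussed above."""
    kw = dict(fs=fs, nperseg=512, noverlap=256)
    _, _, D = stft(desired, **kw)
    _, _, N = stft(interference, **kw)
    _, _, M = stft(mixture, **kw)
    local_snr = 10.0 * np.log10((np.abs(D)**2 + 1e-12) / (np.abs(N)**2 + 1e-12))
    _, y = istft(M * (local_snr > lc_db), **kw)
    return y
```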

  43. Intelligibility of whispered speech in stationary and modulated noise maskers

    PubMed Central

    Freyman, Richard L.; Griffin, Amanda M.; Oxenham, Andrew J.

    2012-01-01

    This study investigated the role of natural periodic temporal fine structure in helping listeners take advantage of temporal valleys in amplitude-modulated masking noise when listening to speech. Young normal-hearing participants listened to natural, whispered, and/or vocoded nonsense sentences in a variety of masking conditions. Whispering alters normal waveform temporal fine structure dramatically but, unlike vocoding, does not degrade spectral details created by vocal tract resonances. The improvement in intelligibility, or masking release, due to introducing 16-Hz square-wave amplitude modulations in an otherwise steady speech-spectrum noise was reduced substantially with vocoded sentences relative to natural speech, but was not reduced for whispered sentences. In contrast to natural speech, masking release for whispered sentences was observed even at positive signal-to-noise ratios. Whispered speech has a different short-term amplitude distribution relative to natural speech, and this appeared to explain the robust masking release for whispered speech at high signal-to-noise ratios. Recognition of whispered speech was not disproportionately affected by unpredictable modulations created by a speech-envelope modulated noise masker. Overall, the presence or absence of periodic temporal fine structure did not have a major influence on the degree of benefit obtained from imposing temporal fluctuations on a noise masker. PMID:23039445

  44. Speech Intelligibility and Prosody Production in Children with Cochlear Implants

    PubMed Central

    Chin, Steven B.; Bergeson, Tonya R.; Phan, Jennifer

    2012-01-01

    Objectives The purpose of the current study was to examine the relation between speech intelligibility and prosody production in children who use cochlear implants. Methods The Beginner's Intelligibility Test (BIT) and Prosodic Utterance Production (PUP) task were administered to 15 children who use cochlear implants and 10 children with normal hearing. Adult listeners with normal hearing judged the intelligibility of the words in the BIT sentences, identified the PUP sentences as one of four grammatical or emotional moods (i.e., declarative, interrogative, happy, or sad), and rated the PUP sentences according to how well they thought the child conveyed the designated mood. Results Percent correct scores were higher for intelligibility than for prosody and higher for children with normal hearing than for children with cochlear implants. Declarative sentences were most readily identified and received the highest ratings by adult listeners; interrogative sentences were least readily identified and received the lowest ratings. Correlations between intelligibility and all mood identification and rating scores except declarative were not significant. Discussion The findings suggest that the development of speech intelligibility progresses ahead of prosody in both children with cochlear implants and children with normal hearing; however, children with normal hearing still perform better than children with cochlear implants on measures of intelligibility and prosody even after accounting for hearing age. Problems with interrogative intonation may be related to more general restrictions on rising intonation, and the correlation results indicate that intelligibility and sentence intonation may be relatively dissociated at these ages. PMID:22717120

  45. Speech intelligibility enhancement using hearing-aid array processing.

    PubMed

    Saunders, G H; Kates, J M

    1997-09-01

    Microphone arrays can improve speech recognition in noise for hearing-impaired listeners by suppressing interference coming from directions other than that of the desired signal. In a previous paper [J. M. Kates and M. R. Weiss, J. Acoust. Soc. Am. 99, 3138-3148 (1996)], several array-processing techniques were evaluated in two rooms using the AI-weighted array gain as the performance metric. The array consisted of five omnidirectional microphones having uniform 2.5-cm spacing, oriented in the endfire direction. In this paper, the speech intelligibility for two of the array processing techniques, delay-and-sum beamforming and superdirective processing, is evaluated for a group of hearing-impaired subjects. Speech intelligibility was measured using the speech reception threshold (SRT) for spondees and the speech intelligibility rating (SIR) for sentence materials. The array performance is compared with that for a single omnidirectional microphone and a single directional microphone having a cardioid response pattern. The SRT and SIR results show that the superdirective array processing was the most effective, followed by the cardioid microphone, the array using delay-and-sum beamforming, and the single omnidirectional microphone. The relative processing ratings do not appear to be strongly affected by the size of the room, and the SRT values determined using isolated spondees are similar to the SIR values produced from continuous discourse. PMID:9301060
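
    Of the two array techniques compared, delay-and-sum is the simpler: each channel is advanced so that sound from the look direction adds coherently. A sketch for the 5-microphone, 2.5-cm endfire geometry, using frequency-domain fractional delays; the superdirective variant (noise-field-optimized weights) is not shown.

```python
import numpy as np

C = 343.0  # speed of sound, m/s

def delay_and_sum(channels, fs, spacing=0.025):
    """channels: (n_mics, n_samples) from a uniform endfire line array,
    mic 0 nearest the source; mic m is advanced by m*spacing/C seconds."""
    n_mics, n = channels.shape
    spec = np.fft.rfft(channels, axis=1)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    out = np.zeros(spec.shape[1], dtype=complex)
    for m in range(n_mics):
        tau = m * spacing / C                      # time advance, seconds
        out += spec[m] * np.exp(2j * np.pi * freqs * tau)
    return np.fft.irfft(out / n_mics, n=n)
```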

  9. Speech intelligibility measure for vocal control of an automaton

    NASA Astrophysics Data System (ADS)

    Naranjo, Michel; Tsirigotis, Georgios

    1998-07-01

    Rapid progress in speech recognition suggests that vocal control systems will soon be widely established in production facilities. Communication between a human and a machine relies on technical devices that emit, or are subject to, substantial noise perturbations. The vocal interface thus introduces a new control problem: driving a deterministic automaton using uncertain information. The goal is to bring the automaton, by voice command, from an unknown initial state to exactly the ordered final state. The complete speech-processing procedure presented in this paper takes as input the temporal speech signal of a word and produces as output a recognised word labelled with an intelligibility index reflecting the recognition quality. In the first part, we present the psychoacoustic concepts essential for automatic calculation of the loudness of a speech signal. The architecture of a Time Delay Neural Network is presented in the second part, where we also give the recognition results. Fuzzy-subset theory, presented in the third part, allows a recognised word and its intelligibility index to be extracted simultaneously. In the fourth part, an anticipatory system models the control of a sequential machine, with a prediction phase and an updating phase that draw on data from the information system. A Bayesian decision strategy is used, with a criterion formed as a weighted sum of terms derived from information, minimum-path functions, and the speech intelligibility measure.

  10. Respiratory Dynamics and Speech Intelligibility in Speakers with Generalized Dystonia.

    ERIC Educational Resources Information Center

    LaBlance, Gary R.; Rutherford, David R.

    1991-01-01

    This study compared respiratory function during quiet breathing and monologue in six adult dystonic subjects and a control group of four neurologically intact adults. Dystonic subjects showed a faster breathing rate, a less rhythmic breathing pattern, decreased lung volume, and apnea-like periods. Decreased speech intelligibility was related to…

  11. Time-forward speech intelligibility in time-reversed rooms

    PubMed Central

    Longworth-Reed, Laricia; Brandewie, Eugene; Zahorik, Pavel

    2009-01-01

    The effects of time-reversed room acoustics on word recognition abilities were examined using virtual auditory space techniques, which allowed for temporal manipulation of the room acoustics independent of the speech source signals. Two acoustical conditions were tested: one in which room acoustics were simulated in a realistic time-forward fashion and one in which the room acoustics were reversed in time, causing reverberation and acoustic reflections to precede the direct-path energy. Significant decreases in speech intelligibility—from 89% on average to less than 25%—were observed between the time-forward and time-reversed rooms. This result is not predictable using standard methods for estimating speech intelligibility based on the modulation transfer function of the room. It may instead be due to increased degradation of onset information in the speech signals when room acoustics are time-reversed. PMID:19173377
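    The central manipulation, convolving the same speech signal with a room impulse response played either forwards or backwards, can be sketched in a few lines of Python. The following is a minimal illustration with synthetic placeholder signals; the study itself used virtual auditory space techniques rather than this toy response:

      import numpy as np

      fs = 16000
      rng = np.random.default_rng(0)
      speech = rng.standard_normal(fs)            # placeholder for one second of speech
      t = np.arange(int(0.5 * fs)) / fs
      rir = rng.standard_normal(t.size) * np.exp(-t / 0.1)   # decaying reverberant tail
      rir[0] = 1.0                                # direct-path energy at t = 0

      forward = np.convolve(speech, rir)          # reflections follow the direct sound
      backward = np.convolve(speech, rir[::-1])   # reflections precede the direct sound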

  12. Lexical effects on speech production and intelligibility in Parkinson's disease

    NASA Astrophysics Data System (ADS)

    Chiu, Yi-Fang

    Individuals with Parkinson's disease (PD) often have speech deficits that lead to reduced speech intelligibility. Previous research provides a rich database regarding the articulatory deficits associated with PD including restricted vowel space (Skodda, Visser, & Schlegel, 2011) and flatter formant transitions (Tjaden & Wilding, 2004; Walsh & Smith, 2012). However, few studies consider the effect of higher level structural variables of word usage frequency and the number of similar sounding words (i.e. neighborhood density) on lower level articulation or on listeners' perception of dysarthric speech. The purpose of the study is to examine the interaction of lexical properties and speech articulation as measured acoustically in speakers with PD and healthy controls (HC) and the effect of lexical properties on the perception of their speech. Individuals diagnosed with PD and age-matched healthy controls read sentences with words that varied in word frequency and neighborhood density. Acoustic analysis was performed to compare second formant transitions in diphthongs, an indicator of the dynamics of tongue movement during speech production, across different lexical characteristics. Young listeners transcribed the spoken sentences and the transcription accuracy was compared across lexical conditions. The acoustic results indicate that both PD and HC speakers adjusted their articulation based on lexical properties but the PD group had significant reductions in second formant transitions compared to HC. Both groups of speakers increased second formant transitions for words with low frequency and low density, but the lexical effect is diphthong dependent. The change in second formant slope was limited in the PD group when the required formant movement for the diphthong is small. The data from listeners' perception of the speech by PD and HC show that listeners identified high-frequency words with greater accuracy, suggesting the use of lexical knowledge during the perception of dysarthric speech.

  13. Speech Intelligibility of Cochlear-Implanted and Normal-Hearing Children

    PubMed Central

    Poursoroush, Sara; Ghorbani, Ali; Soleymani, Zahra; Kamali, Mohammd; Yousefi, Negin; Poursoroush, Zahra

    2015-01-01

    Introduction: Speech intelligibility, the ability to be understood verbally by listeners, is the gold standard for assessing the effectiveness of cochlear implantation. Thus, the goal of this study was to compare speech intelligibility between normal-hearing and cochlear-implanted children using the Persian intelligibility test. Materials and Methods: Twenty-six cochlear-implanted children aged 48–95 months, who had received 95–100 speech therapy sessions, were compared with 40 normal-hearing children aged 48–84 months. The average post-implantation time was 14.53 months. Speech intelligibility was assessed using the Persian sentence speech intelligibility test. Results: The mean score of the speech intelligibility test among cochlear-implanted children was 63.71% (standard deviation [SD], 1.06), compared with 100% among all normal-hearing children (P<0.001). No effects of age or gender on speech intelligibility were observed in these two groups at this range of ages (P>0.05). Conclusion: Speech intelligibility in the Persian language was poorer in cochlear-implanted children than in normal-hearing children. The differences in speech intelligibility between cochlear-implanted and normal-hearing children can be demonstrated with the Persian sentence speech intelligibility test. PMID:26568940

  14. SNR Loss: A new objective measure for predicting speech intelligibility of noise-suppressed speech

    PubMed Central

    Ma, Jianfen; Loizou, Philipos C.

    2010-01-01

    Most of the existing intelligibility measures do not account for the distortions present in processed speech, such as those introduced by speech-enhancement algorithms. In the present study, we propose three new objective measures that can be used for prediction of intelligibility of processed (e.g., via an enhancement algorithm) speech in noisy conditions. All three measures use a critical-band spectral representation of the clean and noise-suppressed signals and are based on the measurement of the SNR loss incurred in each critical band after the corrupted signal goes through a speech enhancement algorithm. The proposed measures are flexible in that they can provide different weights to the two types of spectral distortions introduced by enhancement algorithms, namely spectral attenuation and spectral amplification distortions. The proposed measures were evaluated with intelligibility scores obtained by normal-hearing listeners in 72 noisy conditions involving noise-suppressed speech (consonants and sentences) corrupted by four different maskers (car, babble, train and street interferences). Highest correlation (r=−0.85) with sentence recognition scores was obtained using a variant of the SNR loss measure that only included vowel/consonant transitions and weak consonant information. High correlation was maintained for all noise types, with a maximum correlation (r=−0.88) achieved in street noise conditions. PMID:21503274
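    As a rough illustration of the band-wise bookkeeping behind an SNR-loss-style measure (not the authors' exact formulation, which uses a critical-band auditory representation and separate weights for attenuation and amplification distortions), one can compare per-band input and output SNRs:

      import numpy as np

      def band_powers(x, fs, edges):
          # power of x in each frequency band delimited by edges (Hz)
          spec = np.abs(np.fft.rfft(x)) ** 2
          freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
          return np.array([spec[(freqs >= lo) & (freqs < hi)].sum()
                           for lo, hi in zip(edges[:-1], edges[1:])])

      def snr_loss(clean, noise, enhanced, fs, edges):
          p_clean = band_powers(clean, fs, edges)
          p_noise = band_powers(noise, fs, edges)
          p_resid = band_powers(enhanced - clean, fs, edges)  # distortion treated as noise
          snr_in = 10 * np.log10(p_clean / np.maximum(p_noise, 1e-12))
          snr_out = 10 * np.log10(p_clean / np.maximum(p_resid, 1e-12))
          return snr_in - snr_out      # per-band SNR lost through processing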

  15. Improvement of intelligibility of ideal binary-masked noisy speech by adding background noise.

    PubMed

    Cao, Shuyang; Li, Liang; Wu, Xihong

    2011-04-01

    When a target-speech/masker mixture is processed with the signal-separation technique known as the ideal binary mask (IBM), the intelligibility of target speech is remarkably improved in both normal-hearing listeners and hearing-impaired listeners. Intelligibility of speech can also be improved by filling in speech gaps with un-modulated broadband noise. This study investigated whether intelligibility of target speech in the IBM-treated target-speech/masker mixture can be further improved by adding a broadband-noise background. The results of this study show that following the IBM manipulation, which remarkably released target speech from speech-spectrum noise, foreign-speech, or native-speech masking (experiment 1), adding a broadband-noise background with the signal-to-noise ratio no less than 4 dB significantly improved intelligibility of target speech when the masker was either noise (experiment 2) or speech (experiment 3). The results suggest that because the added noise background makes the silent regions in the time-frequency domain of the IBM-treated target-speech/masker mixture shallower, the abruptness of transient changes in the mixture is smoothed and the perceived continuity of target-speech components is enhanced, leading to improved target-speech intelligibility. The findings are useful for advancing computational auditory scene analysis, hearing-aid/cochlear-implant designs, and understanding of speech perception under "cocktail-party" conditions. PMID:21476677
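    A minimal sketch of the two ingredients combined in this study: an ideal binary mask computed with oracle access to the separate target and masker signals, plus a low-level broadband noise added to fill the masked gaps. The frame sizes, the -6 dB local SNR criterion, and the noise level are illustrative choices rather than the study's parameters:

      import numpy as np

      def stft(x, n_fft=512, hop=256):
          win = np.hanning(n_fft)
          n = 1 + (len(x) - n_fft) // hop
          return np.array([np.fft.rfft(win * x[i*hop : i*hop + n_fft]) for i in range(n)])

      def istft(X, n_fft=512, hop=256):
          # Hann analysis at 50% overlap approximately satisfies overlap-add
          out = np.zeros(hop * (len(X) - 1) + n_fft)
          for k, frame in enumerate(X):
              out[k*hop : k*hop + n_fft] += np.fft.irfft(frame, n_fft)
          return out

      def ibm_with_noise_floor(target, masker, lc_db=-6.0, floor_db=-20.0):
          T, M = stft(target), stft(masker)          # oracle access to both signals
          local_snr_db = 20 * (np.log10(np.abs(T) + 1e-12) - np.log10(np.abs(M) + 1e-12))
          mask = local_snr_db > lc_db                # keep target-dominated T-F units
          out = istft((T + M) * mask)                # masked mixture, resynthesized
          rng = np.random.default_rng(0)
          return out + 10 ** (floor_db / 20) * rng.standard_normal(len(out))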

  16. Role of binaural hearing in speech intelligibility and spatial release from masking using vocoded speech.

    PubMed

    Garadat, Soha N; Litovsky, Ruth Y; Yu, Gongqiang; Zeng, Fan-Gang

    2009-11-01

    A cochlear implant vocoder was used to evaluate relative contributions of spectral and binaural temporal fine-structure cues to speech intelligibility. In Study I, stimuli were vocoded, and then convolved through head related transfer functions (HRTFs) to remove speech temporal fine structure but preserve the binaural temporal fine-structure cues. In Study II, the order of processing was reversed to remove both speech and binaural temporal fine-structure cues. Speech reception thresholds (SRTs) were measured adaptively in quiet, and with interfering speech, for unprocessed and vocoded speech (16, 8, and 4 frequency bands), under binaural or monaural (right-ear) conditions. Under binaural conditions, as the number of bands decreased, SRTs increased. With decreasing number of frequency bands, greater benefit from spatial separation of target and interferer was observed, especially in the 8-band condition. The present results demonstrate a strong role of the binaural cues in spectrally degraded speech, when the target and interfering speech are more likely to be confused. The nearly normal binaural benefits under present simulation conditions and the lack of order of processing effect further suggest that preservation of binaural cues is likely to improve performance in bilaterally implanted recipients. PMID:19894832
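    A bare-bones version of the kind of tone (sine-carrier) vocoder used in such cochlear implant simulations is sketched below; the band edges, filter orders, and 50 Hz envelope cutoff are illustrative, and the HRTF convolution stage is omitted:

      import numpy as np
      from scipy.signal import butter, filtfilt

      def tone_vocoder(x, fs, n_bands=8, f_lo=100.0, f_hi=7000.0):
          edges = np.geomspace(f_lo, f_hi, n_bands + 1)      # log-spaced analysis bands
          b_env, a_env = butter(2, 50.0, fs=fs)              # 50 Hz envelope smoother
          t = np.arange(len(x)) / fs
          out = np.zeros(len(x))
          for lo, hi in zip(edges[:-1], edges[1:]):
              b, a = butter(4, [lo, hi], btype='band', fs=fs)
              band = filtfilt(b, a, x)                       # analysis band
              env = np.maximum(filtfilt(b_env, a_env, np.abs(band)), 0.0)
              out += env * np.sin(2 * np.pi * np.sqrt(lo * hi) * t)  # sine carrier
          return out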

  17. Assessment of Intelligibility Using Children's Spontaneous Speech: Methodological Aspects

    ERIC Educational Resources Information Center

    Lagerberg, Tove B.; Åsberg, Jakob; Hartelius, Lena; Persson, Christina

    2014-01-01

    Background: Intelligibility is a speaker's ability to convey a message to a listener. Including an assessment of intelligibility is essential in both research and clinical work relating to individuals with communication disorders due to speech impairment. Assessment of the intelligibility of spontaneous speech can be used as an overall…

  18. Speech Intelligibility of Pediatric Cochlear Implant Recipients with 7 Years of Device Experience.

    ERIC Educational Resources Information Center

    Peng, Shu-Chen; Spencer, Linda J.; Tomblin, J. Bruce

    2004-01-01

    Speech intelligibility of 24 prelingually deaf pediatric cochlear implant (CI) recipients with 84 months of device experience was investigated. Each CI participant's speech samples were judged by a panel of 3 listeners. Intelligibility scores were calculated as the average of the 3 listeners' responses. The average write-down intelligibility score…

  19. Perceptual and production variables in explicating interlanguage speech intelligibility benefit

    NASA Astrophysics Data System (ADS)

    Shah, Amee P.; Vavva, Zoi

    2005-04-01

    This study investigates the importance of the degree of similarity or difference between the language backgrounds of speakers and listeners, as it affects intelligibility judgments of foreign-accented speech (Bent and Bradlow, 2003). It seeks to clarify the distinction between matched and mismatched listening conditions, in the context of the overarching question of whether auditory exposure to a language alone, without corresponding proficiency in producing that language, can provide a listening advantage. Specifically, do listeners understand accented English spoken by natives of a language they have been exposed to better than listeners without that exposure? Greek-accented English speakers (and native monolingual English speakers) were judged for their speech intelligibility by four groups of listeners (n=10 each): native Greek speakers (matched), Greek-Americans (matched only through auditory exposure to Greek, without corresponding spoken proficiency), native monolingual American-English speakers (unmatched), and a mixed group (mismatched). Pilot data show that the intelligibility judgments of the Greek-American listeners fall between those of the native Greeks and those of both the American-English and mixed groups. Further data collection is under way; the results bear important theoretical and clinical implications.

  20. Speech intelligibility in the community mosques of Dhaka City

    NASA Astrophysics Data System (ADS)

    Najmul Imam, Sheikh Muhammad

    2002-11-01

    A mosque serves a Muslim community through different religious activities such as congregational prayers, recitation, and theological education. Speech in a mosque is usually produced by the unamplified voice, though sound amplification systems are also used. Since no musical instrument is used in any liturgy, a mosque involves only speech acoustics. The community mosques of Dhaka city, the densely populated capital of Bangladesh, are usually designed and constructed by lay people motivated by religious virtue; acoustical design consultancy is almost never sought. As an obvious consequence, speech intelligibility is commonly poor in these mosques, except in those saved by their smaller volume or by other parameters arrived at by chance. In a very few cases a trial-and-error method is applied to the problem, but in most cases it remains unsolved, leaving the devotees in continual difficulty. This paper identifies the type and magnitude of the prevailing speech intelligibility problems in these community mosques through instrumental measurements and a questionnaire survey. It also establishes a rationale and hypotheses for further research, which will propose acoustical design parameters for mosques in Dhaka city in particular and in Bangladesh in general.

  1. Søren Buus' contribution to speech intelligibility prediction

    NASA Astrophysics Data System (ADS)

    Müsch, Hannes; Florentine, Mary

    2005-04-01

    In addition to his work in psychoacoustics, Søren Buus also contributed to the field of speech intelligibility prediction by developing a model that predicts the results of speech recognition tests [H. Müsch and S. Buus, J. Acoust. Soc. Am. 109, 2896-2909 (2001)]. The model was successful in test conditions that are outside the scope of the Articulation Index. It builds on Green and Birdsall's concept of describing a speech recognition task as selecting one of several response alternatives [in D. Green and J. Swets, Signal Detection Theory (1966), pp. 609-619], and on Durlach et al.'s model for discriminating broadband sounds [J. Acoust. Soc. Am. 80, 63-72 (1986)]. Experimental evidence suggests that listeners can extract redundant, independent, or synergistic information from spectrally distinct speech bands. One of the main accomplishments of the model is to reflect this ability. The model also provides for a measure of linguistic entropy to enter the intelligibility prediction. Recent model development has focused on investigating whether this measure, the cognitive noise, can account for the effects of semantic and syntactic context. This presentation will review the model and present new model predictions. [Work supported by NIH grant R01DC00187.]

  2. Predicting the intelligibility of deaf children's speech from acoustic measures

    NASA Astrophysics Data System (ADS)

    Uchanski, Rosalie M.; Geers, Ann E.; Brenner, Christine M.; Tobey, Emily A.

    2001-05-01

    A weighted combination of speech-acoustic measures may provide an objective assessment of speech intelligibility in deaf children that could be used to evaluate the benefits of sensory aids and rehabilitation programs. This investigation compared the accuracy of two different approaches, multiple linear regression and a simple neural net. These two methods were applied to identical sets of acoustic measures, including both segmental (e.g., voice-onset times of plosives, spectral moments of fricatives, second formant frequencies of vowels) and suprasegmental measures (e.g., sentence duration, number and frequency of intersentence pauses). These independent variables were obtained from digitized recordings of deaf children's imitations of 11 simple sentences. The dependent measure was the percentage of spoken words from the 36 McGarr Sentences understood by groups of naive listeners. The two predictive methods were trained on speech measures obtained from 123 out of 164 8- and 9-year-old deaf children who used cochlear implants. Then, predictions were obtained using speech measures from the remaining 41 children. Preliminary results indicate that multiple linear regression is a better predictor of intelligibility than the neural net, accounting for 79% as opposed to 65% of the variance in the data. [Work supported by NIH.]
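    The regression arm of this comparison is straightforward to sketch. In the snippet below, X_train and y_train are placeholders for the acoustic measures and listener-derived intelligibility percentages described above; the neural-net arm is omitted:

      import numpy as np

      def fit_predict(X_train, y_train, X_test):
          # ordinary least squares with an intercept column
          A = np.column_stack([np.ones(len(X_train)), X_train])
          coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
          return np.column_stack([np.ones(len(X_test)), X_test]) @ coef

      def r_squared(y_true, y_pred):
          # proportion of variance accounted for, the figure quoted above
          ss_res = np.sum((y_true - y_pred) ** 2)
          ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
          return 1.0 - ss_res / ss_tot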

  3. Effects of Instantaneous Multiband Dynamic Compression on Speech Intelligibility

    NASA Astrophysics Data System (ADS)

    Herzke, Tobias; Hohmann, Volker

    2005-12-01

    The recruitment phenomenon, that is, the reduced dynamic range between threshold and uncomfortable level, is attributed to the loss of instantaneous dynamic compression on the basilar membrane. Despite this, hearing aids commonly use slow-acting dynamic compression for its compensation, because this was found to be the most successful strategy in terms of speech quality and intelligibility rehabilitation. Former attempts to use fast-acting compression gave ambiguous results, raising the question as to whether auditory-based recruitment compensation by instantaneous compression is in principle applicable in hearing aids. This study thus investigates instantaneous multiband dynamic compression based on an auditory filterbank. Instantaneous envelope compression is performed in each frequency band of a gammatone filterbank, which provides a combination of time and frequency resolution comparable to the normal healthy cochlea. The gain characteristics used for dynamic compression are deduced from categorical loudness scaling. In speech intelligibility tests, the instantaneous dynamic compression scheme was compared against a linear amplification scheme, which used the same filterbank for frequency analysis, but employed constant gain factors that restored the sound level for medium perceived loudness in each frequency band. In subjective comparisons, five of nine subjects preferred the linear amplification scheme and would not accept the instantaneous dynamic compression in hearing aids. Four of nine subjects did not perceive any quality differences. A sentence intelligibility test in noise (Oldenburg sentence test) showed little to no negative effects of the instantaneous dynamic compression, compared to linear amplification. A word intelligibility test in quiet (one-syllable rhyme test) showed that the subjects benefit from the larger amplification at low levels provided by instantaneous dynamic compression. Further analysis showed that the increase in intelligibility
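    The core idea, a memoryless gain applied sample by sample to each band's envelope, can be sketched as follows. A Butterworth filterbank and a simple power-law gain stand in for the study's gammatone filterbank and loudness-scaling-derived gain characteristics:

      import numpy as np
      from scipy.signal import butter, filtfilt, hilbert

      def compress_band(band, exponent=0.6, ref=1.0):
          env = np.abs(hilbert(band))                               # instantaneous envelope
          gain = (np.maximum(env, 1e-6) / ref) ** (exponent - 1.0)  # exponent < 1 compresses
          return band * gain

      def multiband_compress(x, fs, edges=(100.0, 500.0, 1500.0, 4000.0)):
          out = np.zeros(len(x))
          for lo, hi in zip(edges[:-1], edges[1:]):
              b, a = butter(4, [lo, hi], btype='band', fs=fs)
              out += compress_band(filtfilt(b, a, x))
          return out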

  4. Comparing the information conveyed by envelope modulation for speech intelligibility, speech quality, and music quality.

    PubMed

    Kates, James M; Arehart, Kathryn H

    2015-10-01

    This paper uses mutual information to quantify the relationship between envelope modulation fidelity and perceptual responses. Data from several previous experiments that measured speech intelligibility, speech quality, and music quality are evaluated for normal-hearing and hearing-impaired listeners. A model of the auditory periphery is used to generate envelope signals, and envelope modulation fidelity is calculated using the normalized cross-covariance of the degraded signal envelope with that of a reference signal. Two procedures are used to describe the envelope modulation: (1) modulation within each auditory frequency band and (2) spectro-temporal processing that analyzes the modulation of spectral ripple components fit to successive short-time spectra. The results indicate that low modulation rates provide the highest information for intelligibility, while high modulation rates provide the highest information for speech and music quality. The low-to-mid auditory frequencies are most important for intelligibility, while mid frequencies are most important for speech quality and high frequencies are most important for music quality. Differences between the spectral ripple components used for the spectro-temporal analysis were not significant in five of the six experimental conditions evaluated. The results indicate that different modulation-rate and auditory-frequency weights may be appropriate for indices designed to predict different types of perceptual relationships. PMID:26520329
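    The fidelity metric underlying this analysis, the normalized cross-covariance between degraded and reference envelopes, reduces to a correlation coefficient. The sketch below uses a single broadband Hilbert envelope rather than the auditory-model band envelopes used in the paper:

      import numpy as np
      from scipy.signal import hilbert

      def envelope_fidelity(reference, degraded):
          e_ref = np.abs(hilbert(reference))       # reference intensity envelope
          e_deg = np.abs(hilbert(degraded))        # degraded-signal envelope
          # normalized cross-covariance at lag zero
          return np.corrcoef(e_ref, e_deg)[0, 1]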

  5. An Alternative to the Computational Speech Intelligibility Index Estimates: Direct Measurement of Rectangular Passband Intelligibilities

    ERIC Educational Resources Information Center

    Warren, Richard M.; Bashford, James A., Jr.; Lenz, Peter W.

    2011-01-01

    The need for determining the relative intelligibility of passbands spanning the speech spectrum has been addressed by publications of the American National Standards Institute (ANSI). When the Articulation Index (AI) standard (ANSI, S3.5, 1969, R1986) was developed, available filters confounded passband and slope contributions. The AI procedure…

  6. Preschool speech intelligibility and vocabulary skills predict long-term speech and language outcomes following cochlear implantation in early childhood.

    PubMed

    Castellanos, Irina; Kronenberger, William G; Beer, Jessica; Henning, Shirley C; Colson, Bethany G; Pisoni, David B

    2014-07-01

    Speech and language measures during grade school predict adolescent speech-language outcomes in children who receive cochlear implants (CIs), but no research has examined whether speech and language functioning at even younger ages is predictive of long-term outcomes in this population. The purpose of this study was to examine whether early preschool measures of speech and language performance predict speech-language functioning in long-term users of CIs. Early measures of speech intelligibility and receptive vocabulary (obtained during preschool ages of 3-6 years) in a sample of 35 prelingually deaf, early-implanted children predicted speech perception, language, and verbal working memory skills up to 18 years later. Age of onset of deafness and age at implantation added additional variance to preschool speech intelligibility in predicting some long-term outcome scores, but the relationship between preschool speech-language skills and later speech-language outcomes was not significantly attenuated by the addition of these hearing history variables. These findings suggest that speech and language development during the preschool years is predictive of long-term speech and language functioning in early-implanted, prelingually deaf children. As a result, measures of speech-language functioning at preschool ages can be used to identify and adjust interventions for very young CI users who may be at long-term risk for suboptimal speech and language outcomes. PMID:23998347

  7. Evaluation of adult aphasics with the Pediatric Speech Intelligibility test.

    PubMed

    Jerger, S; Oliver, T A; Martin, R C

    1990-04-01

    Results of conventional adult speech audiometry may be compromised by the presence of speech/language disorders, such as aphasia. The purpose of this project was to determine the efficacy of the speech intelligibility materials and techniques developed for young children in evaluating central auditory function in aphasic adults. Eight adult aphasics were evaluated with the Pediatric Speech Intelligibility (PSI) test, a picture-pointing approach that was carefully developed to be relatively insensitive to linguistic-cognitive skills and relatively sensitive to auditory-perceptual function. Results on message-to-competition ratio (MCR) functions or performance-intensity (PI) functions were abnormal in all subjects. Most subjects served as their own controls, showing normal performance on one ear coupled with abnormal performance on the other ear. The patterns of abnormalities were consistent with the patterns seen (1) on conventional speech audiometry in brain-lesioned adults without aphasia and (2) on the PSI test in brain-lesioned children without aphasia. An exception to this general observation was an atypical pattern of abnormality on PI-function testing in the subgroup of nonfluent aphasics. The nonfluent subjects showed substantially poorer word-max scores than sentence-max scores, a pattern seen previously in only one other patient group, namely young children with recurrent otitis media. The unusually depressed word-max abnormality was not meaningfully related to clinical diagnostic data regarding the degree of hearing loss and the location and severity of the lesions or to experimental data regarding the integrity of phonologic processing abilities. The observations of ear-specific and condition-specific abnormalities suggest that the linguistically- and cognitively-simplified PSI test may be useful in the evaluation of auditory-specific deficits in the aphasic adult. PMID:2132591

  8. Using the Speech Transmission Index for predicting non-native speech intelligibility

    NASA Astrophysics Data System (ADS)

    van Wijngaarden, Sander J.; Bronkhorst, Adelbert W.; Houtgast, Tammo; Steeneken, Herman J. M.

    2004-03-01

    While the Speech Transmission Index (STI) is widely applied for prediction of speech intelligibility in room acoustics and telecommunication engineering, it is unclear how to interpret STI values when non-native talkers or listeners are involved. Based on subjectively measured psychometric functions for sentence intelligibility in noise, for populations of native and non-native communicators, a correction function for the interpretation of the STI is derived. This function is applied to determine the appropriate STI ranges with qualification labels ("bad" to "excellent") for specific populations of non-natives. The correction function is derived by relating the non-native psychometric function to the native psychometric function by a single parameter (ν). For listeners, the ν parameter is found to be highly correlated with linguistic entropy. It is shown that the proposed correction function is also valid for conditions featuring bandwidth limiting and reverberation.
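    The published correction has a specific functional form; the sketch below only illustrates the general idea of relating native and non-native psychometric functions through a single parameter, here applied as a proficiency-dependent rescaling of the effective STI. All parameter values are invented:

      import numpy as np

      def logistic(sti, midpoint=0.45, slope=12.0):
          return 1.0 / (1.0 + np.exp(-slope * (sti - midpoint)))

      def non_native(sti, nu):
          # nu < 1 models lower proficiency: the same physical STI conveys
          # less usable information, shifting and shallowing the curve
          return logistic(nu * np.asarray(sti))

      sti = np.linspace(0.0, 1.0, 11)
      print(np.round(logistic(sti), 2))        # native sentence intelligibility
      print(np.round(non_native(sti, 0.8), 2)) # non-native at the same STI values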

  9. How broadband speech may avoid neural firing rate saturation at high intensities and maintain intelligibility

    PubMed Central

    Bashford, James A.; Warren, Richard M.; Lenz, Peter W.

    2015-01-01

    Three experiments examined the intelligibility enhancement produced when noise bands flank high intensity rectangular band speech. When white noise flankers were added to the speech individually at a low spectrum level (−30 dB relative to the speech) only the higher frequency flanker produced a significant intelligibility increase (i.e., recovery from intelligibility rollover). However, the lower-frequency flanking noise did produce an equivalent intelligibility increase when its spectrum level was increased by 10 dB. This asymmetrical intensity requirement, and other results, support previous suggestions that intelligibility loss at high intensities is reduced by lateral inhibition in the cochlear nuclei. PMID:25920887

  10. Formant trajectory characteristics in speakers with dysarthria and homogeneous speech intelligibility scores: Further data

    NASA Astrophysics Data System (ADS)

    Kim, Yunjung; Weismer, Gary; Kent, Ray D.

    2005-09-01

    In previous work [J. Acoust. Soc. Am. 117, 2605 (2005)], we reported on formant trajectory characteristics of a relatively large number of speakers with dysarthria and near-normal speech intelligibility. The purpose of that analysis was to begin a documentation of the variability, within relatively homogeneous speech-severity groups, of acoustic measures commonly used to predict across-speaker variation in speech intelligibility. In that study we found that even with near-normal speech intelligibility (90%-100%), many speakers had reduced formant slopes for some words and distributional characteristics of acoustic measures that were different than values obtained from normal speakers. In the current report we extend those findings to a group of speakers with dysarthria with somewhat poorer speech intelligibility than the original group. Results are discussed in terms of the utility of certain acoustic measures as indices of speech intelligibility, and as explanatory data for theories of dysarthria. [Work supported by NIH Award R01 DC00319.]

  11. Traffic noise annoyance and speech intelligibility in persons with normal and persons with impaired hearing

    NASA Astrophysics Data System (ADS)

    Aniansson, G.; Björkman, M.

    1983-05-01

    Annoyance ratings in speech intelligibility tests at 45 dB(A) and 55 dB(A) traffic noise were investigated in a laboratory study. Subjects were chosen according to their hearing acuity to be representative of 70-year-old men and women, and of noise-induced hearing losses typical for a great number of industrial workers. These groups were compared with normal hearing subjects of the same sex and, when possible, the same age. The subjects rated their annoyance on an open 100 mm scale. Significant correlations were found between annoyance expressed in millimetres and speech intelligibility in percent when all subjects were taken as one sample. Speech intelligibility was also calculated from physical measurements of speech and noise by using the articulation index method. Observed and calculated speech intelligibility scores are compared and discussed. Also treated is the estimation of annoyance by traffic noise at moderate noise levels via speech intelligibility scores.

  12. Evaluation of the importance of time-frequency contributions to speech intelligibility in noise.

    PubMed

    Yu, Chengzhu; Wójcicki, Kamil K; Loizou, Philipos C; Hansen, John H L; Johnson, Michael T

    2014-05-01

    Recent studies on binary masking techniques make the assumption that each time-frequency (T-F) unit contributes an equal amount to the overall intelligibility of speech. The present study demonstrated that the importance of each T-F unit to speech intelligibility varies in accordance with speech content. Specifically, T-F units are categorized into two classes, speech-present T-F units and speech-absent T-F units. Results indicate that the importance of each speech-present T-F unit to speech intelligibility is highly related to the loudness of its target component, while the importance of each speech-absent T-F unit varies according to the loudness of its masker component. Two types of mask errors are also considered, which include miss and false alarm errors. Consistent with previous work, false alarm errors are shown to be more harmful to speech intelligibility than miss errors when the mixture signal-to-noise ratio (SNR) is below 0 dB. However, the relative importance between the two types of error is conditioned on the SNR level of the input speech signal. Based on these observations, a mask-based objective measure, the loudness weighted hit-false, is proposed for predicting speech intelligibility. The proposed objective measure shows significantly higher correlation with intelligibility compared to two existing mask-based objective measures. PMID:24815280
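    A toy version of a loudness-weighted hit/false-alarm computation of the kind proposed here is shown below; the masks and weights are placeholders, and the exact published formulation may differ:

      import numpy as np

      def weighted_hit_false(ideal, estimated, target_loud, masker_loud):
          hit = ideal & estimated            # speech-present units correctly retained
          fa = ~ideal & estimated            # speech-absent units wrongly retained
          h = np.sum(target_loud[hit]) / max(np.sum(target_loud[ideal]), 1e-12)
          f = np.sum(masker_loud[fa]) / max(np.sum(masker_loud[~ideal]), 1e-12)
          return h - f                       # higher values predict better intelligibility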

  13. Working memory and intelligibility of hearing-aid processed speech.

    PubMed

    Souza, Pamela E; Arehart, Kathryn H; Shen, Jing; Anderson, Melinda; Kates, James M

    2015-01-01

    Previous work suggested that individuals with low working memory capacity may be at a disadvantage in adverse listening environments, including situations with background noise or substantial modification of the acoustic signal. This study explored the relationship between patient factors (including working memory capacity) and intelligibility and quality of modified speech for older individuals with sensorineural hearing loss. The modification was created using a combination of hearing aid processing [wide-dynamic range compression (WDRC) and frequency compression (FC)] applied to sentences in multitalker babble. The extent of signal modification was quantified via an envelope fidelity index. We also explored the contribution of components of working memory by including measures of processing speed and executive function. We hypothesized that listeners with low working memory capacity would perform more poorly than those with high working memory capacity across all situations, and would also be differentially affected by high amounts of signal modification. Results showed a significant effect of working memory capacity for speech intelligibility, and an interaction between working memory, amount of hearing loss and signal modification. Signal modification was the major predictor of quality ratings. These data add to the literature on hearing-aid processing and working memory by suggesting that the working memory-intelligibility effects may be related to aggregate signal fidelity, rather than to the specific signal manipulation. They also suggest that for individuals with low working memory capacity, sensorineural loss may be most appropriately addressed with WDRC and/or FC parameters that maintain the fidelity of the signal envelope. PMID:25999874

  14. Evaluating the role of spectral and envelope characteristics in the intelligibility advantage of clear speech

    PubMed Central

    Krause, Jean C.; Braida, Louis D.

    2009-01-01

    In adverse listening conditions, talkers can increase their intelligibility by speaking clearly [Picheny, M.A., et al. (1985). J. Speech Hear. Res. 28, 96–103; Payton, K. L., et al. (1994). J. Acoust. Soc. Am. 95, 1581–1592]. This modified speaking style, known as clear speech, is typically spoken more slowly than conversational speech [Picheny, M. A., et al. (1986). J. Speech Hear. Res. 29, 434–446; Uchanski, R. M., et al. (1996). J. Speech Hear. Res. 39, 494–509]. However, talkers can produce clear speech at normal rates (clear/normal speech) with training [Krause, J. C., and Braida, L. D. (2002). J. Acoust. Soc. Am. 112, 2165–2172] suggesting that clear speech has some inherent acoustic properties, independent of rate, that contribute to its improved intelligibility. Identifying these acoustic properties could lead to improved signal processing schemes for hearing aids. Two global-level properties of clear/normal speech that appear likely to be associated with improved intelligibility are increased energy in the 1000–3000-Hz range of long-term spectra and increased modulation depth of low-frequency modulations of the intensity envelope [Krause, J. C., and Braida, L. D. (2004). J. Acoust. Soc. Am. 115, 362–378]. In an attempt to isolate the contributions of these two properties to intelligibility, signal processing transformations were developed to manipulate each of these aspects of conversational speech independently. Results of intelligibility testing with hearing-impaired listeners and normal-hearing listeners in noise suggest that (1) increasing energy between 1000 and 3000 Hz does not fully account for the intelligibility benefit of clear/normal speech, and (2) simple filtering of the intensity envelope is generally detrimental to intelligibility. While other manipulations of the intensity envelope are required to determine conclusively the role of this factor in intelligibility, it is also likely that additional properties important for

  15. Quantifying the intelligibility of speech in noise for non-native talkers

    NASA Astrophysics Data System (ADS)

    van Wijngaarden, Sander J.; Steeneken, Herman J. M.; Houtgast, Tammo

    2002-12-01

    The intelligibility of speech pronounced by non-native talkers is generally lower than that of speech pronounced by native talkers, especially under adverse conditions such as high levels of background noise. The effect of foreign accent on speech intelligibility was investigated quantitatively through a series of experiments involving the voices of 15 talkers, differing in language background, age of second-language (L2) acquisition, and experience with the target language (Dutch). Overall speech intelligibility of L2 talkers in noise is predicted with reasonable accuracy from accent ratings by native listeners, as well as from the self-ratings for proficiency of L2 talkers. For non-native speech, unlike native speech, the intelligibility of short messages (sentences) cannot be fully predicted by phoneme-based intelligibility tests. Although incorrect recognition of specific phonemes certainly occurs as a result of foreign accent, the effect of reduced phoneme recognition on the intelligibility of sentences may range from severe to virtually absent, depending on (for instance) the speech-to-noise ratio. Objective acoustic-phonetic analyses of accented speech were also carried out, but satisfactory overall predictions of speech intelligibility could not be obtained with relatively simple acoustic-phonetic measures.

  16. Speech intelligibility of children with cochlear implants, tactile aids, or hearing aids.

    PubMed

    Osberger, M J; Maso, M; Sam, L K

    1993-02-01

    Speech intelligibility was measured in 31 children who used the 3M/House single-channel implant (n = 12), the Nucleus 22-Channel Cochlear Implant System (n = 15), or the Tactaid II + two-channel vibrotactile aid (n = 4). The subjects were divided into subgroups based on age at onset of deafness (early or late). The speech intelligibility of the experimental subjects was compared to that of children who were profoundly hearing impaired who used conventional hearing aids (n = 12) or no sensory aid (n = 2). The subjects with early onset of deafness who received their single- or multichannel cochlear implant before age 10 demonstrated the highest speech intelligibility, whereas subjects who did not receive their device until after age 10 had the poorest speech intelligibility. There was no obvious difference in the speech intelligibility scores of these subjects as a function of type of device (implant or tactile aid). On the average, the postimplant or tactile aid speech intelligibility of the subjects with early onset of deafness was similar to that of hearing aid users with hearing levels between 100 and 110 dB HL and limited hearing in the high frequencies. The speech intelligibility of subjects with late onset of deafness showed marked deterioration after the onset of deafness with relatively large improvements by most subjects after they received a single- or multichannel implant. The one subject with late onset of deafness who used a tactile aid showed no improvement in speech intelligibility. PMID:8450658

  17. Speech Intelligibility of Children with Cochlear Implants, Tactile, or Hearing Aids.

    ERIC Educational Resources Information Center

    Osberger, Mary Joe; And Others

    1993-01-01

    This study found that children with early onset of deafness who received single-channel or multichannel cochlear implants before age 10 demonstrated higher speech intelligibility than children receiving their device after age 10. There was no obvious difference in speech intelligibility scores as a function of type of device (implant or tactile…

  18. The Relationship between Measures of Hearing Loss and Speech Intelligibility in Young Deaf Children.

    ERIC Educational Resources Information Center

    Musselman, Carol Reich

    1990-01-01

    This study of 121 young deaf children identified 3 distinct groups: children with losses of 70-89 decibels developed some intelligible speech, children with losses of 90-104 decibels exhibited considerable variability, and children with losses above 105 decibels developed little intelligible speech. The unaided hearing threshold level was the best…

  1. Effects of Audio-Visual Information on the Intelligibility of Alaryngeal Speech

    ERIC Educational Resources Information Center

    Evitts, Paul M.; Portugal, Lindsay; Van Dine, Ami; Holler, Aline

    2010-01-01

    Background: There is minimal research on the contribution of visual information on speech intelligibility for individuals with a laryngectomy (IWL). Aims: The purpose of this project was to determine the effects of mode of presentation (audio-only, audio-visual) on alaryngeal speech intelligibility. Method: Twenty-three naive listeners were…

  2. Speech-in-noise enhancement using amplification and dynamic range compression controlled by the speech intelligibility index.

    PubMed

    Schepker, Henning; Rennies, Jan; Doclo, Simon

    2015-11-01

    In many speech communication applications, such as public address systems, speech is degraded by additive noise, leading to reduced speech intelligibility. In this paper a pre-processing algorithm is proposed that is capable of increasing speech intelligibility under an equal-power constraint. The proposed AdaptDRC algorithm comprises two time- and frequency-dependent stages, i.e., an amplification stage and a dynamic range compression stage that are both dependent on the Speech Intelligibility Index (SII). Experiments using two objective measures, namely, the extended SII and the short-time objective intelligibility measure (STOI), and a formal listening test were conducted to compare the AdaptDRC algorithm with a modified version of a recently proposed algorithm in three different noise conditions (stationary car noise and speech-shaped noise and non-stationary cafeteria noise). While the objective measures indicate a similar performance for both algorithms, results from the formal listening test indicate that for the two stationary noises both algorithms lead to statistically significant improvements in speech intelligibility and for the non-stationary cafeteria noise only the proposed AdaptDRC algorithm leads to statistically significant improvements. A comparison of both objective measures and results from the listening test shows high correlations, although, in general, the performance of both algorithms is overestimated. PMID:26627746
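    Only the control idea is sketched below: estimate per-band audibility with an SII-style clipped SNR, boost the least audible bands, and renormalize so that total power is unchanged. The band importances and gain rule are placeholders, not the published AdaptDRC stages:

      import numpy as np

      def equal_power_boost(speech_bands, noise_bands, importance):
          snr = 10 * np.log10(speech_bands / noise_bands)      # per-band SNR in dB
          audibility = np.clip((snr + 15.0) / 30.0, 0.0, 1.0)  # SII-style band audibility
          gain = 1.0 + importance * (1.0 - audibility)         # boost the masked bands
          boosted = speech_bands * gain ** 2                   # gain applied to band power
          return boosted * speech_bands.sum() / boosted.sum()  # equal-power constraint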

  3. Minimal Pair Distinctions and Intelligibility in Preschool Children with and without Speech Sound Disorders

    ERIC Educational Resources Information Center

    Hodge, Megan M.; Gotzke, Carrie L.

    2011-01-01

    Listeners' identification of young children's productions of minimally contrastive words and predictive relationships between accurately identified words and intelligibility scores obtained from a 100-word spontaneous speech sample were determined for 36 children with typically developing speech (TDS) and 36 children with speech sound disorders…

  4. Speech intelligibility and speech quality of modified loudspeaker announcements examined in a simulated aircraft cabin.

    PubMed

    Pennig, Sibylle; Quehl, Julia; Wittkowski, Martin

    2014-01-01

    Acoustic modifications of loudspeaker announcements were investigated in a simulated aircraft cabin to improve passengers' speech intelligibility and quality of communication in this specific setting. Four experiments with 278 participants in total were conducted in an acoustic laboratory using a standardised speech test and subjective rating scales. In experiments 1 and 2 the sound pressure level (SPL) of the announcements was varied (ranging from 70 to 85 dB(A)). Experiments 3 and 4 focused on frequency modification (octave bands) of the announcements. All studies used a background noise with the same SPL (74 dB(A)), but recorded at different seat positions in the aircraft cabin (front, rear). The results quantify speech intelligibility improvements with increasing signal-to-noise ratio and amplification of particular octave bands, especially the 2 kHz and the 4 kHz band. Thus, loudspeaker power in an aircraft cabin can be reduced by using appropriate filter settings in the loudspeaker system. PMID:25183056

  5. Speech intelligibility in free field: Spatial unmasking in preschool children

    PubMed Central

    Garadat, Soha N.; Litovsky, Ruth Y.

    2009-01-01

    This study introduces a new test (CRISP-Jr.) for measuring speech intelligibility and spatial release from masking (SRM) in young children ages 2.5–4 years. Study 1 examined whether thresholds, masking, and SRM obtained with a test designed for older children (CRISP) and CRISP-Jr. are comparable in 4 to 5-year-old children. Thresholds were measured for target speech in front, in quiet, and with a different-sex masker either in front or on the right. CRISP-Jr. yielded higher speech reception thresholds (SRTs) than CRISP, but the amount of masking and SRM did not differ across the tests. In study 2, CRISP-Jr. was extended to a group of 3-year-old children. Results showed that while SRTs were higher in the younger group, there were no age differences in masking and SRM. These findings indicate that children as young as 3 years old are able to use spatial cues in sound source segregation, which suggests that some of the auditory mechanisms that mediate this ability develop early in life. In addition, the findings suggest that measures of SRM in young children are not limited to a particular set of stimuli. These tests have potentially useful applications in clinical settings, where bilateral fittings of amplification devices are evaluated. PMID:17348527

  6. The relationship of speech intelligibility with hearing sensitivity, cognition, and perceived hearing difficulties varies for different speech perception tests.

    PubMed

    Heinrich, Antje; Henshaw, Helen; Ferguson, Melanie A

    2015-01-01

    Listeners vary in their ability to understand speech in noisy environments. Hearing sensitivity, as measured by pure-tone audiometry, can only partly explain these results, and cognition has emerged as another key concept. Although cognition relates to speech perception, the exact nature of the relationship remains to be fully understood. This study investigates how different aspects of cognition, particularly working memory and attention, relate to speech intelligibility for various tests. Perceptual accuracy of speech perception represents just one aspect of functioning in a listening environment. Activity and participation limits imposed by hearing loss, in addition to the demands of a listening environment, are also important and may be better captured by self-report questionnaires. Understanding how speech perception relates to self-reported aspects of listening forms the second focus of the study. Forty-four listeners aged between 50 and 74 years with mild sensorineural hearing loss were tested on speech perception tests differing in complexity from low (phoneme discrimination in quiet), to medium (digit triplet perception in speech-shaped noise) to high (sentence perception in modulated noise); cognitive tests of attention, memory, and non-verbal intelligence quotient; and self-report questionnaires of general health-related and hearing-specific quality of life. Hearing sensitivity and cognition related to intelligibility differently depending on the speech test: neither was important for phoneme discrimination, hearing sensitivity alone was important for digit triplet perception, and hearing and cognition together played a role in sentence perception. Self-reported aspects of auditory functioning were correlated with speech intelligibility to different degrees, with digit triplets in noise showing the richest pattern. The results suggest that intelligibility tests can vary in their auditory and cognitive demands and their sensitivity to the challenges that

  7. Quantifying the intelligibility of speech in noise for non-native listeners

    NASA Astrophysics Data System (ADS)

    van Wijngaarden, Sander J.; Steeneken, Herman J. M.; Houtgast, Tammo

    2002-04-01

    When listening to languages learned at a later age, speech intelligibility is generally lower than when listening to one's native language. The main purpose of this study is to quantify speech intelligibility in noise for specific populations of non-native listeners, only broadly addressing the underlying perceptual and linguistic processing. An easy method is sought to extend these quantitative findings to other listener populations. Dutch subjects listening to German and English speech, with proficiency in these languages ranging from reasonable to excellent, were found to require a 1-7 dB better speech-to-noise ratio to obtain 50% sentence intelligibility than native listeners. Also, the psychometric function for sentence recognition in noise was found to be shallower for non-native than for native listeners (worst-case slope around the 50% point of 7.5%/dB, compared to 12.6%/dB for native listeners). Differences between native and non-native speech intelligibility are largely predicted by linguistic entropy estimates as derived from a letter guessing task. Less effective use of context effects (especially semantic redundancy) explains the reduced speech intelligibility for non-native listeners. While measuring speech intelligibility for many different populations of listeners (languages, linguistic experience) may be prohibitively time consuming, obtaining predictions of non-native intelligibility from linguistic entropy may help to extend the results of this study to other listener populations.
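    The two quantities compared above, the SRT and the psychometric slope in %/dB, can be recovered by fitting a logistic function to sentence scores. The data points below are invented for illustration:

      import numpy as np
      from scipy.optimize import curve_fit

      def psychometric(snr_db, srt, slope):
          # logistic whose derivative at the 50% point equals `slope` (proportion/dB)
          return 1.0 / (1.0 + np.exp(-4.0 * slope * (snr_db - srt)))

      snr = np.array([-9.0, -6.0, -3.0, 0.0, 3.0])
      score = np.array([0.05, 0.20, 0.55, 0.85, 0.97])   # fraction of sentences correct
      (srt, slope), _ = curve_fit(psychometric, snr, score, p0=(-3.0, 0.1))
      print(f"SRT = {srt:.1f} dB, slope = {100 * slope:.1f} %/dB")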

  9. Is the Speech Transmission Index (STI) a robust measure of sound system speech intelligibility performance?

    NASA Astrophysics Data System (ADS)

    Mapp, Peter

    2002-11-01

    Although RaSTI is a good indicator of the speech intelligibility capability of auditoria and similar spaces, during the past 2-3 years it has been shown that RaSTI is not a robust predictor of sound system intelligibility performance. Instead, it is now recommended, within both national and international codes and standards, that full STI measurement and analysis be employed. However, new research is reported that indicates that STI is not as flawless or robust as many believe. The paper highlights a number of potential error mechanisms. It is shown that the measurement technique and signal excitation stimulus can have a significant effect on the overall result and accuracy, particularly where DSP-based equipment is employed. It is also shown that in its current state of development, STI is not capable of appropriately accounting for a number of fundamental speech and system attributes, including typical sound system frequency response variations and anomalies. This is particularly shown to be the case when a system is operating under reverberant conditions. Comparisons between actual system measurements and corresponding word score data are reported, where errors of up to 50% are found. The implications for VA and PA system performance verification are discussed.

  10. Investigation of in-vehicle speech intelligibility metrics for normal hearing and hearing impaired listeners

    NASA Astrophysics Data System (ADS)

    Samardzic, Nikolina

    The effectiveness of in-vehicle speech communication can be a good indicator of the perception of the overall vehicle quality and customer satisfaction. Currently available speech intelligibility metrics do not account in their procedures for essential parameters needed for a complete and accurate evaluation of in-vehicle speech intelligibility. These include the directivity and the distance of the talker with respect to the listener, binaural listening, the hearing profile of the listener, vocal effort, and multisensory hearing. In the first part of this research the effectiveness of in-vehicle application of these metrics is investigated in a series of studies to reveal their shortcomings, including a wide range of scores resulting from each of the metrics for a given measurement configuration and vehicle operating condition. In addition, the nature of a possible correlation between the scores obtained from each metric is unknown. The metrics and the subjective perception of speech intelligibility using, for example, the same speech material have not been compared in the literature. As a result, in the second part of this research, an alternative method for speech intelligibility evaluation is proposed for use in the automotive industry by utilizing a virtual reality driving environment for ultimately setting targets, including the associated statistical variability, for future in-vehicle speech intelligibility evaluation. The Speech Intelligibility Index (SII) was evaluated at the sentence Speech Reception Threshold (sSRT) for various listening situations and hearing profiles using acoustic perception jury testing and a variety of talker and listener configurations and background noise. In addition, the effect of individual sources and transfer paths of sound in an operating vehicle on the vehicle interior sound, specifically their effect on speech intelligibility, was quantified in the framework of the newly developed speech intelligibility evaluation method. Lastly

  11. Effect of Whole-Body Vibration on Speech. Part 2; Effect on Intelligibility

    NASA Technical Reports Server (NTRS)

    Begault, Durand R.

    2011-01-01

    The effect on speech intelligibility was measured for speech where talkers reading Diagnostic Rhyme Test material were exposed to 0.7 g whole-body vibration to simulate space vehicle launch. Across all talkers, the effect of vibration was to degrade the percentage of correctly transcribed words from 83% to 74%. The magnitude of the effect of vibration on speech communication varies between individuals, for both talkers and listeners. A worst-case scenario for intelligibility would be the most sensitive listener hearing the most sensitive talker; one participant's intelligibility was reduced by 26% (97% to 71%) for one of the talkers.

  12. Predicting speech intelligibility in noise for hearing-critical jobs

    NASA Astrophysics Data System (ADS)

    Soli, Sigfrid D.; Laroche, Chantal; Giguere, Christian

    2003-10-01

    Many jobs require auditory abilities such as speech communication, sound localization, and sound detection. An employee for whom these abilities are impaired may constitute a safety risk for himself or herself, for fellow workers, and possibly for the general public. A number of methods have been used to predict these abilities from diagnostic measures of hearing (e.g., the pure-tone audiogram); however, these methods have not proved to be sufficiently accurate for predicting performance in the noise environments where hearing-critical jobs are performed. We have taken an alternative and potentially more accurate approach. A direct measure of speech intelligibility in noise, the Hearing in Noise Test (HINT), is instead used to screen individuals. The screening criteria are validated by establishing the empirical relationship between the HINT score and the auditory abilities of the individual, as measured in laboratory recreations of real-world workplace noise environments. The psychometric properties of the HINT enable screening of individuals with an acceptable amount of error. In this presentation, we will describe the predictive model and report the results of field measurements and laboratory studies used to provide empirical validation of the model. [Work supported by Fisheries and Oceans Canada.]

  13. Effects of Adaptation Rate and Noise Suppression on the Intelligibility of Compressed-Envelope Based Speech

    PubMed Central

    Lai, Ying-Hui; Tsao, Yu; Chen, Fei

    2015-01-01

    The temporal envelope is the primary acoustic cue used in most cochlear implant (CI) speech processors to elicit speech perception for patients fitted with CI devices. Envelope compression narrows the envelope dynamic range and accordingly degrades the speech understanding of CI users, especially under challenging listening conditions (e.g., in noise). A new adaptive envelope compression (AEC) strategy was proposed recently, which, in contrast to traditional static envelope compression, is effective at enhancing the modulation depth of the envelope waveform by making best use of its dynamic range, thus improving the intelligibility of envelope-based speech. The present study further explored the effect of the adaptation rate in envelope compression on the intelligibility of compressed-envelope based speech. Moreover, since noise reduction is another essential unit in modern CI systems, the compatibility of AEC and noise reduction was also investigated. In this study, listening experiments were carried out by presenting vocoded sentences to normal-hearing listeners for recognition. Experimental results demonstrated that the adaptation rate in envelope compression had a notable effect on the speech intelligibility performance of the AEC strategy. By specifying a suitable adaptation rate, speech intelligibility could be enhanced significantly in noise compared to when using static envelope compression. Moreover, results confirmed that the AEC strategy was suitable for combining with noise reduction to improve the intelligibility of envelope-based speech in noise. PMID:26196508
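
    The contrast between a fixed and an adaptively chosen compression factor can be illustrated with a toy sketch. Everything below (the power-law mapping, frame length, ratio range, and depth rule) is an invented stand-in, not the published AEC algorithm; the frame length plays the role of the adaptation rate studied in the paper:

```python
import numpy as np

def static_compress(env, ratio=0.3):
    """Fixed power-law compression of a non-negative temporal
    envelope; ratio < 1 narrows the envelope dynamic range."""
    return np.power(np.maximum(env, 1e-6), ratio)

def adaptive_compress(env, fs, win_s=0.02, lo=0.3, hi=0.9):
    """Toy adaptive variant: re-choose the ratio per frame from the
    local modulation depth, so well-modulated frames are compressed
    less; win_s (the frame length) acts as the adaptation rate."""
    env = np.asarray(env, dtype=float)
    n = max(int(win_s * fs), 1)
    out = np.empty_like(env)
    for i in range(0, len(env), n):
        seg = env[i:i + n]
        depth = (seg.max() - seg.min()) / (seg.max() + 1e-6)
        ratio = lo + (hi - lo) * depth   # deeper modulation -> milder compression
        out[i:i + n] = static_compress(seg, ratio)
    return out
```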

  14. Evaluating the benefit of recorded early reflections from a classroom for speech intelligibility

    NASA Astrophysics Data System (ADS)

    Larsen, Jeffery B.

    Recent standards for classroom acoustics recommend achieving low levels of reverberation to provide suitable conditions for speech communication (ANSI, 2002; ASHA, 1995). Another viewpoint recommends optimizing classroom acoustics to emphasize early reflections and reduce later-arriving reflections (Boothroyd, 2004; Bradley, Sato, & Picard, 2003). The idea of emphasizing early reflections is based on the useful-to-detrimental ratio (UDR) model of speech intelligibility in rooms (Lochner & Burger, 1964). The UDR model predicts that listeners integrate energy from early reflections to improve the signal-to-noise ratio (SNR) of the direct speech signal. However, both early and more recent studies of early reflections and speech intelligibility have used simulated reflections that may not accurately represent the effects of real early reflections on the speech intelligibility of listeners. Is speech intelligibility performance enhanced by the presence of real early reflections in noisy classroom environments? The effect of actual early reflections on speech intelligibility was evaluated by recording a binaural room impulse response (BRIR) with a K.E.M.A.R. in a college classroom. From the BRIR, five listening conditions were created with varying amounts of early reflections. Young-adult listeners with normal hearing participated in a fixed-SNR word intelligibility task and a variable-SNR task to test whether speech intelligibility was improved in competing noise when recorded early reflections were present as compared to direct speech alone. Mean speech intelligibility performance gains or SNR benefits were not observed with recorded early reflections. When simulated early reflections were included, improved speech understanding was observed with simulated reflections but not with real reflections. Spectral, temporal, and phonemic analyses were performed to investigate acoustic differences between recorded and simulated reflections. Spectral distortions in the recorded reflections may have

  15. Comparing the single-word intelligibility of two speech synthesizers for small computers

    SciTech Connect

    Cochran, P.S.

    1986-01-01

    Previous research on the intelligibility of synthesized speech has placed emphasis on the segmental intelligibility (rather than word or sentence intelligibility) of expensive and sophisticated synthesis systems. There is a need for more information about the intelligibility of low- to moderately priced speech synthesizers because they are the most likely to be widely purchased for clinical and educational use. The aim of this study was to compare the word intelligibility of two such synthesizers for small computers, the Votrax Personal Speech System (PSS) and the Echo GP (General Purpose). A multiple-choice word identification task was used in a two-part study in which 48 young adults served as listeners. Groups of subjects in Part I completed one trial listening to taped natural speech followed by one trial with each synthesizer. Subjects in Part II listened to the taped human speech followed by two trials with the same synthesizer. Under the quiet listening conditions used for this study, taped human speech was 30% more intelligible than the Votrax PSS, and 53% more intelligible than the Echo GP.

  16. Speech Intelligibility in Deaf Children After Long-Term Cochlear Implant Use

    PubMed Central

    Montag, Jessica L.; AuBuchon, Angela M.; Pisoni, David B.; Kronenberger, William G.

    2015-01-01

    Purpose This study investigated long-term speech intelligibility outcomes in 63 prelingually deaf children, adolescents, and young adults who received cochlear implants (CIs) before age 7 (M = 2;11 [years;months], range = 0;8–6;3) and used their implants for at least 7 years (M = 12;1, range = 7;0–22;5). Method Speech intelligibility was assessed using playback methods with naïve, normal-hearing listeners. Results Mean intelligibility scores were lower than scores obtained from an age- and nonverbal IQ–matched, normal-hearing control sample, although the majority of CI users scored within the range of the control sample. Our sample allowed us to investigate the contribution of several demographic and cognitive factors to speech intelligibility. CI users who used their implant for longer periods of time exhibited poorer speech intelligibility scores. Crucially, results from a hierarchical regression model suggested that this difference was due to more conservative candidacy criteria in CI users with more years of use. No other demographic variables accounted for significant variance in speech intelligibility scores beyond age of implantation and amount of spoken language experience (assessed by communication mode and family income measures). Conclusion Many factors that have been found to contribute to individual differences in language outcomes in normal-hearing children also contribute to long-term CI users’ ability to produce intelligible speech. PMID:25260109

  17. Speech Intelligibility of Pediatric Cochlear Implant Recipients With 7 Years of Device Experience

    PubMed Central

    Peng, Shu-Chen; Spencer, Linda J.; Tomblin, J. Bruce

    2011-01-01

    Speech intelligibility of 24 prelingually deaf pediatric cochlear implant (CI) recipients with 84 months of device experience was investigated. Each CI participant's speech samples were judged by a panel of 3 listeners. Intelligibility scores were calculated as the average of the 3 listeners' responses. The average write-down intelligibility score was 71.54% (SD = 29.89), and the average rating-scale intelligibility score was 3.03 points (SD = 1.01). Write-down and rating-scale intelligibility scores were highly correlated (r = .91, p < .001). Linear regression analyses revealed that both age at implantation and different speech-coding strategies contribute to the variability of CI participants' speech intelligibility. Implantation at a younger age and the use of the spectral-peak speech-coding strategy yielded higher intelligibility scores than implantation at an older age and the use of the multipeak speech-coding strategy. These results serve as indices for clinical applications when long-term advancements in spoken-language development are considered for pediatric CI recipients. PMID:15842006

  18. Factors Contributing to the Development of Intelligible Speech among Prelingually Deaf Persons.

    ERIC Educational Resources Information Center

    Sims, Donald G.; And Others

    1980-01-01

    A descriptive study of the incidence of semi-intelligible or better speech among 108 National Technical Institute for the Deaf students with congenital hearing loss greater than 91 decibels is presented. (Author/PHR)

  19. Microscopic prediction of speech intelligibility in spatially distributed speech-shaped noise for normal-hearing listeners.

    PubMed

    Geravanchizadeh, Masoud; Fallah, Ali

    2015-12-01

    A binaural and psychoacoustically motivated intelligibility model, based on a well-known monaural microscopic model, is proposed. This model simulates a phoneme recognition task in the presence of spatially distributed speech-shaped noise in anechoic scenarios. In the proposed model, binaural advantage effects are considered by generating a feature vector for a dynamic-time-warping speech recognizer. This vector consists of three subvectors incorporating two monaural subvectors to model the better-ear hearing, and a binaural subvector to simulate the binaural unmasking effect. The binaural unit of the model is based on equalization-cancellation theory. This model operates blindly, which means separate recordings of speech and noise are not required for the predictions. Speech intelligibility tests were conducted with 12 normal-hearing listeners by collecting speech reception thresholds (SRTs) in the presence of single and multiple sources of speech-shaped noise. The comparison of the model predictions with the measured binaural SRTs, and with the predictions of a macroscopic binaural model called extended equalization-cancellation, shows that this approach predicts the intelligibility in anechoic scenarios with good precision. The square of the correlation coefficient (r²) and the mean absolute error between the model predictions and the measurements are 0.98 and 0.62 dB, respectively. PMID:26723354

  20. Methods of Improving Speech Intelligibility for Listeners with Hearing Resolution Deficit

    PubMed Central

    2012-01-01

    Methods developed for real-time time-scale modification (TSM) of the speech signal are presented. They are based on the non-uniform, speech-rate-dependent SOLA (Synchronous Overlap and Add) algorithm. The influence of the proposed methods on the intelligibility of speech was investigated for two separate groups of listeners, i.e., hearing-impaired children and elderly listeners. It was shown that for speech with an average rate of 6.48 vowels/s or higher, all of the proposed methods have a statistically significant impact on the improvement of speech intelligibility for hearing-impaired children with reduced hearing resolution, and one of the proposed methods significantly improves comprehension of speech in the group of elderly listeners with reduced hearing resolution. Virtual slides http://www.diagnosticpathology.diagnomx.eu/vs/2065486371761991 PMID:23009662
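
    For readers unfamiliar with SOLA, the sketch below is a minimal uniform (fixed-rate) version: frames taken at one hop are overlap-added at another, with a cross-correlation search keeping the splices synchronous with the waveform. The paper's method is non-uniform and speech-rate dependent; the frame and search sizes here are illustrative:

```python
import numpy as np

def sola(x, rate, frame_len=2048, analysis_hop=512, search=256):
    """Time-scale modification by Synchronous Overlap-Add (SOLA).
    rate > 1.0 speeds speech up, rate < 1.0 slows it down; pitch is
    preserved because each frame is replayed at its original rate."""
    x = np.asarray(x, dtype=float)
    synth_hop = int(round(analysis_hop / rate))
    out = np.copy(x[:frame_len])
    pos = synth_hop                          # nominal spot for the next frame
    for start in range(analysis_hop, len(x) - frame_len, analysis_hop):
        frame = x[start:start + frame_len]
        # find the lag maximizing normalized correlation between the
        # head of the new frame and the tail of the output so far
        best_k, best_c = 0, -np.inf
        for k in range(-search, search + 1):
            p = pos + k
            ov = len(out) - p                # overlap length at this lag
            if p <= 0 or ov <= 0 or ov > frame_len:
                continue
            c = np.dot(out[p:], frame[:ov])
            c /= np.linalg.norm(out[p:]) * np.linalg.norm(frame[:ov]) + 1e-12
            if c > best_c:
                best_c, best_k = c, k
        p = pos + best_k
        ov = len(out) - p
        fade = np.linspace(0.0, 1.0, ov)     # linear crossfade in the overlap
        out[p:] = (1.0 - fade) * out[p:] + fade * frame[:ov]
        out = np.concatenate([out, frame[ov:]])
        pos = p + synth_hop
    return out
```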

  1. Proximate factors associated with speech intelligibility in children with cochlear implants: A preliminary study.

    PubMed

    Chin, Steven B; Kuhns, Matthew J

    2014-01-01

    The purpose of this descriptive pilot study was to examine possible relationships among speech intelligibility and structural characteristics of speech in children who use cochlear implants. The Beginners Intelligibility Test (BIT) was administered to 10 children with cochlear implants, and the intelligibility of the words in the sentences was judged by panels of naïve adult listeners. Additionally, several qualitative and quantitative measures of word omission, segment correctness, duration, and intonation variability were applied to the sentences used to assess intelligibility. Correlational analyses were conducted to determine if BIT scores and the other speech parameters were related. There was a significant correlation between BIT score and percent words omitted, but no other variables correlated significantly with BIT score. The correlation between intelligibility and word omission may be task-specific as well as reflective of memory limitations. PMID:25000376

  2. Effects of Noise and Speech Intelligibility on Listener Comprehension and Processing Time of Korean-Accented English

    ERIC Educational Resources Information Center

    Wilson, Erin O'Brien; Spaulding, Tammie J.

    2010-01-01

    Purpose: This study evaluated the effects of noise and speech intelligibility on the processing of speech produced from native English; high-intelligibility, Korean-accented English; and moderate-intelligibility, Korean-accented English speakers. Method: Both listener comprehension, determined by accuracy judgment on true/false sentences, and…

  3. Perceptual Measures of Speech from Individuals with Parkinson's Disease and Multiple Sclerosis: Intelligibility and beyond

    ERIC Educational Resources Information Center

    Sussman, Joan E.; Tjaden, Kris

    2012-01-01

    Purpose: The primary purpose of this study was to compare percent correct word and sentence intelligibility scores for individuals with multiple sclerosis (MS) and Parkinson's disease (PD) with scaled estimates of speech severity obtained for a reading passage. Method: Speech samples for 78 talkers were judged, including 30 speakers with MS, 16…

  4. Talker Differences in Clear and Conversational Speech: Vowel Intelligibility for Older Adults with Hearing Loss

    ERIC Educational Resources Information Center

    Ferguson, Sarah Hargus

    2012-01-01

    Purpose: To establish the range of talker variability for vowel intelligibility in clear versus conversational speech for older adults with hearing loss and to determine whether talkers who produced a clear speech benefit for young listeners with normal hearing also did so for older adults with hearing loss. Method: Clear and conversational vowels…

  5. Effects of Loud and Amplified Speech on Sentence and Word Intelligibility in Parkinson Disease

    ERIC Educational Resources Information Center

    Neel, Amy T.

    2009-01-01

    Purpose: In the two experiments in this study, the author examined the effects of increased vocal effort (loud speech) and amplification on sentence and word intelligibility in speakers with Parkinson disease (PD). Methods: Five talkers with PD produced sentences and words at habitual levels of effort and using loud speech techniques. Amplified…

  6. Intelligibility, Comprehensibility, and Accentedness of L2 Speech: The Role of Listener Experience and Semantic Context

    ERIC Educational Resources Information Center

    Kennedy, Sara; Trofimovich, Pavel

    2008-01-01

    This study investigated how listener experience (extent of previous exposure to non-native speech) and semantic context (degree and type of semantic information available) influence measures of intelligibility, comprehensibility, and accentedness of non-native (L2) speech. Participants were 24 native English-speaking listeners, half experienced…

  7. Intelligibility of Modified Speech for Young Listeners with Normal and Impaired Hearing.

    ERIC Educational Resources Information Center

    Uchanski, Rosalie M.; Geers, Ann E.; Protopapas, Athanassios

    2002-01-01

    A study examined whether the benefits of modified speech could be extended to provide intelligibility improvements for eight children (ages 8-14) with severe-to-profound hearing impairments who wear sensory aids and five controls. All varieties of modified speech (envelope-amplified, slowed, and both) yielded either equivalent or poorer…

  8. Three Factors Are Critical in Order to Synthesize Intelligible Noise-Vocoded Japanese Speech

    PubMed Central

    Kishida, Takuya; Nakajima, Yoshitaka; Ueda, Kazuo; Remijn, Gerard B.

    2016-01-01

    Factor analysis (principal component analysis followed by varimax rotation) had shown that 3 common factors appear across 20 critical-band power fluctuations derived from spoken sentences of eight different languages [Ueda et al. (2010). Fechner Day 2010, Padua]. The present study investigated the contributions of such power-fluctuation factors to speech intelligibility. The method of factor analysis was modified to obtain factors suitable for resynthesizing speech sounds as 20-critical-band noise-vocoded speech. The resynthesized speech sounds were used for an intelligibility test. The modification of factor analysis ensured that the resynthesized speech sounds were not accompanied by a steady background noise caused by the data reduction procedure. Spoken sentences of British English, Japanese, and Mandarin Chinese were subjected to this modified analysis. Confirming the earlier analysis, indeed 3–4 factors were common to these languages. The number of power-fluctuation factors needed to make noise-vocoded speech intelligible was then examined. Critical-band power fluctuations of the Japanese spoken sentences were resynthesized from the obtained factors, resulting in noise-vocoded-speech stimuli, and the intelligibility of these speech stimuli was tested by 12 native Japanese speakers. Japanese mora (syllable-like phonological unit) identification performances were measured when the number of factors was 1–9. Statistically significant improvement in intelligibility was observed when the number of factors was increased stepwise up to 6. The 12 listeners identified 92.1% of the morae correctly on average in the 6-factor condition. The intelligibility improved sharply when the number of factors changed from 2 to 3. In this step, the cumulative contribution ratio of factors improved only by 10.6%, from 37.3 to 47.9%, but the average mora identification leaped from 6.9 to 69.2%. The results indicated that, if the number of factors is 3 or more, elementary
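
    The dimensionality-reduction step can be illustrated with plain PCA on a frames-by-bands matrix of critical-band powers. This sketch omits the varimax rotation, the authors' modification that avoids the steady background noise, and the vocoder resynthesis, so it shows the general idea only:

```python
import numpy as np

def band_power_factors(band_powers, n_factors=3):
    """PCA of critical-band power fluctuations (frames x bands).
    Returns loadings (bands x factors), factor scores, and the
    rank-reduced reconstruction of the band powers."""
    mean = band_powers.mean(axis=0)
    x = band_powers - mean
    vals, vecs = np.linalg.eigh(np.cov(x, rowvar=False))
    idx = np.argsort(vals)[::-1][:n_factors]   # largest components first
    loadings = vecs[:, idx]
    scores = x @ loadings                      # frames x factors
    recon = scores @ loadings.T + mean         # rank-n_factors approximation
    return loadings, scores, recon
```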

  10. Assessing Speech Intelligibility in Children with Hearing Loss: Toward Revitalizing a Valuable Clinical Tool

    ERIC Educational Resources Information Center

    Ertmer, David J.

    2011-01-01

    Background: Newborn hearing screening, early intervention programs, and advancements in cochlear implant and hearing aid technology have greatly increased opportunities for children with hearing loss to become intelligible talkers. Optimizing speech intelligibility requires that progress be monitored closely. Although direct assessment of…

  11. Predicting the intelligibility of reverberant speech for cochlear implant listeners with a non-intrusive intelligibility measure

    PubMed Central

    Chen, Fei; Hazrati, Oldooz; Loizou, Philipos C.

    2012-01-01

    Reverberation is known to reduce the temporal envelope modulations present in the signal and to affect the shape of the modulation spectrum. A non-intrusive intelligibility measure for reverberant speech is proposed, motivated by the fact that the area of the modulation spectrum decreases with increasing reverberation. The proposed measure is based on the average modulation area computed across four acoustic frequency bands spanning the signal bandwidth. High correlations (r = 0.98) were observed with sentence intelligibility scores obtained by cochlear implant listeners. The proposed measure outperformed other measures, including an intrusive speech-transmission-index-based measure. PMID:23710246
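
    The general idea (not the authors' exact measure; the band edges, modulation-frequency limit, and normalization below are illustrative assumptions) is that reverberation fills the dips of the temporal envelope, which shrinks the area under the low-frequency modulation spectrum:

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def modulation_area(x, fs,
                    bands=((100, 1000), (1000, 2000),
                           (2000, 4000), (4000, 8000)),
                    mod_fmax=32.0):
    """Average area under the modulation spectrum across four
    acoustic bands; smaller values indicate more reverberation."""
    areas = []
    for lo, hi in bands:
        sos = butter(4, (lo, hi), btype='bandpass', fs=fs, output='sos')
        env = np.abs(hilbert(sosfilt(sos, x)))       # temporal envelope
        spec = np.abs(np.fft.rfft(env - env.mean()))
        freqs = np.fft.rfftfreq(len(env), 1.0 / fs)
        areas.append(spec[freqs <= mod_fmax].sum() / len(env))
    return float(np.mean(areas))
```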

  12. The effect of three variables on synthetic speech intelligibility in noisy environments

    NASA Astrophysics Data System (ADS)

    Munlin, Joyce C.

    1990-03-01

    Military Command and Control (C2) requires easy access to information needed for the commander's situation assessment and direction of troops. Providing this information via synthetic speech is a viable alternative, but additional information is required before speech systems can be implemented for C2 functions. An experiment was conducted to study several factors which may affect the intelligibility of synthetic speech. The factors examined were: (1) speech rate; (2) synthetic speech messages presented at lower, the same, and higher frequencies than the background noise frequency; (3) voice richness; and (4) interactions between speech rate, voice fundamental frequency, and voice richness. Response latency and recognition accuracy were measured. Results clearly indicate that increasing speech rate leads to increased latency and decreased recognition accuracy, at least for the novice user. No effect of voice fundamental frequency or richness was demonstrated.

  13. Functional connectivity between face-movement and speech-intelligibility areas during auditory-only speech perception.

    PubMed

    Schall, Sonja; von Kriegstein, Katharina

    2014-01-01

    It has been proposed that internal simulation of the talking face of visually-known speakers facilitates auditory speech recognition. One prediction of this view is that brain areas involved in auditory-only speech comprehension interact with visual face-movement sensitive areas, even under auditory-only listening conditions. Here, we test this hypothesis using connectivity analyses of functional magnetic resonance imaging (fMRI) data. Participants (17 normal participants, 17 developmental prosopagnosics) first learned six speakers via brief voice-face or voice-occupation training (<2 min/speaker). This was followed by an auditory-only speech recognition task and a control task (voice recognition) involving the learned speakers' voices in the MRI scanner. As hypothesized, we found that, during speech recognition, familiarity with the speaker's face increased the functional connectivity between the face-movement sensitive posterior superior temporal sulcus (STS) and an anterior STS region that supports auditory speech intelligibility. There was no difference between normal participants and prosopagnosics. This was expected because previous findings have shown that both groups use the face-movement sensitive STS to optimize auditory-only speech comprehension. Overall, the present findings indicate that learned visual information is integrated into the analysis of auditory-only speech and that this integration results from the interaction of task-relevant face-movement and auditory speech-sensitive areas. PMID:24466026

  14. Talker differences in clear and conversational speech: Vowel intelligibility for normal-hearing listeners

    NASA Astrophysics Data System (ADS)

    Hargus Ferguson, Sarah

    2004-10-01

    Several studies have shown that when a talker is instructed to speak as though talking to a hearing-impaired person, the resulting "clear" speech is significantly more intelligible than typical conversational speech. While variability among talkers during speech production is well known, only one study to date [Gagné et al., J. Acad. Rehab. Audiol. 27, 135-158 (1994)] has directly examined differences among talkers producing clear and conversational speech. Data from that study, which utilized ten talkers, suggested that talkers vary in the extent to which they improve their intelligibility by speaking clearly. Similar variability can also be seen in studies using smaller groups of talkers [e.g., Picheny, Durlach, and Braida, J. Speech Hear. Res. 28, 96-103 (1985)]. In the current paper, clear and conversational speech materials were recorded from 41 male and female talkers aged 18 to 45 years. A listening experiment demonstrated that for normal-hearing listeners in noise, vowel intelligibility varied widely among the 41 talkers for both speaking styles, as did the magnitude of the speaking style effect. While female talkers showed a larger clear speech vowel intelligibility benefit than male talkers, neither talker age nor prior experience communicating with hearing-impaired listeners significantly affected the speaking style effect.

  15. Intelligibilities of 1-octave rectangular bands spanning the speech spectrum when heard separately and paired

    PubMed Central

    Warren, Richard M.; Bashford, James A.; Lenz, Peter W.

    2011-01-01

    There is a need, both for speech theory and for many practical applications, to know the intelligibilities of individual passbands that span the speech spectrum when they are heard singly and in combination. While indirect procedures have been employed for estimating passband intelligibilities (e.g., the Speech Intelligibility Index), direct measurements have been blocked by the confounding contributions from transition band slopes that accompany filtering. A recent study has reported that slopes of several thousand dBA/octave produced by high-order finite impulse response filtering were required to produce the effectively rectangular bands necessary to eliminate appreciable contributions from transition bands [Warren et al., J. Acoust. Soc. Am. 115, 1292–1295 (2004)]. Using such essentially vertical slopes, the present study employed sentences, and reports the intelligibilities of their six 1-octave contiguous passbands having center frequencies from 0.25 to 8 kHz when heard alone, and for each of their 15 possible pairings. PMID:16334905
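
    Effectively rectangular bands of this kind can be approximated with a very long linear-phase FIR filter; the paper used filters with transition slopes of several thousand dBA/octave. The tap count below is an illustrative assumption, not the study's specification:

```python
import numpy as np
from scipy.signal import firwin, filtfilt

def octave_band(x, fs, fc):
    """Extract a 1-octave band centered at fc with steep,
    near-rectangular edges (zero-phase filtering steepens the
    effective slopes further)."""
    lo, hi = fc / np.sqrt(2.0), fc * np.sqrt(2.0)
    taps = firwin(8191, (lo, hi), pass_zero=False, fs=fs)
    return filtfilt(taps, [1.0], x)

# e.g., the six contiguous bands with centers 0.25 to 8 kHz:
# bands = [octave_band(x, 44100, fc) for fc in (250, 500, 1000, 2000, 4000, 8000)]
```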

  16. Three Subgroups of Preschool Children with Speech Intelligibility Disorders.

    ERIC Educational Resources Information Center

    Whaley, Patricia; And Others

    The study was designed to measure language, cognitive, perceptual, and oral-motor abilities in 17 preschoolers referred for speech-language evaluation because of unintelligible speech. Ss were administered a test battery which included tests of hearing, coarticulation, vocabulary, speech sound discrimination, and oral examination of the speech…

  17. State of the Art in Rehabilitative Audiology: Speech Intelligibility.

    ERIC Educational Resources Information Center

    Schultz, Martin C.

    In a speech presented at the Annual Meeting of the American Speech and Hearing Association (1974), the author discusses prospective direction for research in speech recognition and comprehension of the hearing impaired. A theoretical perspective focuses on the role of probabilities in the process of transmitting and receiving auditory messages.…

  18. The interlanguage speech intelligibility benefit for native speakers of Mandarin: Production and perception of English word-final voicing contrasts

    PubMed Central

    Hayes-Harb, Rachel; Smith, Bruce L.; Bent, Tessa; Bradlow, Ann R.

    2009-01-01

    This study investigated the intelligibility of native and Mandarin-accented English speech for native English and native Mandarin listeners. The word-final voicing contrast was considered (as in minimal pairs such as "cub" and "cup") in a forced-choice word identification task. For these particular talkers and listeners, there was evidence of an interlanguage speech intelligibility benefit for listeners (i.e., native Mandarin listeners were more accurate than native English listeners at identifying Mandarin-accented English words). However, there was no evidence of an interlanguage speech intelligibility benefit for talkers (i.e., native Mandarin listeners did not find Mandarin-accented English speech more intelligible than native English speech). When listener and talker phonological proficiency (operationalized as accentedness) was taken into account, it was found that the interlanguage speech intelligibility benefit for listeners held only for the low phonological proficiency listeners and low phonological proficiency speech. The intelligibility data were also considered in relation to various temporal-acoustic properties of native English and Mandarin-accented English speech in an effort to better understand the properties of speech that may contribute to the interlanguage speech intelligibility benefit. PMID:19606271

  19. On the Role of Theta-Driven Syllabic Parsing in Decoding Speech: Intelligibility of Speech with a Manipulated Modulation Spectrum

    PubMed Central

    Ghitza, Oded

    2012-01-01

    Recent hypotheses on the potential role of neuronal oscillations in speech perception propose that speech is processed on multi-scale temporal analysis windows formed by a cascade of neuronal oscillators locked to the input pseudo-rhythm. In particular, Ghitza (2011) proposed that the oscillators are in the theta, beta, and gamma frequency bands with the theta oscillator the master, tracking the input syllabic rhythm and setting a time-varying, hierarchical window structure synchronized with the input. In the study described here the hypothesized role of theta was examined by measuring the intelligibility of speech with a manipulated modulation spectrum. Each critical-band signal was manipulated by controlling the degree of temporal envelope flatness. Intelligibility of speech with critical-band envelopes that are flat is poor; inserting extra information, restricted to the input syllabic rhythm, markedly improves intelligibility. It is concluded that flattening the critical-band envelopes prevents the theta oscillator from tracking the input rhythm, hence the disruption of the hierarchical window structure that controls the decoding process. Reinstating the input-rhythm information revives the tracking capability, hence restoring the synchronization between the window structure and the input, resulting in the extraction of additional information from the flat modulation spectrum. PMID:22811672

  20. Assessing children's speech intelligibility and oral structures and functions via an Internet-based telehealth system.

    PubMed

    Waite, Monique C; Theodoros, Deborah G; Russell, Trevor G; Cahill, Louise M

    2012-06-01

    We examined the validity and reliability of an Internet-based telehealth system for screening speech intelligibility and oro-motor structure and function in children with speech disorders. Twenty children aged 4-9 years were assessed by a clinician in the conventional, face-to-face (FTF) manner; simultaneously, they were assessed by a second clinician via the videoconferencing system using a 128-kbit/s Internet connection. Speech intelligibility in conversation was rated and an informal assessment of oro-motor structure and function was conducted. There was a high level of agreement between the online and FTF speech intelligibility ratings, with 70% exact agreement and 100% close agreement (within ±1 point on a 5-point scale). The weighted kappa statistic revealed very good agreement between raters (kappa = 0.86). Data for online and FTF ratings of oro-motor function revealed overall exact agreement of 73%, close agreement of 96%, moderate or good strength of agreement for six variables (kappa = 0.48-0.74), and poor to fair agreement for six variables (kappa = 0.12-0.36). Intra- and inter-rater reliability measures (ICCs) were similar between the online and FTF assessments. Low levels of agreement for some oro-motor variables highlighted the subjectivity of this assessment. However, the overall results support the validity and reliability of Internet-based screening of speech intelligibility and oro-motor function in children with speech disorders. PMID:22604277

  1. Usage of the HMM-Based Speech Synthesis for Intelligent Arabic Voice

    NASA Astrophysics Data System (ADS)

    Fares, Tamer S.; Khalil, Awad H.; Hegazy, Abd El-Fatah A.

    2008-06-01

    The HMM, as a suitable model for time-sequence modeling, is used to estimate speech synthesis parameters. A speech parameter sequence is generated from the HMMs themselves, whose observation vectors consist of a spectral parameter vector and its dynamic feature vectors. The HMMs generate cepstral coefficients and a pitch parameter, which are then fed to a speech synthesis filter, the Mel Log Spectral Approximation (MLSA) filter. This paper explains how this approach can be applied to the Arabic language to produce intelligent Arabic speech synthesis using HMM-based speech synthesis, and examines the influence of using the dynamic features and of increasing the number of mixture components on the quality of the synthesized Arabic speech.

  2. Listening to the brainstem: musicianship enhances intelligibility of subcortical representations for speech.

    PubMed

    Weiss, Michael W; Bidelman, Gavin M

    2015-01-28

    Auditory experiences including musicianship and bilingualism have been shown to enhance subcortical speech encoding operating below conscious awareness. Yet, the behavioral consequence of such enhanced subcortical auditory processing remains undetermined. Exploiting their remarkable fidelity, we examined the intelligibility of auditory playbacks (i.e., "sonifications") of brainstem potentials recorded in human listeners. We found naive listeners' behavioral classification of sonifications was faster and more categorical when evaluating brain responses recorded in individuals with extensive musical training versus those recorded in nonmusicians. These results reveal stronger behaviorally relevant speech cues in musicians' neural representations and demonstrate causal evidence that superior subcortical processing creates a more comprehensible speech signal (i.e., to naive listeners). We infer that neural sonifications of speech-evoked brainstem responses could be used in the early detection of speech-language impairments due to neurodegenerative disorders, or in objectively measuring individual differences in speech reception solely by listening to individuals' brain activity. PMID:25632143

  3. Prediction of the influence of reverberation on binaural speech intelligibility in noise and in quiet.

    PubMed

    Rennies, Jan; Brand, Thomas; Kollmeier, Birger

    2011-11-01

    Reverberation usually degrades speech intelligibility for spatially separated speech and noise sources since spatial unmasking is reduced and late reflections decrease the fidelity of the received speech signal. The latter effect could not satisfactorily be predicted by a recently presented binaural speech intelligibility model [Beutelmann et al. (2010). J. Acoust. Soc. Am. 127, 2479-2497]. This study therefore evaluated three extensions of the model to improve its predictions: (1) an extension of the speech intelligibility index based on modulation transfer functions, (2) a correction factor based on the room acoustical quantity "definition," and (3) a separation of the speech signal into useful and detrimental parts. The predictions were compared to results of two experiments in which speech reception thresholds were measured in a reverberant room in quiet and in the presence of a noise source for listeners with normal hearing. All extensions yielded better predictions than the original model when the influence of reverberation was strong, while predictions were similar for conditions with less reverberation. Although model (3) differed substantially in the assumed interaction of binaural processing and early reflections, its predictions were very similar to model (2) that achieved the best fit to the data. PMID:22087928

  4. Listening effort and speech intelligibility in listening situations affected by noise and reverberation.

    PubMed

    Rennies, Jan; Schepker, Henning; Holube, Inga; Kollmeier, Birger

    2014-11-01

    This study compared the combined effect of noise and reverberation on listening effort and speech intelligibility to predictions of the speech transmission index (STI). Listening effort was measured in normal-hearing subjects using a scaling procedure. Speech intelligibility scores were measured in the same subjects and conditions: (a) speech-shaped noise as the only interfering factor, (b) and (c) fixed signal-to-noise ratios (SNRs) of 0 or 7 dB combined with reverberation as detrimental factors, and (d) reverberation as the only detrimental factor. In each condition, SNR and reverberation were combined to produce STI values of 0.17, 0.30, 0.43, 0.57, and 0.70, respectively. Listening effort always decreased with increasing STI, thus enabling a rough prediction, but a significant bias was observed, indicating that listening effort was lower in reverberation only than in noise only at the same STI for one type of impulse response. Accordingly, speech intelligibility increased with increasing STI and was significantly better in reverberation only than in noise only at the same STI. Further analyses showed that the broadband reverberation time is not always a good estimate of speech degradation in reverberation and that different speech materials may differ in their robustness toward the detrimental effects of reverberation. PMID:25373965

  5. The effect of semantic context on speech intelligibility in reverberant rooms

    PubMed Central

    Srinivasan, Nirmal; Zahorik, Pavel

    2013-01-01

    Although it is well known that semantic context affects speech intelligibility and that different reverberant rooms affect speech intelligibility differentially, these effects have seldom been studied together. Revised SPIN sentences in a background of Gaussian noise, in simulated rooms with reverberation times (T60) of 1 and 0.25 s, were used. The carrier phrase and the target word of the speech stimuli were manipulated to be either in the same room or in different rooms. As expected, intelligibility of predictable sentences was higher compared to unpredictable sentences (the context effect). The context effect was higher in the low-reverberant room than in the high-reverberant room. When the carrier phrase and target word were in different rooms, the context effect was higher when the carrier phrase was in the low-reverberant room and the target word in the high-reverberant room. For predictable sentences, changing the target word from the high-reverberant to the low-reverberant room with a high-reverberant carrier increased intelligibility. However, with a low-reverberant carrier and different rooms for the target word, there was no change in intelligibility. Overall, it can be concluded that there is an interaction between semantic context and room acoustics for speech intelligibility. PMID:24058720

  6. The benefit of head orientation to speech intelligibility in noise.

    PubMed

    Grange, Jacques A; Culling, John F

    2016-02-01

    Spatial release from masking is traditionally measured with speech in front. The effect of head-orientation with respect to the speech direction has rarely been studied. Speech-reception thresholds (SRTs) were measured for eight head orientations and four spatial configurations. Benefits of head orientation away from the speech source of up to 8 dB were measured. These correlated with predictions of a model based on better-ear listening and binaural unmasking (r = 0.96). Use of spontaneous head orientations was measured when listeners attended to long speech clips of gradually diminishing speech-to-noise ratio in a sound-deadened room. Speech was presented from the loudspeaker that initially faced the listener and noise from one of four other locations. In an undirected paradigm, listeners spontaneously turned their heads away from the speech in 56% of trials. When instructed to rotate their heads in the diminishing speech-to-noise ratio, all listeners turned away from the speech and reached head orientations associated with lower SRTs. Head orientation may prove valuable for hearing-impaired listeners. PMID:26936554

  7. The Role of Music in Speech Intelligibility of Learners with Post Lingual Hearing Impairment in Selected Units in Lusaka District

    ERIC Educational Resources Information Center

    Katongo, Emily Mwamba; Ndhlovu, Daniel

    2015-01-01

    This study sought to establish the role of music in speech intelligibility of learners with Post Lingual Hearing Impairment (PLHI) and strategies teachers used to enhance speech intelligibility in learners with PLHI in selected special units for the deaf in Lusaka district. The study used a descriptive research design. Qualitative and quantitative…

  8. Effect of the Number of Presentations on Listener Transcriptions and Reliability in the Assessment of Speech Intelligibility in Children

    ERIC Educational Resources Information Center

    Lagerberg, Tove B.; Johnels, Jakob Åsberg; Hartelius, Lena; Persson, Christina

    2015-01-01

    Background: The assessment of intelligibility is an essential part of establishing the severity of a speech disorder. The intelligibility of a speaker is affected by a number of different variables relating, "inter alia," to the speech material, the listener and the listener task. Aims: To explore the impact of the number of…

  9. Peaks in the Frequency Response of Hearing Aids: Evaluation of the Effects on Speech Intelligibility and Sound Quality.

    ERIC Educational Resources Information Center

    Buuren, Ronald A. van; And Others

    1996-01-01

    This study evaluated speech intelligibility under noise conditions of varying peaks (10, 20, and 30 decibels) in frequency response, with 26 listeners with sensorineural impaired hearing who used hearing aids and 10 listeners with normal hearing. Results indicated that the peaks affected speech intelligibility more for listeners with impaired than…

  10. Understanding the effect of noise on electrical stimulation sequences in cochlear implants and its impact on speech intelligibility.

    PubMed

    Qazi, Obaid Ur Rehman; van Dijk, Bas; Moonen, Marc; Wouters, Jan

    2013-05-01

    The present study investigates the most important factors that limit the intelligibility of cochlear implant (CI) processed speech in noisy environments. The electrical stimulation sequences provided in CIs are affected by noise in the following three manners. First of all, the natural gaps in the speech are filled, which distorts the low-frequency ON/OFF modulations of the speech signal. Secondly, speech envelopes are distorted to include modulations of both speech and noise. Lastly, N-of-M type speech coding strategies may select noise-dominated channels instead of the dominant speech channels at low signal-to-noise ratios (SNRs). Different stimulation sequences are tested with CI subjects to study how these three noise effects individually limit the intelligibility of CI processed speech. Tests are also conducted with normal-hearing (NH) subjects using vocoded speech to identify any significant differences in the noise reduction requirements and speech distortion limitations between the two subject groups. Results indicate that compared to NH subjects, CI subjects can tolerate significantly lower levels of steady-state speech-shaped noise in the speech gaps but at the same time can tolerate comparable levels of distortion in the speech segments. Furthermore, modulations in the stimulus current level have no effect on speech intelligibility as long as the channel selection remains ideal. Finally, wrong maxima selection together with the introduction of noise in the speech gaps significantly degrades the intelligibility. At low SNRs wrong maxima selection introduces interruptions in the speech and makes it difficult to fuse noisy and interrupted speech signals into a coherent speech stream. PMID:23396271
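
    The N-of-M selection mentioned above keeps, in every stimulation frame, only the N channels with the largest envelope amplitudes; in noise, masker-dominated channels can win this competition. A minimal sketch of the selection step only (not a full CI processing chain):

```python
import numpy as np

def n_of_m_select(envelopes, n):
    """Keep the n largest of m channel envelopes in each frame.
    envelopes: array of shape (frames, m); returns a masked copy in
    which non-selected channels are zeroed (i.e., not stimulated)."""
    env = np.asarray(envelopes, dtype=float)
    out = np.zeros_like(env)
    for t, frame in enumerate(env):
        keep = np.argsort(frame)[-n:]     # indices of the n maxima
        out[t, keep] = frame[keep]
    return out
```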

  11. [Communication and noise. Speech intelligibility of airplane pilots with and without active noise compensation].

    PubMed

    Matschke, R G

    1994-08-01

    Noise exposure measurements were performed with pilots of the German Federal Navy during flight situations. The ambient noise levels during regular flight remained above 90 dB(A), an intensity that requires wearing ear protection to avoid noise-induced hearing loss. To be able to understand radio communication (ATC) in spite of the noisy environment, headphone volume must be raised above the noise of the engines. The use of ear plugs in addition to the headsets and flight helmets is of only limited value because personal ear protection affects the intelligibility of ATC. Whereas the speech intelligibility of pilots with normal hearing is affected to only a smaller degree, pilots with pre-existing high-frequency hearing losses show substantial impairments of speech intelligibility that vary in proportion to the hearing deficit present. Communication abilities can be reduced drastically, which in turn can affect air traffic security. The development of active noise compensation (ANC) devices that make use of the "anti-noise" principle may be a solution to this dilemma. To evaluate the effectiveness of an ANC system and its influence on speech intelligibility, speech audiometry was performed with a German standardized test during simulated flight conditions with helicopter pilots. The results demonstrate a helpful effect on speech understanding, especially for pilots with noise-induced hearing losses. This may help to avoid pre-retirement professional disability. PMID:7960953

  12. Envelope and intensity based prediction of psychoacoustic masking and speech intelligibility.

    PubMed

    Biberger, Thomas; Ewert, Stephan D

    2016-08-01

    Human auditory perception and speech intelligibility have been successfully described based on the two concepts of spectral masking and amplitude modulation (AM) masking. The power-spectrum model (PSM) [Patterson and Moore (1986). Frequency Selectivity in Hearing, pp. 123-177] accounts for effects of spectral masking and critical bandwidth, while the envelope power-spectrum model (EPSM) [Ewert and Dau (2000). J. Acoust. Soc. Am. 108, 1181-1196] has been successfully applied to AM masking and discrimination. Both models extract the long-term (envelope) power to calculate signal-to-noise ratios (SNR). Recently, the EPSM has been applied to speech intelligibility (SI) considering the short-term envelope SNR on various time scales (multi-resolution speech-based envelope power-spectrum model; mr-sEPSM) to account for SI in fluctuating noise [Jørgensen, Ewert, and Dau (2013). J. Acoust. Soc. Am. 134, 436-446]. Here, a generalized auditory model is suggested combining the classical PSM and the mr-sEPSM to jointly account for psychoacoustics and speech intelligibility. The model was extended to consider the local AM depth in conditions with slowly varying signal levels, and the relative role of long-term and short-term SNR was assessed. The suggested generalized power-spectrum model is shown to account for a large variety of psychoacoustic data and to predict speech intelligibility in various types of background noise. PMID:27586734
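
    The central quantity in the EPSM family is an envelope signal-to-noise ratio, with envelope power taken as the AC power of the modulation-filtered envelope normalized by its squared DC (mean) value. A minimal single-band, long-term sketch under assumed band choices (the mr-sEPSM additionally evaluates short-time segments at multiple resolutions):

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def envelope_snr_db(speech, noise, fs, band=(1000, 2000), mod_band=(0.5, 16.0)):
    """Long-term envelope-power SNR in one acoustic band."""
    sos = butter(4, band, btype='bandpass', fs=fs, output='sos')
    mod_sos = butter(2, mod_band, btype='bandpass', fs=fs, output='sos')

    def env_power(sig):
        env = np.abs(hilbert(sosfilt(sos, sig)))   # temporal envelope
        ac = sosfilt(mod_sos, env)                 # keep slow fluctuations
        return np.mean(ac ** 2) / (np.mean(env) ** 2 + 1e-12)

    return 10.0 * np.log10(env_power(speech) / (env_power(noise) + 1e-12))
```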

  13. Accent, intelligibility, and comprehensibility in the perception of foreign-accented Lombard speech

    NASA Astrophysics Data System (ADS)

    Li, Chi-Nin

    2003-10-01

    Speech produced in noise (Lombard speech) has been reported to be more intelligible than speech produced in quiet (normal speech). This study examined the perception of non-native Lombard speech in terms of intelligibility, comprehensibility, and degree of foreign accent. Twelve Cantonese speakers and a comparison group of English speakers read simple true and false English statements in quiet and in 70 dB of masking noise. Lombard and normal utterances were mixed with noise at a constant signal-to-noise ratio, and presented along with noise-free stimuli to eight new English listeners who provided transcription scores, comprehensibility ratings, and accent ratings. Analyses showed that, as expected, utterances presented in noise were less well perceived than were noise-free sentences, and that the Cantonese speakers' productions were more accented, but less intelligible and less comprehensible than those of the English speakers. For both groups of speakers, the Lombard sentences were correctly transcribed more often than their normal utterances in noisy conditions. However, the Cantonese-accented Lombard sentences were not rated as easier to understand than was the normal speech in all conditions. The assigned accent ratings were similar throughout all listening conditions. Implications of these findings will be discussed.
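
    Mixing stimuli "at a constant signal-to-noise ratio", as above, is typically done by scaling the noise to the desired long-term power ratio before adding it to the speech. A minimal sketch, assuming mean-square level is the relevant measure:

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so the long-term speech-to-noise ratio is
    snr_db (in dB), then add it to the speech."""
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)[:len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise
```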

  14. Cross-Channel Amplitude Sweeps Are Crucial to Speech Intelligibility

    ERIC Educational Resources Information Center

    Prendergast, Garreth; Green, Gary G. R.

    2012-01-01

    Classical views of speech perception argue that the static and dynamic characteristics of spectral energy peaks (formants) are the acoustic features that underpin phoneme recognition. Here we use representations where the amplitude modulations of sub-band filtered speech are described, precisely, in terms of co-sinusoidal pulses. These pulses are…

  15. Outcome measures based on classification performance fail to predict the intelligibility of binary-masked speech.

    PubMed

    Kressner, Abigail Anne; May, Tobias; Rozell, Christopher J

    2016-06-01

    To date, the most commonly used outcome measure for assessing ideal binary mask estimation algorithms is based on the difference between the hit rate and the false alarm rate (H-FA). Recently, the error distribution has been shown to substantially affect intelligibility. However, H-FA treats each mask unit independently and does not take into account how errors are distributed. Alternatively, algorithms can be evaluated with the short-time objective intelligibility (STOI) metric using the reconstructed speech. This study investigates the ability of H-FA and STOI to predict intelligibility for binary-masked speech using masks with different error distributions. The results demonstrate the inability of H-FA to predict the behavioral intelligibility and also illustrate the limitations of STOI. Since every estimation algorithm will make errors that are distributed in different ways, performance evaluations should not be made solely on the basis of these metrics. PMID:27369123
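
    For readers unfamiliar with the H-FA metric, the sketch below computes it from an ideal and an estimated binary mask (assumed here to be 0/1 time-frequency matrices of equal shape); because only the two global rates enter, the computation makes plain why the metric is blind to how errors are distributed across the mask:

```python
import numpy as np

def hit_minus_fa(ideal_mask, estimated_mask):
    """HIT - FA in percent for binary time-frequency masks (arrays of 0/1)."""
    ideal = np.asarray(ideal_mask, dtype=bool)
    est = np.asarray(estimated_mask, dtype=bool)
    hit = (ideal & est).sum() / max(ideal.sum(), 1)     # correctly kept units
    fa = (~ideal & est).sum() / max((~ideal).sum(), 1)  # wrongly kept units
    return 100.0 * (hit - fa)
```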

  16. The relationship between the intelligibility of time-compressed speech and speech in noise in young and elderly listeners

    NASA Astrophysics Data System (ADS)

    Versfeld, Niek J.; Dreschler, Wouter A.

    2002-01-01

    A conventional measure to determine the ability to understand speech in noisy backgrounds is the so-called speech reception threshold (SRT) for sentences. It yields the signal-to-noise ratio (in dB) at which half of the sentences are correctly perceived. The SRT defines to what degree speech must be audible to a listener in order to become just intelligible. There are indications that elderly listeners have greater difficulty in understanding speech in adverse listening conditions than young listeners. This may be partly due to differences in hearing sensitivity (presbycusis), hence audibility, but other factors, such as temporal acuity, may also play a significant role. A potential measure of temporal acuity may be the threshold to which speech can be accelerated, or compressed in time. A new test is introduced in which the speech rate is varied adaptively. In analogy to the SRT, the time-compression threshold (TCT) is then defined as the speech rate (expressed in syllables per second) at which half of the sentences are correctly perceived. In experiment I, the TCT test is introduced and normative data are provided. In experiment II, four groups of subjects (young and elderly normal-hearing and hearing-impaired subjects) participated, and the SRTs in stationary and fluctuating speech-shaped noise were determined, as well as the TCT. The results show that the SRT in fluctuating noise and the TCT are highly correlated. All tests indicate that, even after correction for hearing loss, elderly normal-hearing subjects perform worse than young normal-hearing subjects. The results indicate that use of the TCT test or the SRT test in fluctuating noise is preferable to the SRT test in stationary noise.
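
    A simple way to picture the adaptive TCT procedure is a one-up/one-down staircase on speech rate, which converges near the 50%-correct point. The sketch below is illustrative only; the start rate, step size, and the scoring function `sentence_correct` are assumptions, not the published procedure:

```python
# Illustrative 1-up/1-down adaptive track for a time-compression threshold
# (TCT). `sentence_correct(rate)` is an assumed callable: it presents one
# sentence at `rate` syllables/s and returns True if at least half of the
# words were repeated correctly.
def track_tct(sentence_correct, start_rate=4.0, step=0.5, n_trials=30):
    rate = start_rate
    reversals = []
    last_direction = None
    for _ in range(n_trials):
        direction = +1 if sentence_correct(rate) else -1  # harder after correct
        if last_direction is not None and direction != last_direction:
            reversals.append(rate)                        # record reversal points
        last_direction = direction
        rate = max(0.5, rate + direction * step)
    # The mean rate at the reversals estimates the 50%-correct speech rate.
    return sum(reversals) / len(reversals) if reversals else rate
```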

  17. The effects of fundamental frequency contour manipulations on speech intelligibility in background noise.

    PubMed

    Miller, Sharon E; Schlauch, Robert S; Watson, Peter J

    2010-07-01

    Previous studies have documented that speech with flattened or inverted fundamental frequency (F0) contours is less intelligible than speech with natural variations in F0. The purpose of the present study was to further investigate how F0 manipulations affect speech intelligibility in background noise. Speech recognition in noise was measured for sentences having the following F0 contours: unmodified, flattened at the median, natural but exaggerated, inverted, and sinusoidally frequency modulated at rates of 2.5 and 5.0 Hz, rates shown to make vowels more perceptually salient in background noise. Five talkers produced 180 stimulus sentences, with 30 unique sentences per F0 contour condition. Flattening or exaggerating the F0 contour reduced key-word recognition performance by 13% relative to the naturally produced speech. Inverting or sinusoidally frequency modulating the F0 contour reduced performance by 23% relative to typically produced speech. These results support the notion that linguistically incorrect or misleading cues have a greater deleterious effect on speech understanding than linguistically neutral cues. PMID:20649237
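
    As an illustration of the flattening manipulation, the sketch below resynthesizes speech with its F0 fixed at the median of the voiced frames. It uses the WORLD vocoder via the third-party pyworld package; the study's own resynthesis tool is not specified here, so treat the choice as an assumption:

```python
# Sketch: resynthesis with the F0 contour flattened at the median of the
# voiced frames, using the WORLD vocoder via the third-party `pyworld`
# package (an assumption; not necessarily the study's resynthesis method).
import numpy as np
import pyworld as pw

def flatten_f0(x, fs):
    x = np.ascontiguousarray(x, dtype=np.float64)
    f0, t = pw.harvest(x, fs)              # F0 track and frame times
    sp = pw.cheaptrick(x, f0, t, fs)       # smoothed spectral envelope
    ap = pw.d4c(x, f0, t, fs)              # aperiodicity
    voiced = f0 > 0
    f0_flat = np.where(voiced, np.median(f0[voiced]), 0.0)  # flatten at median
    return pw.synthesize(f0_flat, sp, ap, fs)
```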

  18. Relative intelligibility of dynamically extracted transient versus steady-state components of speech

    NASA Astrophysics Data System (ADS)

    Boston, J. R.; Yoo, Sungyub; Li, C. C.; El-Jaroudi, Amro; Durrant, J. D.; Kovacyk, Kristie; Karn, Stacey

    2001-05-01

    Consonants are recognized to dominate the higher frequencies of the speech spectrum and to carry more information than vowels, but both demonstrate quasi-steady-state and transient components, such as vowel-to-consonant transitions. Fixed filters somewhat separate these effects, but probably not optimally, given diverse words, speakers, and situations. To enhance the transient characteristics of speech, this study used time-varying adaptive filters [Rao and Kumaresan, IEEE Trans. Speech Audio Process. 8, 240-254 (2000)], following high-pass filtering at 700 Hz (well known to have minimal effect on intelligibility), to extract the predominantly steady-state components of speech material (CVC words, NU-6). The transient component was the difference between the sum of the filter outputs and the original signal. Psychometric functions were determined in five subjects with and without background noise and fitted by ogives. The transient components contained only a small fraction of the filtered speech energy, but their PBmax was not significantly different (nonparametric ANOVA) from that of either the original or high-pass filtered speech. The steady-state components yielded significantly lower PBmax (p = 0.003) despite their much greater energy, as expected. These results suggest a potential approach to dynamic enhancement of speech intelligibility. [Work supported by ONR.]

  19. Investigation of objective measures for intelligibility prediction of noise-reduced speech for Chinese, Japanese, and English.

    PubMed

    Li, Junfeng; Xia, Risheng; Ying, Dongwen; Yan, Yonghong; Akagi, Masato

    2014-12-01

    Many objective measures have been reported to predict speech intelligibility in noise, most of which were designed and evaluated with English speech corpora. Given the different perceptual cues used by native listeners of different languages, examining whether there is any language effect when the same objective measure is used to predict speech intelligibility in different languages is of great interest, particularly when non-linear noise-reduction processing is involved. In the present study, an extensive evaluation is undertaken of objective measures for speech intelligibility prediction of noisy speech processed by noise-reduction algorithms in Chinese, Japanese, and English. Of all the objective measures tested, the short-time objective intelligibility (STOI) measure produced the most accurate results in speech intelligibility prediction for Chinese, while the normalized covariance metric (NCM) and middle-level coherence speech intelligibility index (CSIIm) incorporating the signal-dependent band-importance functions (BIFs) produced the most accurate results for Japanese and English, respectively. The objective measures that performed best in predicting the effect of non-linear noise-reduction processing on speech intelligibility were found to be the BIF-modified NCM measure for Chinese, the STOI measure for Japanese, and the BIF-modified CSIIm measure for English. Most of the objective measures examined performed differently even under the same conditions for different languages. PMID:25480075
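
    The evaluation logic common to such studies can be sketched compactly: compute an objective score per condition and correlate it with the listener scores. The example below uses the third-party pystoi package for STOI, an assumption for illustration; the study's own implementations of STOI, NCM, and CSIIm may differ in detail:

```python
# Sketch: one objective score per condition, then a Pearson correlation
# against measured intelligibility scores.
from pystoi import stoi            # third-party STOI implementation (assumed)
from scipy.stats import pearsonr

def evaluate_measure(conditions, listener_scores, fs=16000):
    """conditions: list of (clean, processed) signal pairs, one per condition."""
    predicted = [stoi(clean, processed, fs) for clean, processed in conditions]
    r, _ = pearsonr(predicted, listener_scores)
    return predicted, r
```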

  20. Phonological Accuracy and Intelligibility in Connected Speech of Boys with Fragile X Syndrome or Down Syndrome

    PubMed Central

    Barnes, Elizabeth; Roberts, Joanne; Long, Steven H.; Martin, Gary E.; Berni, Mary C.; Mandulak, Kerry C.; Sideris, John

    2008-01-01

    Purpose We compared the phonological accuracy and speech intelligibility of boys with fragile X syndrome with autism spectrum disorder (FXS-ASD), fragile X syndrome only (FXS-O), Down syndrome (DS), and typically developing (TD) boys. Method Participants were 32 boys with FXS-O (3 to 14 years), 31 with FXS-ASD (5 to 15 years), 34 with DS (4 to 16 years), and 45 TD boys of similar nonverbal mental age. We used connected speech samples to compute measures of phonological accuracy, phonological process occurrence, and intelligibility. Results The boys with FXS, regardless of autism status, did not differ from TD boys on phonological accuracy and phonological process occurrence but produced fewer intelligible words than TD boys. The boys with DS scored lower on measures of phonological accuracy and occurrence of phonological processes than all other groups and used fewer intelligible words than TD boys. The boys with FXS and the boys with DS did not differ on measures of intelligibility. Conclusion Boys with FXS, regardless of autism status, exhibit phonological characteristics similar to those of younger TD children but are less intelligible in connected speech. The boys with DS show greater delays in all phonological measures than the boys with FXS and TD boys. PMID:19641081

  1. On the relationship between auditory cognition and speech intelligibility in cochlear implant users: An ERP study.

    PubMed

    Finke, Mareike; Büchner, Andreas; Ruigendijk, Esther; Meyer, Martin; Sandmann, Pascale

    2016-07-01

    There is a high degree of variability in speech intelligibility outcomes across cochlear-implant (CI) users. To better understand how auditory cognition affects speech intelligibility with the CI, we performed an electroencephalography study in which we examined the relationship between central auditory processing, cognitive abilities, and speech intelligibility. Postlingually deafened CI users (N=13) and matched normal-hearing (NH) listeners (N=13) performed an oddball task with words presented in different background conditions (quiet, stationary noise, modulated noise). Participants had to categorize words as living (targets) or non-living entities (standards). We also assessed participants' working memory (WM) capacity and verbal abilities. For the oddball task, we found lower hit rates and prolonged response times in CI users when compared with NH listeners. Noise-related prolongation of the N1 amplitude was found for all participants. Further, we observed group-specific modulation effects of event-related potentials (ERPs) as a function of background noise. While NH listeners showed stronger noise-related modulation of the N1 latency, CI users revealed enhanced modulation effects of the N2/N4 latency. In general, higher-order processing (N2/N4, P3) was prolonged in CI users in all background conditions when compared with NH listeners. The longer N2/N4 latencies in CI users suggest that these individuals have difficulty mapping acoustic-phonetic features onto lexical representations. These difficulties seem to be increased for speech-in-noise conditions when compared with speech in quiet background. Correlation analyses showed that shorter ERP latencies were related to enhanced speech intelligibility (N1, N2/N4), better lexical fluency (N1), and lower ratings of listening effort (N2/N4) in CI users. In sum, our findings suggest that CI users and NH listeners differ with regard to both the sensory and the higher-order processing of speech in quiet as well as in

  2. Northeast Artificial Intelligence Consortium (NAIC). Volume 8. Artificial intelligence applications to speech recognition. Final report, Sep 84-Dec 89

    SciTech Connect

    Rhody, H.; Biles, J.

    1990-12-01

    The Northeast Artificial Intelligence Consortium (NAIC) was created by the Air Force Systems Command, Rome Air Development Center, and the Office of Scientific Research. Its purpose was to conduct pertinent research in artificial intelligence and to perform activities ancillary to this research. This report describes progress, over the lifetime of the NAIC, on the technical research tasks undertaken at the member universities. The topics covered in general are: versatile expert system for equipment maintenance, distributed AI for communications system control, automatic photointerpretation, time-oriented problem solving, speech understanding systems, knowledge based maintenance, hardware architectures for very large systems, knowledge based reasoning and planning, and a knowledge acquisition, assistance, and explanation system. The specific topic for this volume is the design and implementation of a knowledge-based system to read speech spectrograms.

  3. Listening with a foreign-accent: The interlanguage speech intelligibility benefit in Mandarin speakers of English

    PubMed Central

    Xie, Xin; Fowler, Carol A.

    2013-01-01

    This study examined the intelligibility of native and Mandarin-accented English speech for native English and native Mandarin listeners. In the latter group, it also examined the role of the language environment and English proficiency. Three groups of listeners were tested: native English listeners (NE), Mandarin-speaking Chinese listeners in the US (M-US) and Mandarin listeners in Beijing, China (M-BJ). As a group, M-US and M-BJ listeners were matched on English proficiency and age of acquisition. A nonword transcription task was used. Identification accuracy for word-final stops in the nonwords established two independent interlanguage intelligibility effects. An interlanguage speech intelligibility benefit for listeners (ISIB-L) was manifest by both groups of Mandarin listeners outperforming native English listeners in identification of Mandarin-accented speech. In the benefit for talkers (ISIB-T), only M-BJ listeners were more accurate identifying Mandarin-accented speech than native English speech. Thus, both Mandarin groups demonstrated an ISIB-L while only the M-BJ group overall demonstrated an ISIB-T. The English proficiency of listeners was found to modulate the magnitude of the ISIB-T in both groups. Regression analyses also suggested that the listener groups differ in their use of acoustic information to identify voicing in stop consonants. PMID:24293741

  4. Modeling the effects of a single reflection on binaural speech intelligibility.

    PubMed

    Rennies, Jan; Warzybok, Anna; Brand, Thomas; Kollmeier, Birger

    2014-03-01

    Recently the influence of delay and azimuth of a single speech reflection on speech reception thresholds (SRTs) was systematically investigated using frontal, diffuse, and lateral noise [Warzybok et al. (2013). J. Acoust. Soc. Am. 133, 269-282]. The experiments showed that the benefit of an early reflection was independent of its azimuth and mostly independent of noise type, but that the detrimental effect of a late reflection depended on its direction relative to the noise. This study tests whether different extensions of a binaural speech intelligibility model can predict these data. The extensions differ in the order in which binaural processing and temporal integration of early reflections take place. Models employing a correction for the detrimental effects of reverberation on speech intelligibility after performing the binaural processing predict SRTs in symmetric masking conditions (frontal, diffuse), but cannot predict the measured interaction of temporal and spatial integration. In contrast, a model extension accounting for the distinction between useful and detrimental reflections before the binaural processing stage predicts the data with an overall R² of 0.95. This indicates that any model framework predicting speech intelligibility in rooms should incorporate an interaction between binaural and temporal integration of reflections at a comparatively early stage. PMID:24606290
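
    The distinction drawn by the better-performing extension can be illustrated by splitting the room impulse response into useful (early) and detrimental (late) parts before any binaural stage. The 50-ms boundary below is an illustrative convention, not necessarily the model's exact value:

```python
# Sketch: early/late split of a room impulse response prior to binaural
# processing. Speech convolved with `early` acts as the target; speech
# convolved with `late` is treated as an additional masker.
import numpy as np

def split_early_late(rir, fs, boundary_ms=50.0):
    k = int(fs * boundary_ms / 1000.0)
    early, late = np.zeros_like(rir), np.zeros_like(rir)
    early[:k], late[k:] = rir[:k], rir[k:]
    return early, late
```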

  5. The Use of Artificial Neural Networks to Estimate Speech Intelligibility from Acoustic Variables: A Preliminary Analysis.

    ERIC Educational Resources Information Center

    Metz, Dale Evan; And Others

    1992-01-01

    A preliminary scheme for estimating the speech intelligibility of hearing-impaired speakers from acoustic parameters, using a computerized artificial neural network to process mathematically the acoustic input variables, is outlined. Tests with 60 hearing-impaired speakers found the scheme to be highly accurate in identifying speakers separated by…

  6. Speech Intelligibility and Marital Communication in Amyotrophic Lateral Sclerosis: An Exploratory Study

    ERIC Educational Resources Information Center

    Joubert, Karin; Bornman, Juan; Alant, Erna

    2011-01-01

    Amyotrophic lateral sclerosis (ALS), a rapidly progressive neuromuscular disease, has a devastating impact not only on individuals diagnosed with ALS but also their spouses. Speech intelligibility, often compromised as a result of dysarthria, affects the couple's ability to maintain effective, intimate communication. The purpose of this…

  7. The Effect of Fundamental Frequency on the Intelligibility of Speech with Flattened Intonation Contours

    ERIC Educational Resources Information Center

    Watson, Peter J.; Schlauch, Robert S.

    2008-01-01

    Purpose: To examine the effect of fundamental frequency (F0) on the intelligibility of speech with flattened F0 contours in noise. Method: Participants listened to sentences produced by 2 female talkers in white noise. The listening conditions included the unmodified original sentences and sentences with resynthesized F0 that reflected the average…

  8. Vowel Targeted Intervention for Children with Persisting Speech Difficulties: Impact on Intelligibility

    ERIC Educational Resources Information Center

    Speake, Jane; Stackhouse, Joy; Pascoe, Michelle

    2012-01-01

    Compared to the treatment of consonant segments, the treatment of vowels is infrequently described in the literature on children's speech difficulties. Vowel difficulties occur less frequently than those with consonants but may have significant impact on intelligibility. In order to evaluate the effectiveness of vowel targeted intervention (VTI)…

  9. Factors influencing relative speech intelligibility in patients with oral squamous cell carcinoma: a prospective study using automatic, computer-based speech analysis.

    PubMed

    Stelzle, F; Knipfer, C; Schuster, M; Bocklet, T; Nöth, E; Adler, W; Schempf, L; Vieler, P; Riemann, M; Neukam, F W; Nkenke, E

    2013-11-01

    Oral squamous cell carcinoma (OSCC) and its treatment impair speech intelligibility by alteration of the vocal tract. The aim of this study was to identify the factors of oral cancer treatment that influence speech intelligibility by means of an automatic, standardized speech-recognition system. The study group comprised 71 patients (mean age 59.89, range 35-82 years) with OSCC ranging from stage T1 to T4 (TNM staging). Tumours were located on the tongue (n=23), lower alveolar crest (n=27), and floor of the mouth (n=21). Reconstruction was conducted through local tissue plasty or microvascular transplants. Adjuvant radiotherapy was performed in 49 patients. Speech intelligibility was evaluated before, and at 3, 6, and 12 months after tumour resection, and compared to that of a healthy control group (n=40). Postoperatively, significant influences on speech intelligibility were tumour localization (P=0.010) and resection volume (P=0.019). Additionally, adjuvant radiotherapy (P=0.049) influenced intelligibility at 3 months after surgery. At 6 months after surgery, influences were resection volume (P=0.028) and adjuvant radiotherapy (P=0.034). The influence of tumour localization (P=0.001) and adjuvant radiotherapy (P=0.022) persisted after 12 months. Tumour localization, resection volume, and radiotherapy are crucial factors for speech intelligibility. Radiotherapy significantly impaired word recognition rate (WR) values with a progression of the impairment for up to 12 months after surgery. PMID:23845298

  10. The role of accent imitation in sensorimotor integration during processing of intelligible speech.

    PubMed

    Adank, Patti; Rueschemeyer, Shirley-Ann; Bekkering, Harold

    2013-01-01

    Recent theories on how listeners maintain perceptual invariance despite variation in the speech signal allocate a prominent role to imitation mechanisms. Notably, these simulation accounts propose that motor mechanisms support perception of ambiguous or noisy signals. Indeed, imitation of ambiguous signals, e.g., accented speech, has been found to aid effective speech comprehension. Here, we explored the possibility that imitation in speech benefits perception by increasing activation in speech perception and production areas. Participants rated the intelligibility of sentences spoken in an unfamiliar accent of Dutch in a functional Magnetic Resonance Imaging experiment. Next, participants in one group repeated the sentences in their own accent, while a second group vocally imitated the accent. Finally, both groups rated the intelligibility of accented sentences in a post-test. The neuroimaging results showed an interaction between type of training and pre- and post-test sessions in left Inferior Frontal Gyrus, Supplementary Motor Area, and left Superior Temporal Sulcus. Although alternative explanations such as task engagement and fatigue need to be considered as well, the results suggest that imitation may aid effective speech comprehension by supporting sensorimotor integration. PMID:24109447

  11. Spectrotemporal Modulation Sensitivity as a Predictor of Speech Intelligibility for Hearing-Impaired Listeners

    PubMed Central

    Bernstein, Joshua G.W.; Mehraei, Golbarg; Shamma, Shihab; Gallun, Frederick J.; Theodoroff, Sarah M.; Leek, Marjorie R.

    2014-01-01

    Background A model that can accurately predict speech intelligibility for a given hearing-impaired (HI) listener would be an important tool for hearing-aid fitting or hearing-aid algorithm development. Existing speech-intelligibility models do not incorporate variability in suprathreshold deficits that are not well predicted by classical audiometric measures. One possible approach to the incorporation of such deficits is to base intelligibility predictions on sensitivity to simultaneously spectrally and temporally modulated signals. Purpose The likelihood of success of this approach was evaluated by comparing estimates of spectrotemporal modulation (STM) sensitivity to speech intelligibility and to psychoacoustic estimates of frequency selectivity and temporal fine-structure (TFS) sensitivity across a group of HI listeners. Research Design The minimum modulation depth required to detect STM applied to an 86 dB SPL four-octave noise carrier was measured for combinations of temporal modulation rate (4, 12, or 32 Hz) and spectral modulation density (0.5, 1, 2, or 4 cycles/octave). STM sensitivity estimates for individual HI listeners were compared to estimates of frequency selectivity (measured using the notched-noise method at 500, 1000, 2000, and 4000 Hz), TFS processing ability (2 Hz frequency-modulation detection thresholds for 500, 1000, 2000, and 4000 Hz carriers) and sentence intelligibility in noise (at a 0 dB signal-to-noise ratio) that were measured for the same listeners in a separate study. Study Sample Eight normal-hearing (NH) listeners and 12 listeners with a diagnosis of bilateral sensorineural hearing loss participated. Data Collection and Analysis STM sensitivity was compared between NH and HI listener groups using a repeated-measures analysis of variance. A stepwise regression analysis compared STM sensitivity for individual HI listeners to
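
    For concreteness, an STM ("moving ripple") stimulus of the kind described can be built from log-spaced tone components whose amplitude modulation phase depends on each component's spectral position. The sketch below is illustrative; the component count, carrier band, and linear-amplitude modulation law are assumptions rather than the study's exact recipe:

```python
# Illustrative spectrotemporally modulated ("moving ripple") noise-like
# stimulus: many log-spaced tones, each amplitude-modulated at `rate_hz`
# with a phase offset proportional to its spectral position in octaves.
import numpy as np

def stm_ripple(dur=1.0, fs=44100, f_lo=354.0, n_oct=4, n_comp=200,
               rate_hz=4.0, density_c_per_oct=2.0, depth=1.0, seed=0):
    rng = np.random.default_rng(seed)
    t = np.arange(int(dur * fs)) / fs
    octs = rng.uniform(0.0, n_oct, n_comp)        # spectral positions (octaves)
    phases = rng.uniform(0.0, 2 * np.pi, n_comp)  # random carrier phases
    x = np.zeros_like(t)
    for o, ph in zip(octs, phases):
        amp = 1.0 + depth * np.sin(2 * np.pi * (rate_hz * t + density_c_per_oct * o))
        x += amp * np.sin(2 * np.pi * (f_lo * 2.0 ** o) * t + ph)
    return x / np.max(np.abs(x))
```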

  12. Speech Intelligibility and Accents in Speech-Mediated Interfaces: Results and Recommendations

    ERIC Educational Resources Information Center

    Lawrence, Halcyon M.

    2013-01-01

    There continues to be significant growth in the development and use of speech-mediated devices and technology products; however, there is no evidence that non-native English speech is used in these devices, despite the fact that English is now spoken by more non-native speakers than native speakers, worldwide. This relative absence of nonnative…

  13. Emotional intelligence, not music training, predicts recognition of emotional speech prosody.

    PubMed

    Trimmer, Christopher G; Cuddy, Lola L

    2008-12-01

    Is music training associated with greater sensitivity to emotional prosody in speech? University undergraduates (n = 100) were asked to identify the emotion conveyed in both semantically neutral utterances and melodic analogues that preserved the fundamental frequency contour and intensity pattern of the utterances. Utterances were expressed in four basic emotional tones (anger, fear, joy, sadness) and in a neutral condition. Participants also completed an extended questionnaire about music education and activities, and a battery of tests to assess emotional intelligence, musical perception and memory, and fluid intelligence. Emotional intelligence, not music training or music perception abilities, successfully predicted identification of intended emotion in speech and melodic analogues. The ability to recognize cues of emotion accurately and efficiently across domains may reflect the operation of a cross-modal processor that does not rely on gains of perceptual sensitivity such as those related to music training. PMID:19102595

  14. Speech intelligibility prediction in reverberation: Towards an integrated model of speech transmission, spatial unmasking, and binaural de-reverberation.

    PubMed

    Leclère, Thibaud; Lavandier, Mathieu; Culling, John F

    2015-06-01

    Room acoustic indicators of intelligibility have focused on the effects of temporal smearing of speech by reverberation and masking by diffuse ambient noise. In the presence of a discrete noise source, these indicators neglect the binaural listener's ability to separate target speech from noise. Lavandier and Culling [(2010). J. Acoust. Soc. Am. 127, 387-399] proposed a model that incorporates this ability but neglects the temporal smearing of speech, so that predictions hold for near-field targets. An extended model based on useful-to-detrimental (U/D) ratios is presented here that accounts for temporal smearing, spatial unmasking, and binaural de-reverberation in reverberant environments. The influence of the model parameters was tested by comparing the model predictions with speech reception thresholds measured in three experiments from the literature. Accurate predictions were obtained by adjusting the parameters to each room. Room-independent parameters did not lead to similar performance, suggesting that a single U/D model cannot be generalized to any room. Despite this limitation, the model framework makes it possible to propose a unified interpretation of spatial unmasking, temporal smearing, and binaural de-reverberation. PMID:26093423

  15. The effects of syllabic compression and frequency shaping on speech intelligibility in hearing impaired people.

    PubMed

    Verschuure, H; Prinsen, T T; Dreschler, W A

    1994-02-01

    The effect of syllabic compression on speech intelligibility is rarely positive, and in those cases where positive effects have been found, the same results could in general be obtained by shaping the frequency response curve. We programmed a syllabic compressor on a digital processor; the compressor differed from a conventional syllabic compressor by incorporating a delay in the signal path to suppress overshoots and thus minimize transient distortion. Furthermore, the time constants were short: an attack time of 5 msec and a release time of 15 msec. The compressor was only active in the high-frequency band. An essentially linear signal was added to deliver the low-frequency speech components. The processing resulted in a frequency response that mirrored the hearing loss near threshold and became much flatter for higher-level input signals. Speech intelligibility scores for nonsense consonant-vowel-consonant words embedded in carrier phrases were determined for hearing-impaired persons with sloping audiograms and discrimination losses for speech. Results showed little additional effect of frequency shaping beyond the existing improvement in speech scores for compressed speech. Optimum results were found for a compression ratio of 2, with lower speech scores for linear amplification and for a compression ratio of 8. We next determined the effect of providing high-frequency emphasis to the speech signal and/or to the compression control signal to compensate for the upward spread of masking. The frequency response at the root-mean-square level was adjusted according to the half-gain rule. The positive effects of moderate compression were found again; the high-frequency emphasis, however, was positive for the vowels but made consonant recognition poorer. (ABSTRACT TRUNCATED AT 250 WORDS) PMID:8194675
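
    The described compressor can be sketched as an envelope follower with a 5-ms attack and 15-ms release, a compression ratio of 2, and a short delay in the signal path so the gain anticipates overshoots. The high/low band split (compression in the high band only, linear low-frequency path) and the exact gain law are omitted or assumed here:

```python
# Sketch of a syllabic compressor with look-ahead delay. `x` is assumed to be
# a float array scaled to roughly +/-1; ref_level and lookahead are assumptions.
import numpy as np

def syllabic_compress(x, fs, ratio=2.0, attack_ms=5.0, release_ms=15.0,
                      lookahead_ms=2.0, ref_level=0.1):
    a_att = np.exp(-1.0 / (fs * attack_ms / 1000.0))   # fast attack coefficient
    a_rel = np.exp(-1.0 / (fs * release_ms / 1000.0))  # slower release coefficient
    env, gains = 0.0, np.empty(len(x))
    for i, s in enumerate(np.abs(x)):
        a = a_att if s > env else a_rel
        env = a * env + (1.0 - a) * s
        # Static curve: above ref_level, output level grows at 1/ratio the rate
        # of the input level; below it, the gain is unity.
        gains[i] = (ref_level / env) ** (1.0 - 1.0 / ratio) if env > ref_level else 1.0
    d = int(fs * lookahead_ms / 1000.0)
    delayed = np.concatenate([np.zeros(d), x[:len(x) - d]])  # delay signal, not gain
    return delayed * gains
```

    Delaying the signal rather than the gain is what lets the short attack catch onsets without audible overshoot, mirroring the design choice described in the abstract.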

  16. Channel selection in the modulation domain for improved speech intelligibility in noise

    PubMed Central

    Wójcicki, Kamil K.; Loizou, Philipos C.

    2012-01-01

    Background noise reduces the depth of the low-frequency envelope modulations known to be important for speech intelligibility. The relative strength of the target and masker envelope modulations can be quantified using a modulation signal-to-noise ratio measure, (S/N)mod. Such a measure can be used in noise-suppression algorithms to extract target-relevant modulations from the corrupted (target + masker) envelopes for potential improvements in speech intelligibility. In the present study, envelopes are decomposed in the modulation spectral domain into a number of channels spanning the range of 0–30 Hz. Target-dominant modulations are identified and retained in each channel based on the (S/N)mod selection criterion, while modulations which potentially interfere with perception of the target (i.e., those dominated by the masker) are discarded. The impact of modulation-selective processing on the speech-reception threshold for sentences in noise is assessed with normal-hearing listeners. Results indicate that the intelligibility of noise-masked speech can be improved by as much as 13 dB when preserving target-dominant modulations, present up to a modulation frequency of 18 Hz, while discarding masker-dominant modulations from the mixture envelopes. PMID:22501068
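
    The selection rule can be sketched for a single acoustic band: decompose the mixture envelope into modulation channels up to 30 Hz, keep the channels where the target dominates, and discard the rest. Oracle access to the separate target and masker envelopes mirrors the ideal-processing design of such studies; the 0-dB threshold below is illustrative:

```python
# Sketch of (S/N)mod-based modulation-channel selection in one acoustic band.
import numpy as np

def select_modulations(env_target, env_masker, fs_env, fmax=30.0, thresh_db=0.0):
    n = len(env_target)
    T, M = np.fft.rfft(env_target), np.fft.rfft(env_masker)
    freqs = np.fft.rfftfreq(n, 1.0 / fs_env)
    snr_mod = 20.0 * np.log10(np.abs(T) / np.maximum(np.abs(M), 1e-12))
    keep = (freqs <= fmax) & (snr_mod >= thresh_db)
    keep[0] = True                              # always keep the DC component
    # Mixture envelope with masker-dominated modulation channels removed.
    return np.fft.irfft((T + M) * keep, n)
```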

  17. The combined effects of reverberation and noise on speech intelligibility by cochlear implant listeners

    PubMed Central

    Hazrati, Oldooz; Loizou, Philipos C.

    2013-01-01

    Objective The purpose of this study is to assess the individual effect of reverberation and noise, as well as their combined effect, on speech intelligibility by cochlear implant (CI) users. Design Sentence stimuli corrupted by reverberation, noise, and reverberation + noise are presented to 11 CI listeners for word identification. They are tested in two reverberation conditions (T60 = 0.6 s, 0.8 s), two noise conditions (SNR = 5 dB, 10 dB), and four reverberation + noise conditions. Study sample Eleven CI users participated. Results Results indicated that reverberation degrades speech intelligibility to a greater extent than additive noise (speech-shaped noise), at least for the SNR levels tested. The combined effects were greater than those introduced by either reverberation or noise alone. Conclusions The effect of reverberation on speech intelligibility by CI users was found to be larger than that by noise. The results from the present study highlight the importance of testing CI users in reverberant conditions, since testing in noise-alone conditions might underestimate the difficulties they experience in their daily lives where reverberation and noise often coexist. PMID:22356300

  18. Intelligibility of foreign-accented speech: Effects of listening condition, listener age, and listener hearing status

    NASA Astrophysics Data System (ADS)

    Ferguson, Sarah Hargus

    2005-09-01

    It is well known that, for listeners with normal hearing, speech produced by non-native speakers of the listener's first language is less intelligible than speech produced by native speakers. Intelligibility is well correlated with listeners' ratings of talker comprehensibility and accentedness, which have been shown to be related to several talker factors, including age of second-language acquisition and level of similarity between the talker's native and second language phoneme inventories. Relatively few studies have focused on factors extrinsic to the talker. The current project explored the effects of listener and environmental factors on the intelligibility of foreign-accented speech. Specifically, monosyllabic English words previously recorded from two talkers, one a native speaker of American English and the other a native speaker of Spanish, were presented to three groups of listeners (young listeners with normal hearing, elderly listeners with normal hearing, and elderly listeners with hearing impairment; n=20 each) in three different listening conditions (undistorted words in quiet, undistorted words in 12-talker babble, and filtered words in quiet). Data analysis will focus on interactions between talker accent, listener age, listener hearing status, and listening condition. [Project supported by an American Speech-Language-Hearing Association AARC Award.]

  19. Effects of a single reflection with varied horizontal angle and time delay on speech intelligibility.

    PubMed

    Nakajima, T; Ando, Y

    1991-12-01

    Previously, almost all physical measures for estimating speech intelligibility in a room have been derived from temporal-monaural criteria only. This paper shows that speech intelligibility for a sound field with a single reflection depends not only on the temporal-monaural factor but also on the spatial-binaural factor of the sound field. Articulation tests for sound fields simulated with a single reflection of delay time Δt1 after the direct sound were conducted while varying the horizontal incident angle ξ of the reflection. The main findings are as follows: (1) speech intelligibility (SI) decreases with increasing delay time Δt1; (2) SI increases as ξ approaches 90 degrees, so the horizontal angle of the reflection has a significant effect on SI; and (3) the analysis of variance for the articulation test scores clearly demonstrated that the effects of Δt1 and ξ on SI are fully independent. Concerning result (2), if listeners obtain a spatial separation of signals at the two ears, the listener's capability for speech perception is assumed to be improved due to "adding" further information to the temporal pattern recognition. PMID:1787252

  20. The relative importance of temporal envelope information for intelligibility prediction: a study on cochlear-implant vocoded speech.

    PubMed

    Chen, Fei

    2011-10-01

    Vocoder simulation has long been applied as an effective tool to assess factors influencing the intelligibility of cochlear implant listeners. Considering that the temporal envelope information contained in contiguous bands of vocoded speech is correlated and redundant, this study examined the hypothesis that an intelligibility measure evaluating the distortions of a small number of selected envelope cues is sufficient to predict intelligibility scores well. Speech intelligibility data from 80 conditions were collected in vocoder simulation experiments involving 22 normal-hearing listeners. The relative importance of temporal envelope information in cochlear-implant vocoded speech was modeled by correlating its speech-transmission indices (STIs) with the intelligibility scores. The relative-importance pattern was subsequently used to determine a binary weight vector over the STIs of all envelopes to compute an index predicting speech intelligibility. A high correlation (r=0.95) was obtained when selecting a small number (e.g., 4 out of 20) of temporal envelope cues from disjoint bands to predict the intelligibility of cochlear-implant vocoded speech. PMID:21546304

  1. Reference-Free Assessment of Speech Intelligibility Using Bispectrum of an Auditory Neurogram

    PubMed Central

    Hossain, Mohammad E.; Jassim, Wissam A.; Zilany, Muhammad S. A.

    2016-01-01

    Sensorineural hearing loss occurs due to damage to the inner and outer hair cells of the peripheral auditory system. Hearing loss can cause decreases in audibility, dynamic range, frequency and temporal resolution of the auditory system, and all of these effects are known to affect speech intelligibility. In this study, a new reference-free speech intelligibility metric is proposed using 2-D neurograms constructed from the output of a computational model of the auditory periphery. The responses of the auditory-nerve fibers with a wide range of characteristic frequencies were simulated to construct neurograms. The features of the neurograms were extracted using third-order statistics referred to as bispectrum. The phase coupling of neurogram bispectrum provides a unique insight for the presence (or deficit) of supra-threshold nonlinearities beyond audibility for listeners with normal hearing (or hearing loss). The speech intelligibility scores predicted by the proposed method were compared to the behavioral scores for listeners with normal hearing and hearing loss both in quiet and under noisy background conditions. The results were also compared to the performance of some existing methods. The predicted results showed a good fit with a small error suggesting that the subjective scores can be estimated reliably using the proposed neural-response-based metric. The proposed metric also had a wide dynamic range, and the predicted scores were well-separated as a function of hearing loss. The proposed metric successfully captures the effects of hearing loss and supra-threshold nonlinearities on speech intelligibility. This metric could be applied to evaluate the performance of various speech-processing algorithms designed for hearing aids and cochlear implants. PMID:26967160
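
    The third-order statistic at the heart of the metric is the bispectrum, B(f1, f2) = E[X(f1) X(f2) X*(f1 + f2)], commonly estimated by averaging over segments (here, for example, windows cut from one row of a neurogram). A direct, unoptimized sketch:

```python
# Direct bispectrum estimator; suitable only for short segments because of
# the O(n^2) loop over frequency pairs.
import numpy as np

def bispectrum(segments):
    """segments: 2-D array of shape (n_segments, n_samples)."""
    X = np.fft.fft(segments, axis=1)
    n = X.shape[1]
    B = np.zeros((n, n), dtype=complex)
    for f1 in range(n):
        for f2 in range(n):
            B[f1, f2] = np.mean(X[:, f1] * X[:, f2] * np.conj(X[:, (f1 + f2) % n]))
    return B
```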

  3. The effect of vocal and demographic traits on speech intelligibility over bone conduction.

    PubMed

    Pollard, Kimberly A; Tran, Phuong K; Letowski, Tomasz

    2015-04-01

    Bone conduction (BC) communication systems provide benefits over air conduction systems but are not in widespread use, partly due to problems with speech intelligibility. Contributing factors like device location and background noise have been explored, but little attention has been paid to the role of individual user differences. Because BC signals travel through an individual's skull and facial tissues, demographic factors such as user age, sex, race, or regional origin may influence sound transmission. Vocal traits such as pitch, spectral tilt, jitter, and shimmer may also play a role. Along with microphone placement and background noise, these factors can affect BC speech intelligibility. Eight diverse talkers were recorded with bone microphones on two different skull locations and in different background noise conditions. Twenty-four diverse listeners listened to these samples over BC and completed Modified Rhyme Tests for speech intelligibility. Forehead bone recordings were more intelligible than condyle recordings. In condyle recordings, female talkers, talkers with high fundamental frequency, and talkers in background noise were understood better, as were communications between talkers and listeners of the same regional origin. Listeners' individual traits had no significant effects. Thoughtful application of this knowledge can help improve BC communication for diverse users. PMID:25920856

  4. Within-speaker speech intelligibility in dysarthria: Variation across a reading passage

    NASA Astrophysics Data System (ADS)

    Yunusova, Yana; Weismer, Gary; Westbury, John; Rusche, Nicole

    2001-05-01

    Understanding the factors underlying intelligibility deficits in dysarthria is important for clinical and theoretical reasons. Correlation/regression analyses between intelligibility measures and various speech production measures (e.g., acoustic or phonetic) are often reported in the literature. However, such analyses rarely control for the effect of a third variable (severity of the speech disorder, in this case) likely to be correlated with the primary variables. The current report controls for this effect by using a within-speaker analysis approach. Factors hypothesized to underlie intelligibility variations across multiple breath groups within a connected discourse included structural elements (e.g., number of total words) as well as acoustic measures (e.g., F2 variation). Results showed that speech intelligibility in dysarthric speakers with two forms of neurological disease (Parkinson disease and ALS) does, in fact, vary across breath groups extracted from a connected discourse, and that these variations are related in some cases to a per-breath estimate of F2 variation. [Work supported by NIDCD Award No. R01 DC03723.]

  5. Objective measures for predicting speech intelligibility in noisy conditions based on new band-importance functions

    PubMed Central

    Ma, Jianfen; Hu, Yi; Loizou, Philipos C.

    2009-01-01

    The articulation index (AI), speech-transmission index (STI), and coherence-based intelligibility metrics have been evaluated primarily in steady-state noisy conditions and have not been tested extensively in fluctuating noise conditions. The aim of the present work is to evaluate the performance of new speech-based STI measures, modified coherence-based measures, and AI-based measures operating on short-term (30 ms) intervals in realistic noisy conditions. Much emphasis is placed on the design of new band-importance weighting functions which can be used in situations wherein speech is corrupted by fluctuating maskers. The proposed measures were evaluated with intelligibility scores obtained by normal-hearing listeners in 72 noisy conditions involving noise-suppressed speech (consonants and sentences) corrupted by four different maskers (car, babble, train, and street interferences). Of all the measures considered, the modified coherence-based measures and speech-based STI measures incorporating signal-specific band-importance functions yielded the highest correlations (r=0.89–0.94). In particular, the modified coherence measure that included only vowel/consonant transitions and weak consonant information yielded the highest correlation (r=0.94) with sentence recognition scores. The results from this study clearly suggest that the traditional AI and STI indices could benefit from the use of the proposed signal- and segment-dependent band-importance functions. PMID:19425678

  6. Stimulation of the pedunculopontine nucleus area in Parkinson's disease: effects on speech and intelligibility.

    PubMed

    Pinto, Serge; Ferraye, Murielle; Espesser, Robert; Fraix, Valérie; Maillet, Audrey; Guirchoum, Jennifer; Layani-Zemour, Deborah; Ghio, Alain; Chabardès, Stéphan; Pollak, Pierre; Debû, Bettina

    2014-10-01

    Improvement of gait disorders following pedunculopontine nucleus area stimulation in patients with Parkinson's disease has previously been reported and led us to propose this surgical treatment to patients who progressively developed severe gait disorders and freezing despite optimal dopaminergic drug treatment and subthalamic nucleus stimulation. The outcome of our prospective study on the first six patients was somewhat mitigated, as freezing of gait and falls related to freezing were improved by low frequency electrical stimulation of the pedunculopontine nucleus area in some, but not all, patients. Here, we report the speech data prospectively collected in these patients with Parkinson's disease. Indeed, because subthalamic nucleus surgery may lead to speech impairment and a worsening of dysarthria in some patients with Parkinson's disease, we felt it was important to precisely examine any possible modulations of speech for a novel target for deep brain stimulation. Our results suggested a trend towards speech degradation related to the pedunculopontine nucleus area surgery (off stimulation) for aero-phonatory control (maximum phonation time), phono-articulatory coordination (oral diadochokinesis) and speech intelligibility. Possibly, the observed speech degradation may also be linked to the clinical characteristics of the group of patients. The influence of pedunculopontine nucleus area stimulation per se was more complex, depending on the nature of the task: it had a deleterious effect on maximum phonation time and oral diadochokinesis, and mixed effects on speech intelligibility. Whereas levodopa intake and subthalamic nucleus stimulation alone had no and positive effects on speech dimensions, respectively, a negative interaction between the two treatments was observed both before and after pedunculopontine nucleus area surgery. This combination effect did not seem to be modulated by pedunculopontine nucleus area stimulation. Although limited in our group of

  7. Spectral density affects the intelligibility of tone-vocoded speech: Implications for cochlear implant simulations.

    PubMed

    Rosen, Stuart; Zhang, Yue; Speers, Kathryn

    2015-09-01

    For small numbers of channels, tone vocoders using low envelope cutoff frequencies are less intelligible than noise vocoders, even though the noise carriers introduce random fluctuations into the crucial envelope information. Here it is shown that using tone carriers with a denser spectrum improves performance considerably over typical tone vocoders, at least equaling, and often surpassing, the performance possible with noise vocoders. In short, the spectral sparseness of tone vocoded sounds for low channel numbers, separate from the degradations introduced by using only a small number of channels, is an important limitation on the intelligibility of tone-vocoded speech. PMID:26428833
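
    A tone vocoder in which each analysis channel can drive several carriers spread across the channel (a "denser spectrum") can be sketched as below; filter orders, the 30-Hz envelope cutoff, and the carrier placement are illustrative assumptions, not the study's exact parameters:

```python
# Sketch of a tone vocoder with an adjustable number of carriers per channel.
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def tone_vocode(x, fs, n_ch=4, f_lo=100.0, f_hi=5000.0,
                carriers_per_ch=1, env_cut=30.0):
    edges = np.geomspace(f_lo, f_hi, n_ch + 1)           # log-spaced band edges
    sos_env = butter(2, env_cut, 'low', fs=fs, output='sos')
    t = np.arange(len(x)) / fs
    y = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], 'bandpass', fs=fs, output='sos')
        env = sosfilt(sos_env, np.abs(hilbert(sosfilt(sos, x))))  # channel envelope
        for fc in np.geomspace(lo * 1.1, hi / 1.1, carriers_per_ch):
            y += env * np.sin(2 * np.pi * fc * t) / carriers_per_ch
    return y / np.max(np.abs(y))
```

    With carriers_per_ch=1 this is a conventional sparse tone vocoder; raising carriers_per_ch densifies the carrier spectrum while leaving the channel envelopes unchanged, which is the manipulation at issue here.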

  8. Synthesized Speech Intelligibility and Early Preschool-Age Children: Comparing Accuracy for Single-Word Repetition with Repeated Exposure

    ERIC Educational Resources Information Center

    Pinkoski-Ball, Carrie L.; Reichle, Joe; Munson, Benjamin

    2012-01-01

    Purpose: This investigation examined the effect of repeated exposure to novel and repeated spoken words in typical environments on the intelligibility of 2 synthesized voices and human recorded speech in preschools. Method: Eighteen preschoolers listened to and repeated single words presented in human-recorded speech, DECtalk Paul, and AT&T Voice…

  9. Phonological Accuracy and Intelligibility in Connected Speech of Boys with Fragile X Syndrome or Down Syndrome

    ERIC Educational Resources Information Center

    Barnes, Elizabeth; Roberts, Joanne; Long, Steven H.; Martin, Gary E.; Berni, Mary C.; Mandulak, Kerry C.; Sideris, John

    2009-01-01

    Purpose: To compare the phonological accuracy and speech intelligibility of boys with fragile X syndrome with autism spectrum disorder (FXS-ASD), fragile X syndrome only (FXS-O), Down syndrome (DS), and typically developing (TD) boys. Method: Participants were 32 boys with FXS-O (3-14 years), 31 with FXS-ASD (5-15 years), 34 with DS (4-16 years),…

  10. Effect of Gender and Sound Spatialization on Speech Intelligibility in Multiple Speaker Environment

    NASA Astrophysics Data System (ADS)

    Joshi, M.; Iyer, M.; Gupta, N.; Barreto, A.

    In multiple-speaker environments, such as teleconferences, we observe a loss of intelligibility, particularly if the sound is monaural in nature. In this study, we exploit the "Cocktail Party Effect", whereby a person can isolate one sound above all others using sound localization and gender cues. To improve the clarity of speech, each speaker is assigned a direction using Head-Related Transfer Functions (HRTFs), which creates an auditory map of the multiple conversations. A mixture of male and female voices is used to improve comprehension.
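
    The spatialization step amounts to convolving each talker's mono signal with a head-related impulse response (HRIR) pair for the assigned azimuth. A minimal sketch, assuming measured HRIRs are available (e.g., from a public HRTF database):

```python
# Sketch: binaural rendering of one talker; summing the outputs for several
# talkers (each with a different HRIR pair) yields the spatialized mix.
import numpy as np
from scipy.signal import fftconvolve

def spatialize(mono, hrir_left, hrir_right):
    left = fftconvolve(mono, hrir_left)
    right = fftconvolve(mono, hrir_right)
    return np.stack([left, right], axis=-1)   # (n_samples, 2) binaural signal
```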

  11. The influence of visual speech information on the intelligibility of English consonants produced by non-native speakers.

    PubMed

    Kawase, Saya; Hannah, Beverly; Wang, Yue

    2014-09-01

    This study examines how visual speech information affects native judgments of the intelligibility of speech sounds produced by non-native (L2) speakers. Native Canadian English perceivers as judges perceived three English phonemic contrasts (/b-v, θ-s, l-ɹ/) produced by native Japanese speakers as well as native Canadian English speakers as controls. These stimuli were presented under audio-visual (AV, with speaker voice and face), audio-only (AO), and visual-only (VO) conditions. The results showed that, across conditions, the overall intelligibility of Japanese productions of the native (Japanese)-like phonemes (/b, s, l/) was significantly higher than the non-Japanese phonemes (/v, θ, ɹ/). In terms of visual effects, the more visually salient non-Japanese phonemes /v, θ/ were perceived as significantly more intelligible when presented in the AV compared to the AO condition, indicating enhanced intelligibility when visual speech information is available. However, the non-Japanese phoneme /ɹ/ was perceived as less intelligible in the AV compared to the AO condition. Further analysis revealed that, unlike the native English productions, the Japanese speakers produced /ɹ/ without visible lip-rounding, indicating that non-native speakers' incorrect articulatory configurations may decrease the degree of intelligibility. These results suggest that visual speech information may either positively or negatively affect L2 speech intelligibility. PMID:25190408

  12. Intelligibility of time-compressed speech: the effect of uniform versus non-uniform time-compression algorithms.

    PubMed

    Schlueter, Anne; Lemke, Ulrike; Kollmeier, Birger; Holube, Inga

    2014-03-01

    For assessing hearing aid algorithms, a method is sought to shift the threshold of a speech-in-noise test to (mostly positive) signal-to-noise ratios (SNRs) that allow discrimination across algorithmic settings and are most relevant for hearing-impaired listeners in daily life. Hence, time-compressed speech with higher speech rates was evaluated to parametrically increase the difficulty of the test while preserving most of the relevant acoustical speech cues. A uniform and a non-uniform algorithm were used to compress the sentences of the German Oldenburg Sentence Test at different speech rates. In comparison, the non-uniform algorithm exhibited greater deviations from the targeted time compression, as well as greater changes of the phoneme duration, spectra, and modulation spectra. Speech intelligibility for fast Oldenburg sentences in background noise at different SNRs was determined with 48 normal-hearing listeners. The results confirmed decreasing intelligibility with increasing speech rate. Speech had to be compressed to more than 30% of its original length to reach 50% intelligibility at positive SNRs. Characteristics influencing the discrimination ability of the test for assessing effective SNR changes were investigated. Subjective and objective measures indicated a clear advantage of the uniform algorithm in comparison to the non-uniform algorithm for the application in speech-in-noise tests. PMID:24606289
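
    Uniform time compression itself is easy to sketch with an off-the-shelf phase vocoder; this is an assumption for illustration, since the study used its own uniform and non-uniform algorithms. Compressing speech to 30% of its original length corresponds to a stretch rate of roughly 3.3:

```python
# Sketch: uniform time compression via librosa's phase vocoder.
import librosa

def compress_uniform(path, rate):
    y, sr = librosa.load(path, sr=None)        # keep the native sample rate
    return librosa.effects.time_stretch(y, rate=rate), sr
```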

  13. Evolution of the speech intelligibility of prelinguistically deaf children who received a cochlear implant

    NASA Astrophysics Data System (ADS)

    Bouchard, Marie-Eve; Cohen, Henri; Lenormand, Marie-Therese

    2005-04-01

    The two main objectives of this investigation were (1) to assess the evolution of the speech intelligibility of 12 prelinguistically deaf children implanted between 25 and 78 months of age and (2) to clarify the influence of age at implantation on intelligibility. Speech productions were video-recorded at 6, 18, and 36 months following surgery during a standardized free-play session. Selected syllables were then presented to 40 adult listeners, who were asked to identify the vowels or consonants they heard and to judge the quality of the segments. Perceived vowels were then located in the vocalic space, whereas consonants were classified according to voicing, manner, and place of articulation. A 3 (Groups) × 3 (Times) ANOVA with repeated measures revealed a clear influence of time as well as age at implantation on the acquisition patterns. The speech intelligibility of these implanted children tended to improve as their experience with the device increased. Based on these results, it is proposed that sensory restoration following cochlear implantation served as a probe for developing articulatory strategies that allowed the children to reach the intended acoustico-perceptual targets.

  14. Comparisons of Auditory Performance and Speech Intelligibility after Cochlear Implant Reimplantation in Mandarin-Speaking Users

    PubMed Central

    Hwang, Chung-Feng; Ko, Hui-Chen; Tsou, Yung-Ting; Chan, Kai-Chieh; Fang, Hsuan-Yeh; Wu, Che-Ming

    2016-01-01

    Objectives. We evaluated the causes, hearing, and speech performance before and after cochlear implant reimplantation in Mandarin-speaking users. Methods. In total, 589 patients who underwent cochlear implantation in our medical center between 1999 and 2014 were reviewed retrospectively. Data related to demographics, etiologies, implant-related information, complications, and hearing and speech performance were collected. Results. In total, 22 (3.74%) cases were found to have major complications. Infection (n = 12) and hard failure of the device (n = 8) were the most common major complications. Among them, 13 were reimplanted in our hospital. The mean scores of the Categorical Auditory Performance (CAP) and the Speech Intelligibility Rating (SIR) obtained before and after reimplantation were 5.5 versus 5.8 and 3.7 versus 4.3, respectively. The SIR score after reimplantation was significantly better than before the operation. Conclusions. Cochlear implantation is a safe procedure with low rates of postsurgical revisions and device failures. The Mandarin-speaking patients in this study who received reimplantation had restored auditory performance and speech intelligibility after surgery. Device soft failure was rare in our series, which calls attention to Mandarin-speaking CI users who require revision of their implants because of undesirable symptoms or decreasing performance of uncertain cause. PMID:27413753

  15. Auditory "bubbles": Efficient classification of the spectrotemporal modulations essential for speech intelligibility.

    PubMed

    Venezia, Jonathan H; Hickok, Gregory; Richards, Virginia M

    2016-08-01

    Speech intelligibility depends on the integrity of spectrotemporal patterns in the signal. The current study is concerned with the speech modulation power spectrum (MPS), which is a two-dimensional representation of energy at different combinations of temporal and spectral (i.e., spectrotemporal) modulation rates. A psychophysical procedure was developed to identify the regions of the MPS that contribute to successful reception of auditory sentences. The procedure, based on the two-dimensional image classification technique known as "bubbles" (Gosselin and Schyns (2001). Vision Res. 41, 2261-2271), involves filtering (i.e., degrading) the speech signal by removing parts of the MPS at random, and relating filter patterns to observer performance (keywords identified) over a number of trials. The result is a classification image (CImg) or "perceptual map" that emphasizes regions of the MPS essential for speech intelligibility. This procedure was tested using normal-rate and 2×-time-compressed sentences. The results indicated: (a) CImgs could be reliably estimated in individual listeners in relatively few trials, and (b) CImgs tracked changes in spectrotemporal modulation energy induced by time compression, though not completely, indicating that "perceptual maps" deviated from physical stimulus energy. … PMID:27586738
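
    The "bubbles" filtering step can be illustrated with a small sketch: a random mask of 2-D Gaussian windows is applied to the MPS, keeping cells under a bubble and removing the rest. The array shape, bubble count, and bubble width below are illustrative assumptions, not the authors' parameters:

      import numpy as np

      def bubble_mask(shape, n_bubbles=15, sigma=3.0, rng=None):
          # Sum of 2-D Gaussian "bubbles" at random MPS locations,
          # clipped to [0, 1]; cells under a bubble are retained.
          rng = rng or np.random.default_rng()
          rows, cols = np.indices(shape)
          mask = np.zeros(shape)
          for _ in range(n_bubbles):
              r0, c0 = rng.integers(shape[0]), rng.integers(shape[1])
              mask += np.exp(-((rows - r0) ** 2 + (cols - c0) ** 2)
                             / (2 * sigma ** 2))
          return np.clip(mask, 0.0, 1.0)

      # mps_filtered = mps * bubble_mask(mps.shape)  # mps: 2-D MPS array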

  16. A physical method for estimating speech intelligibility in a reverberant sound field

    NASA Astrophysics Data System (ADS)

    Ogura, Y.

    1984-12-01

    The MTF-STI method, a physical method for measuring the quality of speech transmission in a tunnel, was investigated, and the STI, which can be deduced from the MTF, appears to correlate highly with the sound articulation score. The character of the information loss represented by the MTF and the system for calculating the MTF are considered. In this system, the effect of reverberation on the MTF is calculated from the impulse response in a tunnel, and the effect of noise is considered separately from the effect of reverberation. The MTF is converted to the STI (Speech Transmission Index), which corresponds directly to speech intelligibility. Essentially, the STI represents an extension of the Articulation Index (AI) concept; therefore, the values of the parameters used in the STI calculation were determined from the parameters of the AI for Japanese. The resulting STI correlates highly with -log(1-s), where s is the sound articulation score. The data suggest that the STI may serve as a convenient predictor of speech intelligibility in a tunnel.
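
    The MTF-to-STI conversion follows the classic Houtgast and Steeneken scheme: each MTF value is converted to an apparent signal-to-noise ratio, clipped to ±15 dB, mapped linearly to a transmission index, and averaged across modulation frequencies and octave bands. A sketch (equal band weights are used for illustration; standardized STI uses band-specific weights):

      import numpy as np

      def sti_from_mtf(m, band_weights=None):
          # m: (n_octave_bands, n_modulation_freqs) MTF values in (0, 1).
          m = np.clip(np.asarray(m, dtype=float), 1e-6, 1 - 1e-6)
          snr_app = 10.0 * np.log10(m / (1.0 - m))  # apparent SNR per cell
          snr_app = np.clip(snr_app, -15.0, 15.0)   # limit to +/-15 dB
          ti = (snr_app + 15.0) / 30.0              # transmission index [0, 1]
          mti = ti.mean(axis=1)                     # modulation TI per band
          if band_weights is None:                  # equal weights here
              band_weights = np.full(len(mti), 1 / len(mti))
          return float(np.dot(band_weights, mti))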

  17. Intelligibility of American English Vowels of Native and Non-Native Speakers in Quiet and Speech-Shaped Noise

    ERIC Educational Resources Information Center

    Liu, Chang; Jin, Su-Hyun

    2013-01-01

    This study examined intelligibility of twelve American English vowels produced by English, Chinese, and Korean native speakers in quiet and speech-shaped noise in which vowels were presented at six sensation levels from 0 dB to 10 dB. The slopes of vowel intelligibility functions and the processing time for listeners to identify vowels were…

  18. Preparing an E-learning-based Speech Therapy (EST) efficacy study: Identifying suitable outcome measures to detect within-subject changes of speech intelligibility in dysarthric speakers.

    PubMed

    Beijer, L J; Rietveld, A C M; Ruiter, M B; Geurts, A C H

    2014-12-01

    We explored the suitability of perceptual and acoustic outcome measures to prepare E-learning based Speech Therapy (EST) efficacy tests regarding speech intelligibility in dysarthric speakers. Eight speakers with stroke (n=3), Parkinson's disease (n=4) and traumatic brain injury (n=1) participated in a 4 weeks EST trial. A repeated measures design was employed. Perceptual measures were (a) scale ratings for "ease of intelligibility" and "pleasantness" in continuous speech and (b) orthographic transcription scores of semantically unpredictable sentences. Acoustic measures were (c) "intensity during closure" (ΔIDC) in the occlusion phase of voiceless plosives, (d) changes in the vowel space of /a/, /e/ and /o/ and (e) the F0 variability in semantically unpredictable sentences. The only consistent finding concerned an increased (instead of the expected decreased) ΔIDC after EST, possibly caused by increased speech intensity without articulatory adjustments. The importance of suitable perceptual and acoustic measures for efficacy research is discussed. PMID:25025268

  19. Effects of cross-language voice training on speech perception: Whose familiar voices are more intelligible?

    PubMed Central

    Levi, Susannah V.; Winters, Stephen J.; Pisoni, David B.

    2011-01-01

    Previous research has shown that familiarity with a talker’s voice can improve linguistic processing (herein, “Familiar Talker Advantage”), but this benefit is constrained by the context in which the talker’s voice is familiar. The current study examined how familiarity affects intelligibility by manipulating the type of talker information available to listeners. One group of listeners learned to identify bilingual talkers’ voices from English words, where they learned language-specific talker information. A second group of listeners learned the same talkers from German words, and thus only learned language-independent talker information. After voice training, both groups of listeners completed a word recognition task with English words produced by both familiar and unfamiliar talkers. Results revealed that English-trained listeners perceived more phonemes correct for familiar than unfamiliar talkers, while German-trained listeners did not show improved intelligibility for familiar talkers. The absence of a processing advantage in speech intelligibility for the German-trained listeners demonstrates limitations on the Familiar Talker Advantage, which crucially depends on the language context in which the talkers’ voices were learned; knowledge of how a talker produces linguistically relevant contrasts in a particular language is necessary to increase speech intelligibility for words produced by familiar talkers. PMID:22225059

  20. Bidirectional clear speech perception benefit for native and high-proficiency non-native talkers and listeners: Intelligibility and accentedness

    PubMed Central

    Smiljanić, Rajka; Bradlow, Ann R.

    2011-01-01

    This study investigated how native language background interacts with speaking style adaptations in determining levels of speech intelligibility. The aim was to explore whether native and high proficiency non-native listeners benefit similarly from native and non-native clear speech adjustments. The sentence-in-noise perception results revealed that fluent non-native listeners gained a large clear speech benefit from native clear speech modifications. Furthermore, proficient non-native talkers in this study implemented conversational-to-clear speaking style modifications in their second language (L2) that resulted in significant intelligibility gain for both native and non-native listeners. The results of the accentedness ratings obtained for native and non-native conversational and clear speech sentences showed that while intelligibility was improved, the presence of foreign accent remained constant in both speaking styles. This suggests that objective intelligibility and subjective accentedness are two independent dimensions of non-native speech. Overall, these results provide strong evidence that greater experience in L2 processing leads to improved intelligibility in both production and perception domains. These results also demonstrated that speaking style adaptations along with less signal distortion can contribute significantly towards successful native and non-native interactions. PMID:22225056

  1. The correlation between subjective and objective measures of coded speech quality and intelligibility following noise corruption

    NASA Astrophysics Data System (ADS)

    Kayser, J. A.

    1981-12-01

    A scoring metric of speech intelligibility based on linear predictive coding (LPC) was developed and evaluated. The data base used for evaluating the metric consisted of a list of 50 words from the Modified Rhyme Test. The list was transmitted over an LPC-10 vocoder with no background noise. The list was scored subjectively for intelligibility by a trained listener panel. The subjective scores were used to judge the effectiveness of the objective metric. The LPC scoring metric was calculated for the list of words and compared to the subjective scoring. The intelligibility score for the objective scoring metric was 82.99% with a standard deviation of 14.41%. The score for the subjective listener testing was 84.91% with a standard deviation of 7.47%. This shows a possible correlation between the objective LPC scoring metric and standard subjective listener scoring methods.
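
    The report does not specify the metric's exact form; as an illustration of the general approach, a classic LPC-based objective distance is the log-likelihood ratio between LPC fits of the reference and coded frames, sketched below (frame selection and averaging over the utterance are omitted):

      import numpy as np
      import librosa
      from scipy.linalg import toeplitz

      def llr_frame(clean, degraded, order=10):
          # LPC fits of the two frames (leading coefficient is 1).
          a_c = librosa.lpc(clean, order=order)
          a_d = librosa.lpc(degraded, order=order)
          # Autocorrelation matrix of the clean frame.
          r = np.array([np.dot(clean[: len(clean) - k], clean[k:])
                        for k in range(order + 1)])
          R = toeplitz(r)
          return float(np.log((a_d @ R @ a_d) / (a_c @ R @ a_c)))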

  2. Development of speech intelligibility and narrative abilities and their interrelationship three and five years after paediatric cochlear implantation.

    PubMed

    Huttunen, Kerttu

    2008-11-01

    This study sought to determine the level of speech intelligibility, narrative abilities, and their interrelationship in 18 Finnish children implanted at the average age of three years, four months. Additionally, background factors associated with speech intelligibility and storytelling ability were examined. Speech intelligibility was examined by means of an item identification task with five listeners per child. Three and five years after activation of the implant, the children reached average intelligibility scores of 53% and 81%, respectively. The story generation abilities of the implanted children exceeded their hearing age by one year, on average. This was found after comparing their results with those of normally-hearing two- to six-year-olds (N = 49). According to multiple regression analysis, comorbidity (number of additional needs), chronological age, and/or age at activation usually explained from 46% to 70% of the variation in speech intelligibility and narrative abilities. After controlling for age, communication mode, and number of additional needs, speech intelligibility and ability to narrate were statistically significantly associated with each other three years after activation, but no longer five years after activation. PMID:19012111

  3. Role of intelligence tests in speech/language referrals.

    PubMed

    Sparks, R; Ganschow, L; Thomas, A

    1996-08-01

    This study examined the relation of the WISC-R Verbal IQ with measures of oral and written language among 190 students referred to a private educational clinic over a 5-yr. period. Correlations of Verbal IQ with scores on measures of oral language, written language, receptive language, reading comprehension, and basic reading skills were calculated for the total sample and by Grades 1-3, 4-7, and 8-11. Standard regression coefficients were used to estimate the proportion of variance explained by these five measures. Significant correlations were found for Verbal IQ with the measures, ranging from .36 (Basic Reading Skills) to .69 (Receptive Vocabulary). Multiple regression indicated that 59% of the variance was explained by the five measures and that three--Oral Language, Receptive Vocabulary, and Reading Comprehension--contributed significantly to Verbal IQ. Correlations across grades showed inconsistent differences by grade for Verbal IQ with language variables. Implications for speech-language referral practices are discussed. PMID:8873192

  4. Analysis of masking effects on speech intelligibility with respect to moving sound stimulus

    NASA Astrophysics Data System (ADS)

    Chen, Chiung Yao

    2001-05-01

    The purpose of this study was to compare the degree of speech disturbance caused by a stationary noise source with that caused by an apparently moving one (AMN). In studies of sound localization, we found that source-directional sensitivity (SDS) is closely associated with the magnitude of the interaural cross-correlation (IACC). Ando et al. [Y. Ando, S. H. Kang, and H. Nagamatsu, J. Acoust. Soc. Jpn. (E) 8, 183-190 (1987)] reported that the correlation between activity of the left and right inferior colliculus along the auditory pathway agrees with the correlation function of the amplitudes entering the two ear-canal entrances. We assumed that the degree of disturbance under an apparently moving noise source differs from that under a source fixed in front of the listener at a constant distance in a free field (no reflections). We then found that a moving source and a fixed source, both generated from 1/3-octave narrow-band noise with a center frequency of 2 kHz, influence speech intelligibility differently. However, the effects of the moving speed on the masking of speech intelligibility remained uncertain.
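
    The IACC referred to above is conventionally defined as the maximum of the normalized interaural cross-correlation over lags of ±1 ms. A direct sketch:

      import numpy as np

      def iacc(left, right, fs, max_lag_ms=1.0):
          # Normalized interaural cross-correlation, maximized over
          # lags of +/-1 ms (the conventional IACC definition).
          max_lag = int(fs * max_lag_ms / 1000.0)
          norm = np.sqrt(np.sum(left ** 2) * np.sum(right ** 2))
          vals = []
          for lag in range(-max_lag, max_lag + 1):
              if lag >= 0:
                  vals.append(np.sum(left[lag:] * right[: len(right) - lag]))
              else:
                  vals.append(np.sum(left[:lag] * right[-lag:]))
          return max(vals) / norm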

  5. Inferior frontal sensitivity to common speech sounds is amplified by increasing word intelligibility

    PubMed Central

    Vaden, Kenneth I.; Kuchinsky, Stefanie E.; Keren, Noam I.; Harris, Kelly C.; Ahlstrom, Jayne B.; Dubno, Judy R.; Eckert, Mark A.

    2011-01-01

    The left inferior frontal gyrus (LIFG) exhibits increased responsiveness when people listen to words composed of speech sounds that frequently co-occur in the English language (Vaden, Piquado, Hickok, 2011), termed high phonotactic frequency (Vitevitch & Luce, 1998). The current experiment aimed to further characterize the relation of phonotactic frequency to LIFG activity by manipulating word intelligibility in participants of varying age. Thirty six native English speakers, 19–79 years old (mean = 50.5, sd = 21.0) indicated with a button press whether they recognized 120 binaurally presented consonant-vowel-consonant words during a sparse sampling fMRI experiment (TR = 8 sec). Word intelligibility was manipulated by low-pass filtering (cutoff frequencies of 400 Hz, 1000 Hz, 1600 Hz, and 3150 Hz). Group analyses revealed a significant positive correlation between phonotactic frequency and LIFG activity, which was unaffected by age and hearing thresholds. A region of interest analysis revealed that the relation between phonotactic frequency and LIFG activity was significantly strengthened for the most intelligible words (low-pass cutoff at 3150 Hz). These results suggest that the responsiveness of the left inferior frontal cortex to phonotactic frequency reflects the downstream impact of word recognition rather than support of word recognition, at least when there are no speech production demands. PMID:21925521

  6. Factors Affecting the Variation of Maximum Speech Intelligibility in Patients With Sensorineural Hearing Loss Other Than Apparent Retrocochlear Lesions

    PubMed Central

    Yahata, Izumi; Miyazaki, Hiromitsu; Takata, Yusuke; Yamauchi, Daisuke; Nomura, Kazuhiro; Katori, Yukio

    2015-01-01

    Objectives To examine the relationship between speech intelligibilities among the similar level of hearing loss and threshold elevation of the auditory brainstem response (ABR). Methods The relationship between maximum speech intelligibilities among similar levels of hearing loss and relative threshold elevation of the click-evoked ABR (ABR threshold - pure tone average at 2,000 and 4,000 Hz) was retrospectively reviewed in patients with sensorineural hearing loss (SNHL) other than apparent retrocochlear lesions as auditory neuropathy, vestibular schwannoma and the other brain lesions. Results Comparison of the speech intelligibilities in subjects with similar levels of hearing loss found that the variation in maximum speech intelligibility was significantly correlated with the threshold elevation of the ABR. Conclusion The present results appear to support the idea that variation in maximum speech intelligibility in patients with similar levels of SNHL may be related to the different degree of dysfunctions of the inner hair cells and/or cochlear nerves in addition to those of outer hair cells. PMID:26330909

  7. Developing a speech intelligibility test based on measuring speech reception thresholds in noise for English and Finnish

    NASA Astrophysics Data System (ADS)

    Vainio, Martti; Suni, Antti; Järveläinen, Hanna; Järvikivi, Juhani; Mattila, Ville-Veikko

    2005-09-01

    A subjective test was developed that is suitable for evaluating the effect of mobile communications devices on sentence intelligibility in background noise. Originally, a total of 25 lists, each including 16 sentences, was developed in British English and Finnish to serve as test stimuli representative of adult language today. The sentences, produced by two male and two female speakers, were normalized for naturalness, length, and intelligibility in each language. The sentence sets were balanced with regard to the expected lexical and phonetic distributions in the given language. The sentence lists are intended for adaptive measurement of speech reception thresholds (SRTs) in noise. In the verification of the test stimuli, SRTs were measured for ten subjects in Finnish and nine subjects in English. Mean SRTs were -2.47 dB in Finnish and -1.12 dB in English, with standard deviations of 1.61 and 2.36 dB, respectively. The mean thresholds did not vary significantly between the lists or the talkers after two lists were removed from the Finnish set and one from the English set. Thus the numbers of lists were reduced from 25 to 23 and 24, respectively. The statistical power of the test increased when thresholds were averaged over several sentence lists. With three lists per condition, the test is able to detect a 1.5-dB difference in SRTs with a probability of about 90%.
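
    The exact adaptive rule is not given in this abstract; a minimal sketch of an adaptive SRT track, assuming a simple up-down rule that lowers the SNR after a correct sentence and raises it otherwise (converging near 50% intelligibility), follows:

      import numpy as np

      def adaptive_srt(present_trial, start_snr=0.0, step=2.0, n_trials=20):
          # present_trial(snr) -> True if at least half of the words in
          # the sentence were repeated correctly at that SNR.
          snr, track = start_snr, []
          for _ in range(n_trials):
              correct = present_trial(snr)
              track.append(snr)
              snr += -step if correct else step
          return float(np.mean(track[-10:]))  # average the late trials

      # Toy usage with a simulated listener whose true SRT is -2 dB:
      rng = np.random.default_rng(0)
      listener = lambda snr: rng.random() < 1 / (1 + 10 ** (-(snr + 2.0)))
      print(adaptive_srt(listener))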

  8. Dual-microphone and binaural noise reduction techniques for improved speech intelligibility by hearing aid users

    NASA Astrophysics Data System (ADS)

    Yousefian Jazi, Nima

    Spatial filtering and directional discrimination have been shown to be effective pre-processing approaches for noise reduction in microphone array systems. In dual-microphone hearing aids, fixed and adaptive beamforming techniques are the most common solutions for enhancing the desired speech and rejecting unwanted signals captured by the microphones. In fact, beamformers are widely utilized in systems where the spatial properties of the target source (usually in front of the listener) are assumed to be known. In this dissertation, dual-microphone coherence-based speech enhancement techniques applicable to hearing aids are proposed. All proposed algorithms operate in the frequency domain and (like traditional beamforming techniques) are based purely on the spatial properties of the desired speech source; they do not require any knowledge of noise statistics for calculating the noise reduction filter. This gives the algorithms the ability to address adverse noise conditions, such as situations where interfering talkers speak simultaneously with the target speaker. In such cases, adaptive beamformers lose their effectiveness in suppressing interference, since the noise reference channel cannot be built and updated accordingly. This difference is the main advantage of the proposed techniques over traditional adaptive beamformers. Furthermore, since the suggested algorithms are independent of noise estimation, they offer significant improvement in scenarios where the power level of the interfering sources is much higher than that of the target speech. The dissertation also shows that the premise behind the proposed algorithms can be extended and employed in binaural hearing aids. The main purpose of the investigated techniques is to enhance the intelligibility of speech, measured through subjective listening tests with normal hearing and cochlear implant listeners. However, the improvement in quality of the output speech achieved by the
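
    The dissertation's specific algorithms are not reproduced here; the sketch below shows the generic idea of coherence-based dual-microphone enhancement, assuming an STFT framework in which the magnitude-squared coherence (MSC) between the two microphones is used directly as a time-frequency gain. Diffuse noise is largely incoherent across the microphones, while a single frontal target is coherent:

      import numpy as np
      from scipy.signal import stft, istft

      def coherence_enhance(x1, x2, fs, alpha=0.8, nperseg=256):
          _, _, X1 = stft(x1, fs=fs, nperseg=nperseg)
          _, _, X2 = stft(x2, fs=fs, nperseg=nperseg)
          Y = np.zeros_like(X1)
          p11 = np.full(X1.shape[0], 1e-8)
          p22 = np.full(X1.shape[0], 1e-8)
          p12 = np.zeros(X1.shape[0], dtype=complex)
          for i in range(X1.shape[1]):
              # Recursively smoothed auto- and cross-power spectra.
              p11 = alpha * p11 + (1 - alpha) * np.abs(X1[:, i]) ** 2
              p22 = alpha * p22 + (1 - alpha) * np.abs(X2[:, i]) ** 2
              p12 = alpha * p12 + (1 - alpha) * X1[:, i] * np.conj(X2[:, i])
              msc = np.abs(p12) ** 2 / (p11 * p22 + 1e-12)
              Y[:, i] = msc * 0.5 * (X1[:, i] + X2[:, i])  # MSC as a TF gain
          _, y = istft(Y, fs=fs, nperseg=nperseg)
          return y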

  9. Synthesized speech rate and pitch effects on intelligibility of warning messages for pilots

    NASA Technical Reports Server (NTRS)

    Simpson, C. A.; Marchionda-Frost, K.

    1984-01-01

    In civilian and military operations, a future threat-warning system with a voice display could warn pilots of other traffic, obstacles in the flight path, and/or terrain during low-altitude helicopter flights. The present study was conducted to learn whether the speech rate and voice pitch of phoneme-synthesized speech affect pilot accuracy and response time to typical threat-warning messages. Helicopter pilots engaged in an attention-demanding flying task and listened for voice threat warnings presented in a background of simulated helicopter cockpit noise. Performance was measured by flying-task performance, threat-warning intelligibility, and response time. Pilot ratings were elicited for the different voice pitches and speech rates. Significant effects were obtained only for response time and for pilot ratings, both as a function of speech rate. For the few cases when pilots forgot to respond to a voice message, they remembered 90 percent of the messages accurately when queried for their response 8 to 10 sec later.

  10. A multi-language evaluation of the RASTI method for estimating speech intelligibility in auditoria

    NASA Astrophysics Data System (ADS)

    Houtgast, T.; Steeneken, H. J. M.

    1982-01-01

    The physical measure Rapid Speech Transmission Index (RASTI) was developed to assess speech intelligibility in auditoria. In order to evaluate this method, a set of 14 auditorium conditions (plus 2 replicas) with various degrees of reverberation and/or interfering noise was subjected to: (1) RASTI measurements; (2) articulation tests performed by laboratories in 11 different countries; and (3) an additional quality-rating experiment by 4 of these laboratories. The various listening experiments show substantial differences in the ranking of the 14 conditions. For instance, it appears that the absence of a carrier phrase in some of the articulation tests has great influence on the relative importance of reverberation as compared to noise interference. When considering only the tests which use an appropriate carrier phrase (7 countries), it is found that the RASTI values are in good agreement with the mean results of these articulation tests.

  11. Contributions of cochlea-scaled entropy and consonant-vowel boundaries to prediction of speech intelligibility in noise

    PubMed Central

    Chen, Fei; Loizou, Philipos C.

    2012-01-01

    Recent evidence suggests that spectral change, as measured by cochlea-scaled entropy (CSE), predicts speech intelligibility better than the information carried by vowels or consonants in sentences. Motivated by this finding, the present study investigates whether intelligibility indices implemented to include segments marked with significant spectral change better predict speech intelligibility in noise than measures that include all phonetic segments paying no attention to vowels/consonants or spectral change. The prediction of two intelligibility measures [normalized covariance measure (NCM), coherence-based speech intelligibility index (CSII)] is investigated using three sentence-segmentation methods: relative root-mean-square (RMS) levels, CSE, and traditional phonetic segmentation of obstruents and sonorants. While the CSE method makes no distinction between spectral changes occurring within vowels/consonants, the RMS-level segmentation method places more emphasis on the vowel-consonant boundaries wherein the spectral change is often most prominent, and perhaps most robust, in the presence of noise. Higher correlation with intelligibility scores was obtained when including sentence segments containing a large number of consonant-vowel boundaries than when including segments with highest entropy or segments based on obstruent/sonorant classification. These data suggest that in the context of intelligibility measures the type of spectral change captured by the measure is important. PMID:22559382
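
    The relative-RMS segmentation idea can be sketched simply: classify short frames by their RMS level relative to the whole-utterance RMS, then restrict an intelligibility measure to the selected segment class. The frame length and dB cutoffs below are illustrative assumptions:

      import numpy as np

      def rms_level_segments(x, fs, frame_ms=16.0):
          # Classify frames by RMS level relative to the whole utterance.
          n = int(fs * frame_ms / 1000.0)
          frames = [x[i:i + n] for i in range(0, len(x) - n + 1, n)]
          rms = np.array([np.sqrt(np.mean(f ** 2)) for f in frames]) + 1e-12
          rel_db = 20 * np.log10(rms / (np.sqrt(np.mean(x ** 2)) + 1e-12))
          labels = np.where(rel_db >= 0, "high",
                            np.where(rel_db >= -10, "mid", "low"))
          return rel_db, labels  # restrict a measure to one label class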

  12. Comparison of speech intelligibility in cockpit noise using SPH-4 flight helmet with and without active noise reduction

    NASA Technical Reports Server (NTRS)

    Chan, Jeffrey W.; Simpson, Carol A.

    1990-01-01

    Active Noise Reduction (ANR) is a new technology which can reduce the level of aircraft cockpit noise that reaches the pilot's ear while simultaneously improving the signal to noise ratio for voice communications and other information bearing sound signals in the cockpit. A miniature, ear-cup mounted ANR system was tested to determine whether speech intelligibility is better for helicopter pilots using ANR compared to a control condition of ANR turned off. Two signal to noise ratios (S/N), representative of actual cockpit conditions, were used for the ratio of the speech to cockpit noise sound pressure levels. Speech intelligibility was significantly better with ANR compared to no ANR for both S/N conditions. Variability of speech intelligibility among pilots was also significantly less with ANR. When the stock helmet was used with ANR turned off, the average PB Word speech intelligibility score was below the Normally Acceptable level. In comparison, it was above that level with ANR on in both S/N levels.

  13. Effect of the division between early and late reflections on intelligibility of ideal binary-masked speech.

    PubMed

    Li, Junfeng; Xia, Risheng; Fang, Qiang; Li, Aijun; Pan, Jielin; Yan, Yonghong

    2015-05-01

    The ideal binary mask (IBM) that was originally defined in anechoic conditions has been found to yield substantial improvements in speech intelligibility in noise. The IBM has recently been extended to reverberant conditions where the direct sound and early reflections of target speech are regarded as the desired signal. It is of great interest to know how the division between early and late reflections impacts the intelligibility of the IBM-processed noisy reverberant speech. In the present study, the division between early and late reflections in three rooms was first determined by four typical estimation approaches and then used to compute the IBMs in reverberant conditions. The IBMs were then applied to the noisy reverberant mixture signal for segregating the desired signal, and the segregated signal was further presented to normal-hearing listeners for word recognition. Results showed that the IBMs with different divisions between early and late reflections provided substantial improvements in speech intelligibility over the unprocessed mixture signals in all conditions tested, and there were small, but statistically significant, differences in speech intelligibility between the different IBMs in some conditions tested. PMID:25994708
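
    A minimal sketch of the IBM computation, assuming access to the premixed target and masker signals and an illustrative local criterion (LC) of -6 dB; in the reverberant extension, the "target" would include the direct sound plus early reflections up to the chosen division point:

      import numpy as np
      from scipy.signal import stft

      def ideal_binary_mask(target, masker, fs, lc_db=-6.0, nperseg=320):
          # Keep time-frequency units whose local SNR exceeds the local
          # criterion (LC); discard the rest.
          _, _, S = stft(target, fs=fs, nperseg=nperseg)
          _, _, N = stft(masker, fs=fs, nperseg=nperseg)
          local_snr = 10 * np.log10((np.abs(S) ** 2 + 1e-12)
                                    / (np.abs(N) ** 2 + 1e-12))
          return (local_snr > lc_db).astype(float)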

  14. Effects of noise reduction on speech intelligibility, perceived listening effort, and personal preference in hearing-impaired listeners.

    PubMed

    Brons, Inge; Houben, Rolph; Dreschler, Wouter A

    2014-01-01

    This study evaluates the perceptual effects of single-microphone noise reduction in hearing aids. Twenty subjects with moderate sensorineural hearing loss listened to speech in babble noise processed via noise reduction from three different linearly fitted hearing aids. Subjects performed (a) speech-intelligibility tests, (b) listening-effort ratings, and (c) paired-comparison ratings on noise annoyance, speech naturalness, and overall preference. The perceptual effects of noise reduction differ between hearing aids. The results agree well with those of normal-hearing listeners in a previous study. None of the noise-reduction algorithms improved speech intelligibility, but all reduced the annoyance of noise. The noise reduction that scored best with respect to noise annoyance and preference had the worst intelligibility scores. The trade-off between intelligibility and listening comfort shows that preference measurements might be useful in addition to intelligibility measurements in the selection of noise reduction. Additionally, this trade-off should be taken into consideration to create realistic expectations in hearing-aid users. PMID:25315377

  15. Evaluation of Speech Intelligibility and Sound Localization Abilities with Hearing Aids Using Binaural Wireless Technology

    PubMed Central

    Ibrahim, Iman; Parsa, Vijay; Macpherson, Ewan; Cheesman, Margaret

    2012-01-01

    Wireless synchronization of the digital signal processing (DSP) features between two hearing aids in a bilateral hearing aid fitting is a fairly new technology. This technology is expected to preserve the differences in time and intensity between the two ears by co-ordinating the bilateral DSP features such as multichannel compression, noise reduction, and adaptive directionality. The purpose of this study was to evaluate the benefits of wireless communication as implemented in two commercially available hearing aids. More specifically, this study measured speech intelligibility and sound localization abilities of normal hearing and hearing impaired listeners using bilateral hearing aids with wireless synchronization of multichannel Wide Dynamic Range Compression (WDRC). Twenty subjects participated; 8 had normal hearing and 12 had bilaterally symmetrical sensorineural hearing loss. Each individual completed the Hearing in Noise Test (HINT) and a sound localization test with two types of stimuli. No specific benefit from wireless WDRC synchronization was observed for the HINT; however, hearing impaired listeners had better localization with the wireless synchronization. Binaural wireless technology in hearing aids may improve localization abilities although the possible effect appears to be small at the initial fitting. With adaptation, the hearing aids with synchronized signal processing may lead to an improvement in localization and speech intelligibility. Further research is required to demonstrate the effect of adaptation to the hearing aids with synchronized signal processing on different aspects of auditory performance. PMID:26557339

  17. Speech Intelligibility in Various Noise Conditions with the Nucleus® 5 CP810 Sound Processor.

    PubMed

    Dillier, Norbert; Lai, Wai Kong

    2015-06-11

    The Nucleus(®) 5 System Sound Processor (CP810, Cochlear™, Macquarie University, NSW, Australia) contains two omnidirectional microphones. They can be configured as a fixed directional microphone combination (called Zoom) or as an adaptive beamformer (called Beam), which adjusts the directivity continuously to maximally reduce the interfering noise. Initial evaluation studies with the CP810 had compared performance and usability of the new processor in comparison with the Freedom™ Sound Processor (Cochlear™) for speech in quiet and noise for a subset of the processing options. This study compares the two processing options suggested to be used in noisy environments, Zoom and Beam, for various sound field conditions using a standardized speech in noise matrix test (Oldenburg sentences test). Nine German-speaking subjects who previously had been using the Freedom speech processor and subsequently were upgraded to the CP810 device participated in this series of additional evaluation tests. The speech reception threshold (SRT for 50% speech intelligibility in noise) was determined using sentences presented via loudspeaker at 65 dB SPL in front of the listener and noise presented either via the same loudspeaker (S0N0) or at 90 degrees at either the ear with the sound processor (S0NCI+) or the opposite unaided ear (S0NCI-). The fourth noise condition consisted of three uncorrelated noise sources placed at 90, 180 and 270 degrees. The noise level was adjusted through an adaptive procedure to yield a signal to noise ratio where 50% of the words in the sentences were correctly understood. In spatially separated speech and noise conditions both Zoom and Beam could improve the SRT significantly. For single noise sources, either ipsilateral or contralateral to the cochlear implant sound processor, average improvements with Beam of 12.9 and 7.9 dB in SRT were found. The average SRT of -8 dB for Beam in the diffuse noise condition (uncorrelated noise from both sides and

  18. Perception of interrupted speech: Effects of dual-rate gating on the intelligibility of words and sentences

    PubMed Central

    Shafiro, Valeriy; Sheft, Stanley; Risley, Robert

    2011-01-01

    Perception of interrupted speech and the influence of speech materials and memory load were investigated using one or two concurrent square-wave gating functions. Sentences (Experiment 1) and random one-, three-, and five-word sequences (Experiment 2) were interrupted using either a primary gating rate alone (0.5−24 Hz) or a combined primary and faster secondary rate. The secondary rate interrupted only speech left intact after primary gating, reducing the original speech to 25%. In both experiments, intelligibility increased with primary rate, but varied with memory load and speech material (highest for sentences, lowest for five-word sequences). With dual-rate gating of sentences, intelligibility with fast secondary rates was superior to that with single rates and a 25% duty cycle, approaching that of single rates with a 50% duty cycle for some low and high rates. For dual-rate gating of words, the positive effect of fast secondary gating was smaller than for sentences, and the advantage of sentences over word-sequences was not obtained in many dual-rate conditions. These findings suggest that integration of interrupted speech fragments after gating depends on the duration of the gated speech interval and that sufficiently robust acoustic-phonetic word cues are needed to access higher-level contextual sentence information. PMID:21973362
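
    The gating manipulation can be sketched as the product of one or two 50% duty-cycle square waves: the primary gate alone passes 50% of the speech, and a faster secondary gate applied to the surviving portions reduces the intact speech to 25%. A sketch with illustrative rates:

      import numpy as np

      def dual_rate_gate(n_samples, fs, primary_hz, secondary_hz=None):
          # 50% duty-cycle square-wave gate at the primary rate; the
          # optional secondary gate interrupts the surviving speech
          # again, leaving 25% of the original intact.
          t = np.arange(n_samples) / fs
          gate = ((t * primary_hz) % 1.0) < 0.5
          if secondary_hz is not None:
              gate &= ((t * secondary_hz) % 1.0) < 0.5
          return gate.astype(float)

      # y_gated = y * dual_rate_gate(len(y), fs, 2.0, secondary_hz=24.0)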

  19. Spectral contrast enhancement improves speech intelligibility in noise for cochlear implants.

    PubMed

    Nogueira, Waldo; Rode, Thilo; Büchner, Andreas

    2016-02-01

    Spectral smearing is at least partly responsible for the fact that cochlear implant (CI) users require a higher signal-to-noise ratio to obtain the same speech intelligibility as normal hearing listeners. A spectral contrast enhancement (SCE) algorithm has been designed and evaluated as an additional feature for a standard CI strategy. The algorithm keeps the most prominent peaks within a speech signal constant while attenuating valleys in the spectrum. The goal is to partly compensate for the spectral smearing produced by the limited number of stimulation electrodes and the overlap of electrical fields produced in CIs. Twelve CI users were tested for their speech reception threshold (SRT) using the standard CI coding strategy with and without SCE. No significant differences in SRT were observed between conditions. However, an analysis of the electrical stimulation patterns showed a reduction in stimulation current when using SCE. In a second evaluation, 12 CI users were tested with a similar configuration of the SCE strategy in which stimulation was balanced between the SCE and non-SCE variants such that the loudness perception delivered by the two strategies was the same. Results show a significant improvement in SRT of 0.57 dB (p < 0.0005) for the SCE algorithm. PMID:26936556
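
    The clinical strategy's exact rule is not described in detail here; as a sketch of the stated principle (peaks held constant, valleys attenuated), one can expand the peak-to-valley range of a log-domain channel envelope profile by a contrast factor:

      import numpy as np

      def enhance_contrast(env_db, gamma=1.5):
          # Hold the frame's peak fixed and push everything below it
          # further down; gamma > 1 deepens the spectral valleys.
          peak = np.max(env_db)
          return peak - gamma * (peak - env_db)

      frame = np.array([55.0, 62.0, 48.0, 60.0, 45.0])  # dB per channel
      print(enhance_contrast(frame))  # peak stays at 62 dB, valleys drop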

  20. Previous exposure to intact speech increases intelligibility of its digitally degraded counterpart as a function of stimulus complexity.

    PubMed

    Hakonen, Maria; May, Patrick J C; Alho, Jussi; Alku, Paavo; Jokinen, Emma; Jääskeläinen, Iiro P; Tiitinen, Hannu

    2016-01-15

    Recent studies have shown that acoustically distorted sentences can be perceived as either unintelligible or intelligible depending on whether one has previously been exposed to the undistorted, intelligible versions of the sentences. This allows studying processes specifically related to speech intelligibility since any change between the responses to the distorted stimuli before and after the presentation of their undistorted counterparts cannot be attributed to acoustic variability but, rather, to the successful mapping of sensory information onto memory representations. To estimate how the complexity of the message is reflected in speech comprehension, we applied this rapid change in perception to behavioral and magnetoencephalography (MEG) experiments using vowels, words and sentences. In the experiments, stimuli were initially presented to the subject in a distorted form, after which undistorted versions of the stimuli were presented. Finally, the original distorted stimuli were presented once more. The resulting increase in intelligibility observed for the second presentation of the distorted stimuli depended on the complexity of the stimulus: vowels remained unintelligible (behaviorally measured intelligibility 27%) whereas the intelligibility of the words increased from 19% to 45% and that of the sentences from 31% to 65%. This increase in the intelligibility of the degraded stimuli was reflected as an enhancement of activity in the auditory cortex and surrounding areas at early latencies of 130-160 ms. In the same regions, increasing stimulus complexity attenuated mean currents at latencies of 130-160 ms whereas at latencies of 200-270 ms the mean currents increased. These modulations in cortical activity may reflect feedback from top-down mechanisms enhancing the extraction of information from speech. The behavioral results suggest that memory-driven expectancies can have a significant effect on speech comprehension, especially in acoustically adverse conditions.

  1. Hybridizing Conversational and Clear Speech to Investigate the Source of Increased Intelligibility in Speakers with Parkinson's Disease

    ERIC Educational Resources Information Center

    Tjaden, Kris; Kain, Alexander; Lam, Jennifer

    2014-01-01

    Purpose: A speech analysis-resynthesis paradigm was used to investigate segmental and suprasegmental acoustic variables explaining intelligibility variation for 2 speakers with Parkinson's disease (PD). Method: Sentences were read in conversational and clear styles. Acoustic characteristics from clear sentences were extracted and applied to…

  2. Conversational and clear speech intelligibility of /bVd/ syllables produced by native and non-native English speakers.

    PubMed

    Rogers, Catherine L; DeMasi, Teresa M; Krause, Jean C

    2010-07-01

    The ability of native and non-native speakers to enhance intelligibility of target vowels by speaking clearly was compared across three talker groups: monolingual English speakers and native Spanish speakers with either an earlier or a later age of immersion in an English-speaking environment. Talkers produced the target syllables "bead, bid, bayed, bed, bad" and "bod" in 'conversational' and clear speech styles. The stimuli were presented to native English-speaking listeners in multi-talker babble with signal-to-noise ratios of -8 dB for the monolingual and early learners and -4 dB for the later learners. The monolinguals and early learners of English showed a similar average clear speech benefit, and the early learners showed equal or greater intelligibility than monolinguals for most target vowels. The 4-dB difference in signal-to-noise ratio yielded approximately equal average intelligibility for the monolinguals and later learners. The average clear speech benefit was smallest for the later learners, and a significant clear speech decrement was obtained for the target syllable "bid." These results suggest that later learners of English as a second language may be less able than monolinguals to accommodate listeners in noisy environments, due to a reduced ability to improve intelligibility by speaking more clearly. PMID:20649235

  3. The Intelligibility and Comprehensibility of Learner Speech in Russian: A Study in the Salience of Pronunciation, Lexicon, Grammar and Syntax

    ERIC Educational Resources Information Center

    Neuendorf, Jill A.

    2010-01-01

    This study of L-2 Russian interlanguage production examined the salience of phonetic, lexical and syntactical features for L-1 listener intelligibility, based on L-2 recitation of written scripts (Part I) and also unrehearsed speech (Part II). Part III of the study investigated strategies used by native-speaking teachers of Russian as a Second…

  5. Relations Between the Intelligibility of Speech in Noise and Psychophysical Measures of Hearing Measured in Four Languages Using the Auditory Profile Test Battery.

    PubMed

    Van Esch, T E M; Dreschler, W A

    2015-01-01

    The aim of the present study was to determine the relations between the intelligibility of speech in noise and measures of auditory resolution, loudness recruitment, and cognitive function. The analyses were based on data published earlier as part of the presentation of the Auditory Profile, a test battery implemented in four languages. Tests of the intelligibility of speech, resolution, loudness recruitment, and lexical decision making were measured using headphones in five centers: in Germany, the Netherlands, Sweden, and the United Kingdom. Correlations and stepwise linear regression models were calculated. In total, 72 hearing-impaired listeners aged 22 to 91 years with a broad range of hearing losses were included in the study. Several significant correlations were found with the intelligibility of speech in noise. Stepwise linear regression analyses showed that pure-tone average, age, spectral and temporal resolution, and loudness recruitment were significant predictors of the intelligibility of speech in fluctuating noise. Complex interrelationships between auditory factors and the intelligibility of speech in noise were revealed using the Auditory Profile data set in four languages. After taking into account the effects of pure-tone average and age, spectral and temporal resolution and loudness recruitment had an added value in the prediction of variation among listeners with respect to the intelligibility of speech in noise. The results of the lexical decision making test were not related to the intelligibility of speech in noise, in the population studied. PMID:26647417

  6. The Effect of Automatic Gain Control Structure and Release Time on Cochlear Implant Speech Intelligibility

    PubMed Central

    Khing, Phyu P.; Swanson, Brett A.; Ambikairajah, Eliathamby

    2013-01-01

    Nucleus cochlear implant systems incorporate a fast-acting front-end automatic gain control (AGC), sometimes called a compression limiter. The objective of the present study was to determine the effect of replacing the front-end compression limiter with a newly proposed envelope profile limiter. A secondary objective was to investigate the effect of AGC speed on cochlear implant speech intelligibility. The envelope profile limiter was located after the filter bank and reduced the gain when the largest of the filter bank envelopes exceeded the compression threshold. The compression threshold was set equal to the saturation level of the loudness growth function (i.e. the envelope level that mapped to the maximum comfortable current level), ensuring that no envelope clipping occurred. To preserve the spectral profile, the same gain was applied to all channels. Experiment 1 compared sentence recognition with the front-end limiter and with the envelope profile limiter, each with two release times (75 and 625 ms). Six implant recipients were tested in quiet and in four-talker babble noise, at a high presentation level of 89 dB SPL. Overall, release time had a larger effect than the AGC type. With both AGC types, speech intelligibility was lower for the 75 ms release time than for the 625 ms release time. With the shorter release time, the envelope profile limiter provided higher group mean scores than the front-end limiter in quiet, but there was no significant difference in noise. Experiment 2 measured sentence recognition in noise as a function of presentation level, from 55 to 89 dB SPL. The envelope profile limiter with 625 ms release time yielded better scores than the front-end limiter with 75 ms release time. A take-home study showed no clear pattern of preferences. It is concluded that the envelope profile limiter is a feasible alternative to a front-end compression limiter. PMID:24312408
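
    A sketch of the envelope profile limiter as described: the gain is driven by the largest filter bank envelope relative to the compression threshold (set at the saturation level of the loudness growth function), the same gain is applied to all channels to preserve the spectral profile, and a frame-count release constant stands in for the 75/625 ms release times:

      import numpy as np

      def envelope_profile_limiter(envelopes, c_sat, release_frames=400):
          # envelopes: (n_frames, n_channels) filter bank envelopes.
          gain, out = 1.0, np.empty_like(envelopes)
          for i in range(envelopes.shape[0]):
              needed = min(1.0, c_sat / (envelopes[i].max() + 1e-12))
              if needed < gain:
                  gain = needed                       # instant attack
              else:                                   # gradual release
                  gain = min(1.0, gain + 1.0 / release_frames)
              out[i] = envelopes[i] * gain            # same gain, all channels
          return out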

  7. Suppressed alpha oscillations predict intelligibility of speech and its acoustic details.

    PubMed

    Obleser, Jonas; Weisz, Nathan

    2012-11-01

    Modulations of human alpha oscillations (8-13 Hz) accompany many cognitive processes, but their functional role in auditory perception has proven elusive: Do oscillatory dynamics of alpha reflect acoustic details of the speech signal and are they indicative of comprehension success? Acoustically presented words were degraded in acoustic envelope and spectrum in an orthogonal design, and electroencephalogram responses in the frequency domain were analyzed in 24 participants, who rated word comprehensibility after each trial. First, the alpha power suppression during and after a degraded word depended monotonically on spectral and, to a lesser extent, envelope detail. The magnitude of this alpha suppression exhibited an additional and independent influence on later comprehension ratings. Second, source localization of alpha suppression yielded superior parietal, prefrontal, as well as anterior temporal brain areas. Third, multivariate classification of the time-frequency pattern across participants showed that patterns of late posterior alpha power allowed best for above-chance classification of word intelligibility. Results suggest that both magnitude and topography of late alpha suppression in response to single words can indicate a listener's sensitivity to acoustic features and the ability to comprehend speech under adverse listening conditions. PMID:22100354

  9. The effect of reduced vowel working space on speech intelligibility in Mandarin-speaking young adults with cerebral palsy

    NASA Astrophysics Data System (ADS)

    Liu, Huei-Mei; Tsao, Feng-Ming; Kuhl, Patricia K.

    2005-06-01

    The purpose of this study was to examine the effect of reduced vowel working space on dysarthric talkers' speech intelligibility using both acoustic and perceptual approaches. In experiment 1, the acoustic-perceptual relationship between vowel working space area and speech intelligibility was examined in Mandarin-speaking young adults with cerebral palsy. Subjects read aloud 18 bisyllabic words containing the vowels /i/, /a/, and /u/ using their normal speaking rate. Each talker's words were identified by three normal listeners. The percentages of correct vowel and word identification were calculated as vowel intelligibility and word intelligibility, respectively. Results revealed that talkers with cerebral palsy exhibited smaller vowel working space areas compared to ten age-matched controls. The vowel working space area was significantly correlated with vowel intelligibility (r=0.632, p<0.005) and with word intelligibility (r=0.684, p<0.005). Experiment 2 examined whether tokens of expanded vowel working spaces were perceived as better vowel exemplars and represented with greater perceptual spaces than tokens of reduced vowel working spaces. The results of the perceptual experiment support this prediction. The distorted vowels of talkers with cerebral palsy compose a smaller acoustic space that results in shrunken intervowel perceptual distances for listeners.
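
    Vowel working space area is typically computed as the area of the polygon spanned by the corner vowels in the F1-F2 plane; with the three corner vowels /i/, /a/, and /u/ this is a triangle, computed below via the shoelace formula (the formant values in the usage line are illustrative, not the study's data):

      import numpy as np

      def vowel_space_area(formants):
          # Shoelace formula over (F1, F2) corner-vowel coordinates in Hz.
          pts = np.asarray(formants, dtype=float)
          x, y = pts[:, 0], pts[:, 1]
          return 0.5 * abs(np.dot(x, np.roll(y, -1))
                           - np.dot(y, np.roll(x, -1)))

      # Illustrative adult values (Hz) for /i/, /a/, /u/:
      print(vowel_space_area([(300, 2300), (750, 1200), (350, 800)]))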

  10. Sentence intelligibility during segmental interruption and masking by speech-modulated noise: Effects of age and hearing loss.

    PubMed

    Fogerty, Daniel; Ahlstrom, Jayne B; Bologna, William J; Dubno, Judy R

    2015-06-01

    This study investigated how single-talker modulated noise impacts consonant and vowel cues to sentence intelligibility. Younger normal-hearing, older normal-hearing, and older hearing-impaired listeners completed speech recognition tests. All listeners received spectrally shaped speech matched to their individual audiometric thresholds to ensure sufficient audibility with the exception of a second younger listener group who received spectral shaping that matched the mean audiogram of the hearing-impaired listeners. Results demonstrated minimal declines in intelligibility for older listeners with normal hearing and more evident declines for older hearing-impaired listeners, possibly related to impaired temporal processing. A correlational analysis suggests a common underlying ability to process information during vowels that is predictive of speech-in-modulated noise abilities. Whereas, the ability to use consonant cues appears specific to the particular characteristics of the noise and interruption. Performance declines for older listeners were mostly confined to consonant conditions. Spectral shaping accounted for the primary contributions of audibility. However, comparison with the young spectral controls who received identical spectral shaping suggests that this procedure may reduce wideband temporal modulation cues due to frequency-specific amplification that affected high-frequency consonants more than low-frequency vowels. These spectral changes may impact speech intelligibility in certain modulation masking conditions. PMID:26093436

  11. Effects of a music therapy voice protocol on speech intelligibility, vocal acoustic measures, and mood of individuals with Parkinson's disease.

    PubMed

    Haneishi, E

    2001-01-01

    This study examined the effects of a Music Therapy Voice Protocol (MTVP) on speech intelligibility, vocal intensity, maximum vocal range, maximum duration of sustained vowel phonation, vocal fundamental frequency, vocal fundamental frequency variability, and mood of individuals with Parkinson's disease. Four female patients, who demonstrated voice and speech problems, served as their own controls and participated in baseline assessment (study pretest), a series of MTVP sessions involving vocal and singing exercises, and final evaluation (study posttest). In the study pretest and posttest, data for speech intelligibility and all acoustic variables were collected. Paired-samples t-tests showed statistically significant increases from study pretest to posttest in speech intelligibility, as rated by caregivers, and in vocal intensity. In addition, before and after each MTVP session (session pretests and posttests), self-rated mood scores and selected acoustic variables were collected. Two-way repeated-measures ANOVAs found no significant differences in any of these variables between session pretests and posttests, across the treatment period, or in their interaction. Although not significant, the mean mood score in session posttests (M = 8.69) was higher than that in session pretests (M = 7.93). PMID:11796078

  12. The Interlanguage Speech Intelligibility Benefit as Bias Toward Native-Language Phonology.

    PubMed

    Wang, Hongyan; van Heuven, Vincent J

    2015-12-01

    Two hypotheses have been advanced in the recent literature with respect to the so-called Interlanguage Speech Intelligibility Benefit (ISIB): a nonnative speaker will be better understood by another nonnative listener than a native speaker of the target language will be (a) only when the nonnatives share the same native language (matched interlanguage) or (b) even when the nonnatives have different mother tongues (non-matched interlanguage). Based on a survey of published experimental materials, the present article demonstrates that both the restricted (a) and the generalized (b) hypotheses are false when the ISIB effect is evaluated in terms of absolute intelligibility scores. We then propose a simple way to compute a relative measure for the ISIB (R-ISIB), which we claim is a more insightful way of evaluating the interlanguage benefit, and test the hypotheses in relative (R-ISIB) terms on the same literature data. We find that the R-ISIB measure supports only the more restricted hypothesis (a) while rejecting the more general hypothesis (b). This finding shows that a native language shared by the interactants biases the listener toward interpreting sounds in terms of the phonology of the shared mother tongue. PMID:27551352

  13. Speech communications in noise

    NASA Technical Reports Server (NTRS)

    1984-01-01

    The physical characteristics of speech, the methods of speech masking measurement, and the effects of noise on speech communication are investigated. Topics include the speech signal and intelligibility, the effects of noise on intelligibility, the articulation index, and various devices for evaluating speech systems.
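
    The articulation index mentioned above is, at its core, an importance-weighted sum of band audibilities, with each band's contribution growing linearly as its SNR rises from -15 to +15 dB. A simplified sketch of that core computation follows; the equal band weights are an assumption, not the standard tabled importance values.

    ```python
    import numpy as np

    def articulation_index(snr_db_bands, band_weights=None):
        """Simplified AI: clip each band SNR to [-15, +15] dB, map it to a
        [0, 1] audibility, then form the importance-weighted sum."""
        snr = np.clip(np.asarray(snr_db_bands, dtype=float), -15.0, 15.0)
        audibility = (snr + 15.0) / 30.0
        if band_weights is None:                      # assume equal importance
            band_weights = np.full(len(snr), 1.0 / len(snr))
        return float(np.sum(band_weights * audibility))

    print(articulation_index([20, 10, 0, -5, -20]))   # -> about 0.53
    ```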

  14. Dual-echo fMRI can detect activations in inferior temporal lobe during intelligible speech comprehension

    PubMed Central

    Halai, Ajay D.; Parkes, Laura M.; Welbourne, Stephen R.

    2015-01-01

    The neural basis of speech comprehension has been investigated intensively during the past few decades. Incoming auditory signals are analysed for speech-like patterns and meaningful information can be extracted by mapping these sounds onto stored semantic representations. Investigation into the neural basis of speech comprehension has largely focused on the temporal lobe, in particular the superior and posterior regions. The ventral anterior temporal lobe (vATL), which includes the inferior temporal gyrus (ITG) and temporal fusiform gyrus (TFG), is consistently omitted in fMRI studies. In contrast, PET studies have shown the involvement of these ventral temporal regions. One crucial factor is the signal loss experienced using conventional echo planar imaging (EPI) for fMRI at tissue interfaces such as the vATL. One method to overcome this signal loss is to employ a dual-echo EPI technique. The aim of this study was to use intelligible and unintelligible (spectrally rotated) sentences to determine if the vATL could be detected during a passive speech comprehension task using a dual-echo acquisition. A whole brain analysis for an intelligibility contrast showed bilateral superior temporal lobe activations and a cluster of activation within the left vATL. Converging evidence implicates the same ventral temporal regions during semantic processing tasks, which include language processing. The specific role of the ventral temporal region during intelligible speech processing cannot be determined from these data alone, but the converging evidence from PET, MEG, TMS and neuropsychology strongly suggests that it contains the stored semantic representations, which are activated by the speech decoding process. PMID:26037055
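
    The spectrally rotated control sentences referred to here can be produced with a classic trick: band-limit the signal, multiply by a cosine at the band edge, and band-limit again, which flips the spectrum within the band while preserving its temporal structure. A minimal sketch under those assumptions (the 4 kHz band edge is illustrative):

    ```python
    import numpy as np
    from scipy.signal import butter, filtfilt

    def spectrally_rotate(x, fs, band_edge_hz=4000.0):
        """Flip the spectrum of x within [0, band_edge_hz]."""
        b, a = butter(6, band_edge_hz / (fs / 2))     # low-pass at the band edge
        x_lp = filtfilt(b, a, x)
        t = np.arange(len(x)) / fs
        rotated = x_lp * np.cos(2 * np.pi * band_edge_hz * t)  # mirror the band
        return filtfilt(b, a, rotated)                # remove the upper image
    ```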

  15. Variations in the Slope of the Psychometric Functions for Speech Intelligibility: A Systematic Survey

    PubMed Central

    Akeroyd, Michael A.

    2014-01-01

    Although many studies have looked at the effects of different listening conditions on the intelligibility of speech, their analyses have often concentrated on changes to a single value on the psychometric function, namely, the threshold. Far less commonly has the slope of the psychometric function, that is, the rate at which intelligibility changes with level, been considered. The slope of the function is crucial because it is the slope, rather than the threshold, that determines the improvement in intelligibility caused by any given improvement in signal-to-noise ratio by, for instance, a hearing aid. The aim of the current study was to systematically survey and reanalyze the psychometric function data available in the literature in an attempt to quantify the range of slope changes across studies and to identify listening conditions that affect the slope of the psychometric function. The data for 885 individual psychometric functions, taken from 139 different studies, were fitted with a common logistic equation from which the slope was calculated. Large variations in slope across studies were found, with slope values ranging from as shallow as 1% per dB to as steep as 44% per dB (median = 6.6% per dB), suggesting that the perceptual benefit offered by an improvement in signal-to-noise ratio depends greatly on listening environment. The type and number of maskers used were found to be major factors on the value of the slope of the psychometric function while other minor effects of target predictability, target corpus, and target/masker similarity were also found. PMID:24906905
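
    For the logistic function used in surveys like this, p = 1 / (1 + exp(-k(x - x0))), the slope at the midpoint x0 is k/4 in proportion correct per dB. A fitting sketch with invented data points, assuming SciPy:

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def logistic(snr, midpoint, k):
        return 1.0 / (1.0 + np.exp(-k * (snr - midpoint)))

    snr = np.array([-12, -9, -6, -3, 0, 3], dtype=float)   # dB SNR (invented)
    pc = np.array([0.05, 0.18, 0.45, 0.74, 0.92, 0.98])    # proportion correct

    (midpoint, k), _ = curve_fit(logistic, snr, pc, p0=[-5.0, 1.0])
    slope_pct_per_db = 100 * k / 4                         # slope at the midpoint
    print(f"SRT = {midpoint:.1f} dB, slope = {slope_pct_per_db:.1f} %/dB")
    ```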

  16. [A study on functional plasticity of the brain in childhood. II. Speech development and intelligence after the damage of cerebral hemisphere under 1 year of age].

    PubMed

    Ichiba, N; Takigawa, H

    1992-11-01

    To investigate the functional plasticity of the brain in childhood, speech development was evaluated and intelligence and dichotic listening tests were administered to 27 patients who had developed hemiplegia before 1 year of age. Among the 13 patients with right hemiplegia (7 to 24 years old), 11 showed a left-ear dominance, suggesting lateralization of language to the right hemisphere. All 14 patients with left hemiplegia (5 to 37 years old) showed a right-ear dominance, suggesting lateralization of language to the left hemisphere. All 27 patients acquired speech sufficient to converse with other people in daily life. There were no differences in speech development or intelligence scores between the two hemiplegia groups. Although speech development was not correlated with the age of onset of hemiplegia, it was correlated with the intelligence score in both groups. PMID:1419166

  17. Predicting binaural speech intelligibility using the signal-to-noise ratio in the envelope power spectrum domain.

    PubMed

    Chabot-Leclerc, Alexandre; MacDonald, Ewen N; Dau, Torsten

    2016-07-01

    This study proposes a binaural extension to the multi-resolution speech-based envelope power spectrum model (mr-sEPSM) [Jørgensen, Ewert, and Dau (2013). J. Acoust. Soc. Am. 134, 436-446]. It consists of a combination of better-ear (BE) and binaural unmasking processes, implemented as two monaural realizations of the mr-sEPSM combined with a short-term equalization-cancellation process, and uses the signal-to-noise ratio in the envelope domain (SNRenv) as the decision metric. The model requires only two parameters to be fitted per speech material and does not require an explicit frequency weighting. The model was validated against three data sets from the literature, which covered the following effects: the number of maskers, the masker types [speech-shaped noise (SSN), speech-modulated SSN, babble, and reversed speech], the masker(s) azimuths, reverberation on the target and masker, and the interaural time difference of the target and masker. The Pearson correlation coefficient between the simulated speech reception thresholds and the data across all experiments was 0.91. A model version that considered only BE processing performed similarly (correlation coefficient of 0.86) to the complete model, suggesting that BE processing could be considered sufficient to predict intelligibility in most realistic conditions. PMID:27475146
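
    The SNRenv metric compares the modulation (envelope) power that speech contributes to a noisy mixture against the envelope power of the noise alone. The sketch below is a toy single-band version of that idea, not the multi-resolution mr-sEPSM itself; the 1-8 Hz modulation band and the noise floor are assumed values.

    ```python
    import numpy as np
    from scipy.signal import hilbert, butter, filtfilt

    def env_power(x, fs, fm_lo=1.0, fm_hi=8.0):
        """Envelope power in one modulation band, normalized by envelope DC power."""
        env = np.abs(hilbert(x))
        b, a = butter(2, [fm_lo / (fs / 2), fm_hi / (fs / 2)], btype='band')
        ac = filtfilt(b, a, env - env.mean())
        return np.mean(ac**2) / env.mean()**2

    def snr_env_db(mixture, noise, fs, floor=0.001):
        """Envelope-domain SNR; speech envelope power is inferred by subtraction."""
        p_mix, p_noise = env_power(mixture, fs), env_power(noise, fs)
        p_speech = max(p_mix - p_noise, floor * p_noise)   # avoid log of <= 0
        return 10 * np.log10(p_speech / p_noise)
    ```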

  18. Effect of Slow-Acting Wide Dynamic Range Compression on Measures of Intelligibility and Ratings of Speech Quality in Simulated-Loss Listeners

    ERIC Educational Resources Information Center

    Rosengard, Peninah S.; Payton, Karen L.; Braida, Louis D.

    2005-01-01

    The purpose of this study was twofold: (a) to determine the extent to which 4-channel, slow-acting wide dynamic range amplitude compression (WDRC) can counteract the perceptual effects of reduced auditory dynamic range and (b) to examine the relation between objective measures of speech intelligibility and categorical ratings of speech quality for…

  1. Parental and Spousal Self-Efficacy of Young Adults Who Are Deaf or Hard of Hearing: Relationship to Speech Intelligibility

    ERIC Educational Resources Information Center

    Adi-Bensaid, Limor; Michael, Rinat; Most, Tova; Gali-Cinamon, Rachel

    2012-01-01

    This study examined the parental and spousal self-efficacy (SE) of adults who are deaf and who are hard of hearing (d/hh) in relation to their speech intelligibility. Forty individuals with hearing loss completed self-report measures: Spousal SE in a relationship with a spouse who was hearing/deaf, parental SE to a child who was hearing/deaf, and…

  2. Acoustic source characteristics, across-formant integration, and speech intelligibility under competitive conditions.

    PubMed

    Roberts, Brian; Summers, Robert J; Bailey, Peter J

    2015-06-01

    An important aspect of speech perception is the ability to group or select formants using cues in the acoustic source characteristics--for example, fundamental frequency (F0) differences between formants promote their segregation. This study explored the role of more radical differences in source characteristics. Three-formant (F1+F2+F3) synthetic speech analogues were derived from natural sentences. In Experiment 1, F1+F3 were generated by passing a harmonic glottal source (F0 = 140 Hz) through second-order resonators (H1+H3); in Experiment 2, F1+F3 were tonal (sine-wave) analogues (T1+T3). F2 could take either form (H2 or T2). In some conditions, the target formants were presented alone, either monaurally or dichotically (left ear = F1+F3; right ear = F2). In others, they were accompanied by a competitor for F2 (F1+F2C+F3; F2), which listeners must reject to optimize recognition. Competitors (H2C or T2C) were created using the time-reversed frequency and amplitude contours of F2. Dichotic presentation of F2 and F2C ensured that the impact of the competitor arose primarily through informational masking. In the absence of F2C, the effect of a source mismatch between F1+F3 and F2 was relatively modest. When F2C was present, intelligibility was lowest when F2 was tonal and F2C was harmonic, irrespective of which type matched F1+F3. This finding suggests that source type and context, rather than similarity, govern the phonetic contribution of a formant. It is proposed that wideband harmonic analogues are more effective informational maskers than narrowband tonal analogues, and so become dominant in across-frequency integration of phonetic information when placed in competition. PMID:25751040

  4. The intelligibility of noise-vocoded speech: spectral information available from across-channel comparison of amplitude envelopes

    PubMed Central

    Roberts, Brian; Summers, Robert J.; Bailey, Peter J.

    2011-01-01

    Noise-vocoded (NV) speech is often regarded as conveying phonetic information primarily through temporal-envelope cues rather than spectral cues. However, listeners may infer the formant frequencies in the vocal-tract output—a key source of phonetic detail—from across-band differences in amplitude when speech is processed through a small number of channels. The potential utility of this spectral information was assessed for NV speech created by filtering sentences into six frequency bands, and using the amplitude envelope of each band (≤30 Hz) to modulate a matched noise-band carrier (N). Bands were paired, corresponding to F1 (≈N1 + N2), F2 (≈N3 + N4) and the higher formants (F3′ ≈ N5 + N6), such that the frequency contour of each formant was implied by variations in relative amplitude between bands within the corresponding pair. Three-formant analogues (F0 = 150 Hz) of the NV stimuli were synthesized using frame-by-frame reconstruction of the frequency and amplitude of each formant. These analogues were less intelligible than the NV stimuli or analogues created using contours extracted from spectrograms of the original sentences, but more intelligible than when the frequency contours were replaced with constant (mean) values. Across-band comparisons of amplitude envelopes in NV speech can provide phonetically important information about the frequency contours of the underlying formants. PMID:21068039
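
    The six-band noise vocoding described here follows the standard recipe: split the speech into frequency bands, extract each band's low-rate envelope, and re-impose the envelopes on noise carriers filtered into the same bands. A compact sketch under those assumptions (log-spaced band edges and filter orders are illustrative):

    ```python
    import numpy as np
    from scipy.signal import butter, sosfiltfilt, hilbert

    def noise_vocode(x, fs, n_bands=6, f_lo=100.0, f_hi=7000.0, env_cut_hz=30.0):
        """Noise vocoder: band envelopes (<= env_cut_hz) modulate noise carriers."""
        edges = np.geomspace(f_lo, f_hi, n_bands + 1)
        env_sos = butter(2, env_cut_hz / (fs / 2), output='sos')
        noise = np.random.randn(len(x))
        out = np.zeros(len(x))
        for lo, hi in zip(edges[:-1], edges[1:]):
            band_sos = butter(4, [lo / (fs / 2), hi / (fs / 2)],
                              btype='band', output='sos')
            band = sosfiltfilt(band_sos, x)
            env = np.clip(sosfiltfilt(env_sos, np.abs(hilbert(band))), 0.0, None)
            out += env * sosfiltfilt(band_sos, noise)  # envelope-modulated carrier
        return out
    ```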

  5. Side effects of fast-acting dynamic range compression that affect intelligibility in a competing speech task

    NASA Astrophysics Data System (ADS)

    Stone, Michael A.; Moore, Brian C. J.

    2004-10-01

    Using a cochlear implant simulator, Stone and Moore [J. Acoust. Soc. Am. 114, 1023-1034 (2003)] reported that wideband fast-acting compression led to poorer intelligibility than slow-acting compression in a competing speech task. Compression speed was varied by using different pairs of attack and release times. In the first experiment reported here, it is shown that attack times less than about 2 ms in a wideband compressor are deleterious to intelligibility. In experiment 2, fast wideband compression was applied to the target and background either before or after mixing. The former reduced the modulation depth of each signal but maintained the independence between the two signals, while the latter introduced "comodulation." Using simulations with 6 and 11 channels, intelligibility was higher when compression was applied before mixing. In experiment 3, wideband compression was compared with multichannel compression; the latter led to reduced comodulation effects. For 6 channels, the position of the compressor, either wideband or within each channel, had no effect on intelligibility. For 11 channels, channel compression severely degraded intelligibility compared to wideband compression, presumably because of the greater reduction of across-channel contrasts. Overall, caution appears necessary in the use of fast-acting compression in cochlear implants, so as to preserve intelligibility.

  6. Expanding the phenotypic profile of Kleefstra syndrome: A female with low-average intelligence and childhood apraxia of speech.

    PubMed

    Samango-Sprouse, Carole; Lawson, Patrick; Sprouse, Courtney; Stapleton, Emily; Sadeghin, Teresa; Gropman, Andrea

    2016-05-01

    Kleefstra syndrome (KS) is a rare neurogenetic disorder most commonly caused by deletion in the 9q34.3 chromosomal region and is associated with intellectual disabilities, severe speech delay, and motor planning deficits. To our knowledge, this is the first patient (PQ, a 6-year-old female) with a 9q34.3 deletion who has near normal intelligence and developmental dyspraxia with childhood apraxia of speech (CAS). At age 6, testing with the Wechsler Preschool and Primary Scale of Intelligence (WPPSI-III) revealed a Verbal IQ of 81 and a Performance IQ of 79. The Beery-Buktenica Test of Visual Motor Integration, 5th Edition (VMI) indicated severe visual motor deficits: VMI = 51; Visual Perception = 48; Motor Coordination < 45. On the Receptive One Word Picture Vocabulary Test-R (ROWPVT-R), she had standard scores of 96 and 99, in contrast to Expressive One Word Picture Vocabulary Test-R (EOWPVT-R) standard scores of 73 and 82, revealing a discrepancy between vocabulary domains on both evaluations. The Preschool Language Scale-4 (PLS-4) at PQ's first evaluation revealed a significant difference between auditory comprehension and expressive communication, with standard scores of 78 and 57, respectively, further supporting the presence of CAS. This patient's near normal intelligence expands the phenotypic profile as well as the prognosis associated with KS. The identification of CAS in this patient provides a novel explanation for the previously reported speech delay and expressive language disorder. Further research is warranted on the impact of CAS on intelligence and behavioral outcome in KS. Therapeutic and prognostic implications are discussed. PMID:26833960

  7. Acoustic Predictors of Intelligibility for Segmentally Interrupted Speech: Temporal Envelope, Voicing, and Duration

    ERIC Educational Resources Information Center

    Fogerty, Daniel

    2013-01-01

    Purpose: Temporal interruption limits the perception of speech to isolated temporal glimpses. An analysis was conducted to determine the acoustic parameter that best predicts speech recognition from temporal fragments that preserve different types of speech information--namely, consonants and vowels. Method: Young listeners with normal hearing…

  8. An Ecosystem of Intelligent ICT Tools for Speech-Language Therapy Based on a Formal Knowledge Model.

    PubMed

    Robles-Bykbaev, Vladimir; López-Nores, Martín; Pazos-Arias, José; Quisi-Peralta, Diego; García-Duque, Jorge

    2015-01-01

    Language and communication constitute the mainstays for the development of several intellectual and cognitive skills in humans. However, millions of people around the world suffer from disabilities and disorders related to language and communication, while most countries lack corresponding health-care and rehabilitation services. On these grounds, we are working to develop an ecosystem of intelligent ICT tools to support speech and language pathologists, doctors, students, patients and their relatives. This ecosystem has several layers and components, integrating Electronic Health Records management, standardized vocabularies, a knowledge database, an ontology of concepts from the speech-language domain, and an expert system. We discuss the advantages of such an approach through experiments carried out in several institutions assisting children with a wide spectrum of disabilities. PMID:26262008

  9. Comparing Binaural Pre-processing Strategies III: Speech Intelligibility of Normal-Hearing and Hearing-Impaired Listeners.

    PubMed

    Völker, Christoph; Warzybok, Anna; Ernst, Stephan M A

    2015-01-01

    A comprehensive evaluation of eight signal pre-processing strategies, including directional microphones, coherence filters, single-channel noise reduction, binaural beamformers, and their combinations, was undertaken with normal-hearing (NH) and hearing-impaired (HI) listeners. Speech reception thresholds (SRTs) were measured in three noise scenarios (multitalker babble, cafeteria noise, and single competing talker). Predictions of three common instrumental measures were compared with the general perceptual benefit caused by the algorithms. The individual SRTs measured without pre-processing and individual benefits were objectively estimated using the binaural speech intelligibility model. Ten listeners with NH and 12 HI listeners participated. The participants varied in age and pure-tone threshold levels. Although HI listeners required a better signal-to-noise ratio to obtain 50% intelligibility than listeners with NH, no differences in SRT benefit from the different algorithms were found between the two groups. With the exception of single-channel noise reduction, all algorithms showed an improvement in SRT of between 2.1 dB (in cafeteria noise) and 4.8 dB (in the single-competing-talker condition). Model predictions with the binaural speech intelligibility model explained 83% of the measured variance of the individual SRTs in the no pre-processing condition. Regarding the benefit from the algorithms, the instrumental measures were not able to predict the perceptual data in all tested noise conditions. The comparable benefit observed for both groups suggests a possible application of noise reduction schemes for listeners with different hearing status. Although the model can predict the individual SRTs without pre-processing, further development is necessary to predict the benefits obtained from the algorithms at an individual level. PMID:26721922
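
    SRTs of the kind reported here are usually measured adaptively: the SNR is lowered after a correct response and raised after an error, so the track converges on 50% intelligibility. A deliberately simplified sketch (real matrix-sentence tests use word scoring and shrinking step sizes):

    ```python
    def track_srt(trial_correct, start_snr_db=0.0, step_db=2.0, n_trials=20):
        """1-down/1-up adaptive track converging on ~50% correct.
        trial_correct: callable taking an SNR in dB and returning True/False."""
        snr, history = start_snr_db, []
        for _ in range(n_trials):
            history.append(snr)
            snr += -step_db if trial_correct(snr) else step_db
        tail = history[n_trials // 2:]                # discard the approach phase
        return sum(tail) / len(tail)                  # SRT estimate in dB
    ```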

  10. A comparison of NAL and DSL prescriptive methods for paediatric hearing-aid fitting: Predicted speech intelligibility and loudness

    PubMed Central

    Ching, Teresa Y.C.; Johnson, Earl E.; Hou, Sanna; Dillon, Harvey; Zhang, Vicky; Burns, Lauren; van Buynder, Patricia; Wong, Angela; Flynn, Christopher

    2013-01-01

    OBJECTIVE To examine the impact of prescription on predicted speech intelligibility and loudness for children. DESIGN A between-group comparison of Speech Intelligibility Index (SII) and loudness, based on hearing aids fitted according to NAL-NL1, DSL v4.1, or DSL m[i/o] prescriptions. A within-group comparison of gains prescribed by DSL m[i/o] and NAL-NL2 for children in terms of SII and loudness. STUDY SAMPLE Participants were 200 children, who were randomly assigned to first hearing-aid fitting with either NAL-NL1, DSL v4.1, or DSL m[i/o]. Audiometric data and hearing aid data at 3 years of age were used. RESULTS On average, SIIs calculated on the basis of hearing-aid gains were higher for DSL than for NAL-NL1 at low input level, equivalent at medium input level, and higher for NAL-NL1 than DSL at high input level. Greater loudness was associated with DSL than with NAL-NL1 across a range of input levels. Comparing NAL-NL2 and DSL m[i/o] target gains revealed a higher SII for the latter at low input level. SII was higher for NAL-NL2 than for DSL m[i/o] at medium and high input levels despite greater loudness for gains prescribed by DSL m[i/o] than by NAL-NL2. CONCLUSION The choice of prescription has minimal effects on speech intelligibility predictions but marked effects on loudness predictions. PMID:24350692

  11. Spectrotemporal modulation sensitivity for hearing-impaired listeners: Dependence on carrier center frequency and the relationship to speech intelligibility

    PubMed Central

    Mehraei, Golbarg; Gallun, Frederick J.; Leek, Marjorie R.; Bernstein, Joshua G. W.

    2014-01-01

    Poor speech understanding in noise by hearing-impaired (HI) listeners is only partly explained by elevated audiometric thresholds. Suprathreshold-processing impairments such as reduced temporal or spectral resolution or temporal fine-structure (TFS) processing ability might also contribute. Although speech contains dynamic combinations of temporal and spectral modulation and TFS content, these capabilities are often treated separately. Modulation-depth detection thresholds for spectrotemporal modulation (STM) applied to octave-band noise were measured for normal-hearing and HI listeners as a function of temporal modulation rate (4–32 Hz), spectral ripple density [0.5–4 cycles/octave (c/o)] and carrier center frequency (500–4000 Hz). STM sensitivity was worse than normal for HI listeners only for a low-frequency carrier (1000 Hz) at low temporal modulation rates (4–12 Hz) and a spectral ripple density of 2 c/o, and for a high-frequency carrier (4000 Hz) at a high spectral ripple density (4 c/o). STM sensitivity for the 4-Hz, 4-c/o condition for a 4000-Hz carrier and for the 4-Hz, 2-c/o condition for a 1000-Hz carrier were correlated with speech-recognition performance in noise after partialling out the audiogram-based speech-intelligibility index. Poor speech-reception and STM-detection performance for HI listeners may be related to a combination of reduced frequency selectivity and a TFS-processing deficit limiting the ability to track spectral-peak movements. PMID:24993215
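
    STM stimuli like these are commonly synthesized as a dense tone complex whose component levels follow a sinusoid in both time (rate in Hz) and log-frequency (density in cycles/octave). An illustrative sketch; all parameter values are assumptions, not those of the study.

    ```python
    import numpy as np

    def stm_noise(fs=16000, dur_s=1.0, fc_hz=1000.0, bw_oct=1.0, rate_hz=4.0,
                  density_c_per_oct=2.0, depth_db=20.0, n_comp=200):
        """Octave-band tone complex with a sinusoidal spectro-temporal ripple."""
        t = np.arange(int(fs * dur_s)) / fs
        octs = np.random.uniform(-bw_oct / 2, bw_oct / 2, n_comp)  # re: fc, octaves
        phases = np.random.uniform(0, 2 * np.pi, n_comp)
        x = np.zeros_like(t)
        for o, ph in zip(octs, phases):
            f = fc_hz * 2.0 ** o
            level_db = (depth_db / 2) * np.sin(
                2 * np.pi * (rate_hz * t + density_c_per_oct * o))
            x += 10 ** (level_db / 20) * np.sin(2 * np.pi * f * t + ph)
        return x / np.max(np.abs(x))
    ```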

  12. Lip movements entrain the observers' low-frequency brain oscillations to facilitate speech intelligibility.

    PubMed

    Park, Hyojin; Kayser, Christoph; Thut, Gregor; Gross, Joachim

    2016-01-01

    During continuous speech, lip movements provide visual temporal signals that facilitate speech processing. Here, using MEG we directly investigated how these visual signals interact with rhythmic brain activity in participants listening to and seeing the speaker. First, we investigated coherence between oscillatory brain activity and speaker's lip movements and demonstrated significant entrainment in visual cortex. We then used partial coherence to remove contributions of the coherent auditory speech signal from the lip-brain coherence. Comparing this synchronization between different attention conditions revealed that attending visual speech enhances the coherence between activity in visual cortex and the speaker's lips. Further, we identified a significant partial coherence between left motor cortex and lip movements and this partial coherence directly predicted comprehension accuracy. Our results emphasize the importance of visually entrained and attention-modulated rhythmic brain activity for the enhancement of audiovisual speech processing. PMID:27146891
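
    The lip-brain entrainment reported here is quantified by spectral coherence between a lip-aperture time series and a neural signal. A toy example with synthetic signals sharing a roughly 3 Hz rhythm, using scipy.signal.coherence (the partial-coherence step that removes the auditory contribution is not shown):

    ```python
    import numpy as np
    from scipy.signal import coherence

    fs = 250.0                                   # assumed common sampling rate
    t = np.arange(0, 60, 1 / fs)
    lip = np.sin(2 * np.pi * 3 * t) + 0.5 * np.random.randn(len(t))        # lip aperture
    meg = 0.6 * np.sin(2 * np.pi * 3 * t + 0.8) + np.random.randn(len(t))  # one sensor

    f, coh = coherence(lip, meg, fs=fs, nperseg=1024)
    print(f"peak coherence {coh.max():.2f} at {f[np.argmax(coh)]:.1f} Hz")
    ```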

  13. Effect of the speed of a single-channel dynamic range compressor on intelligibility in a competing speech task

    NASA Astrophysics Data System (ADS)

    Stone, Michael A.; Moore, Brian C. J.

    2003-08-01

    Using a "noise-vocoder" cochlear implant simulator [Shannon et al., Science 270, 303-304 (1995)], the effect of the speed of dynamic range compression on speech intelligibility was assessed, using normal-hearing subjects. The target speech had a level 5 dB above that of the competing speech. Initially, baseline performance was measured with no compression active, using between 4 and 16 processing channels. Then, performance was measured using a fast-acting compressor and a slow-acting compressor, each operating prior to the vocoder simulation. The fast system produced significant gain variation over syllabic timescales. The slow system produced significant gain variation only over the timescale of sentences. With no compression active, about six channels were necessary to achieve 50% correct identification of words in sentences. Sixteen channels produced near-maximum performance. Slow-acting compression produced no significant degradation relative to the baseline. However, fast-acting compression consistently reduced performance relative to that for the baseline, over a wide range of performance levels. It is suggested that fast-acting compression degrades performance for two reasons: (1) because it introduces correlated fluctuations in amplitude in different frequency bands, which tends to produce perceptual fusion of the target and background sounds and (2) because it reduces amplitude modulation depth and intensity contrasts.
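
    The fast- and slow-acting systems compared here differ mainly in the attack and release time constants of the level detector that drives the gain. A single-channel sketch of that mechanism, with illustrative (not the study's) parameter values, assuming a NumPy signal array:

    ```python
    import numpy as np

    def compress(x, fs, threshold_db=-30.0, ratio=3.0,
                 attack_ms=5.0, release_ms=200.0):
        """Single-channel compressor: asymmetric level smoothing drives the gain."""
        a_att = np.exp(-1.0 / (fs * attack_ms / 1000.0))
        a_rel = np.exp(-1.0 / (fs * release_ms / 1000.0))
        level, y = 1e-6, np.empty(len(x))
        for n, s in enumerate(x):
            a = a_att if abs(s) > level else a_rel      # fast up, slow down
            level = a * level + (1 - a) * abs(s)
            over_db = max(20 * np.log10(max(level, 1e-6)) - threshold_db, 0.0)
            y[n] = s * 10 ** (-over_db * (1 - 1 / ratio) / 20)
        return y
    ```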

  14. Memory performance on the Auditory Inference Span Test is independent of background noise type for young adults with normal hearing at high speech intelligibility.

    PubMed

    Rönnberg, Niklas; Rudner, Mary; Lunner, Thomas; Stenfelt, Stefan

    2014-01-01

    Listening in noise is often perceived to be effortful. This is partly because cognitive resources are engaged in separating the target signal from background noise, leaving fewer resources for storage and processing of the content of the message in working memory. The Auditory Inference Span Test (AIST) is designed to assess listening effort by measuring the ability to maintain and process heard information. The aim of this study was to use AIST to investigate the effect of background noise types and signal-to-noise ratio (SNR) on listening effort, as a function of working memory capacity (WMC) and updating ability (UA). The AIST was administered in three types of background noise: steady-state speech-shaped noise, amplitude modulated speech-shaped noise, and unintelligible speech. Three SNRs targeting 90% speech intelligibility or better were used in each of the three noise types, giving nine different conditions. The reading span test assessed WMC, while UA was assessed with the letter memory test. Twenty young adults with normal hearing participated in the study. Results showed that AIST performance was not influenced by noise type at the same intelligibility level, but became worse with worse SNR when background noise was speech-like. Performance on AIST also decreased with increasing memory load level. Correlations between AIST performance and the cognitive measurements suggested that WMC is of more importance for listening when SNRs are worse, while UA is of more importance for listening in easier SNRs. The results indicated that in young adults with normal hearing, the effort involved in listening in noise at high intelligibility levels is independent of the noise type. However, when noise is speech-like and intelligibility decreases, listening effort increases, probably due to extra demands on cognitive resources added by the informational masking created by the speech fragments and vocal sounds in the background noise. PMID:25566159
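
    Setting the SNR conditions in experiments like this amounts to scaling the masker so the speech-to-noise power ratio hits the target before mixing. A minimal sketch, assuming NumPy arrays with the noise at least as long as the speech:

    ```python
    import numpy as np

    def mix_at_snr(speech, noise, snr_db):
        """Scale noise so the speech-to-noise power ratio equals snr_db, then mix."""
        noise = noise[:len(speech)]
        p_s, p_n = np.mean(speech**2), np.mean(noise**2)
        scale = np.sqrt(p_s / (p_n * 10 ** (snr_db / 10)))
        return speech + scale * noise
    ```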

  15. Effects of fast-acting high-frequency compression on the intelligibility of speech in steady and fluctuating background sounds.

    PubMed

    Stone, M A; Moore, B C; Wojtczak, M; Gudgin, E

    1997-08-01

    This study examines whether speech intelligibility in background sounds can be improved for persons with loudness recruitment by the use of fast-acting compression applied at high frequencies, when the overall level of the sounds is held constant by means of a slow-acting automatic gain control (AGC) system and when appropriate frequency-response shaping is applied. Two types of fast-acting compression were used in the high-frequency channel of a two-channel system: a compression limiter with a 10:1 compression ratio and with a compression threshold about 9 dB below the peak level of the signal in the high-frequency channel; and a wide dynamic range compressor with a 2:1 compression ratio and with the compression threshold about 24 dB below the peak level of the signal in the high-frequency channel. A condition with linear processing in the high-frequency channel was also used. Speech reception thresholds (SRTs) were measured for two background sounds: a steady speech-shaped noise and a single male talker. All subjects had moderate-to-severe sensorineural hearing loss. Three different types of speech material were used: the adaptive sentence lists (ASL), the Bamford-Kowal-Bench (BKB) sentence lists and the Boothroyd word lists. For the steady background noise, the compression generally led to poorer performance than for the linear condition, although the deleterious effect was only significant for the 10:1 compression ratio. For the background of a single talker, the compression had no significant effect except for the ASL sentences, where the 10:1 compression gave significantly better performance than the linear condition. Overall, the results did not show any clear benefits of the fast-acting compression, possibly because the slow-acting AGC allowed the use of gains in the linear condition that were markedly higher than would normally be used with linear hearing aids. PMID:9307821

  16. The Speech Intelligibility Index and the Pure-Tone Average as Predictors of Lexical Ability in Children Fit with Hearing Aids

    ERIC Educational Resources Information Center

    Stiles, Derek J.; Bentler, Ruth A.; McGregor, Karla K.

    2012-01-01

    Purpose: To determine whether a clinically obtainable measure of audibility, the aided Speech Intelligibility Index (SII; American National Standards Institute, 2007), is more sensitive than the pure-tone average (PTA) at predicting the lexical abilities of children who wear hearing aids (CHA). Method: School-age CHA and age-matched children with…
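
    Of the two predictors compared here, the PTA is simply the mean audiometric threshold at 0.5, 1, and 2 kHz; the aided SII requires the full ANSI S3.5 band-importance procedure and is not sketched. A PTA computation with an invented example audiogram:

    ```python
    import numpy as np

    def pure_tone_average(thresholds_db_hl, freqs=(500, 1000, 2000)):
        """Conventional PTA: mean threshold (dB HL) at 0.5, 1, and 2 kHz."""
        audiogram = dict(thresholds_db_hl)
        return float(np.mean([audiogram[f] for f in freqs]))

    print(pure_tone_average({250: 20, 500: 30, 1000: 40, 2000: 55, 4000: 70}))
    # -> 41.666... dB HL
    ```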

  17. Can Children Substitute for Adult Listeners in Judging the Intelligibility of the Speech of Children Who Are Deaf or Hard of Hearing?

    ERIC Educational Resources Information Center

    Kloiber, Diana True; Ertmer, David J.

    2015-01-01

    Purpose: Assessments of the intelligibility of speech produced by children who are deaf or hard of hearing (D/HH) provide unique insights into functional speaking ability, readiness for mainstream classroom placements, and intervention effectiveness. The development of sentence lists for a wide age range of children and the advent of handheld…

  18. Rasch Analysis of Word Identification and Magnitude Estimation Scaling Responses in Measuring Naive Listeners' Judgments of Speech Intelligibility of Children with Severe-to-Profound Hearing Impairments

    ERIC Educational Resources Information Center

    Beltyukova, Svetlana A.; Stone, Gregory M.; Ellis, Lee W.

    2008-01-01

    Purpose: Speech intelligibility research typically relies on traditional evidence of reliability and validity. This investigation used Rasch analysis to enhance understanding of the functioning and meaning of scores obtained with 2 commonly used procedures: word identification (WI) and magnitude estimation scaling (MES). Method: Narrative samples…

  19. The Influence of Cochlear Mechanical Dysfunction, Temporal Processing Deficits, and Age on the Intelligibility of Audible Speech in Noise for Hearing-Impaired Listeners.

    PubMed

    Johannesen, Peter T; Pérez-González, Patricia; Kalluri, Sridhar; Blanco, José L; Lopez-Poveda, Enrique A

    2016-01-01

    The aim of this study was to assess the relative importance of cochlear mechanical dysfunction, temporal processing deficits, and age on the ability of hearing-impaired listeners to understand speech in noisy backgrounds. Sixty-eight listeners took part in the study. They were provided with linear, frequency-specific amplification to compensate for their audiometric losses, and intelligibility was assessed for speech-shaped noise (SSN) and a time-reversed two-talker masker (R2TM). Behavioral estimates of cochlear gain loss and residual compression were available from a previous study and were used as indicators of cochlear mechanical dysfunction. Temporal processing abilities were assessed using frequency modulation detection thresholds. Age, audiometric thresholds, and the difference between audiometric threshold and cochlear gain loss were also included in the analyses. Stepwise multiple linear regression models were used to assess the relative importance of the various factors for intelligibility. Results showed that (a) cochlear gain loss was unrelated to intelligibility, (b) residual cochlear compression was related to intelligibility in SSN but not in a R2TM, (c) temporal processing was strongly related to intelligibility in a R2TM and much less so in SSN, and (d) age per se impaired intelligibility. In summary, all factors affected intelligibility, but their relative importance varied across maskers. PMID:27604779

  1. Intelligibility Assessment of Ideal Binary-Masked Noisy Speech with Acceptance of Room Acoustic

    NASA Astrophysics Data System (ADS)

    Sedlak, Vladimír; Durackova, Daniela; Zalusky, Roman; Kovacik, Tomas

    2015-01-01

    In this paper, the intelligibility of an ideal binary-masked noisy signal is evaluated for different signal-to-noise ratios (SNRs), mask errors, masker types, source-receiver distances, reverberation times, and local criteria for forming the binary mask. The ideal binary mask is computed from time-frequency decompositions of the target and masker signals by thresholding the local SNR within time-frequency units. The intelligibility of the separated signal is measured using different objective measures computed in the frequency and perceptual domains. The present study replicates and extends previously presented findings, but mainly shows the impact of room acoustics on the intelligibility performance of the IBM technique.
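
    The ideal binary mask evaluated here keeps only the time-frequency units whose local SNR, computed from separately available target and masker signals, exceeds the local criterion (LC). A compact STFT-based sketch, assuming SciPy and equal-length input arrays:

    ```python
    import numpy as np
    from scipy.signal import stft, istft

    def ideal_binary_mask(target, masker, fs, lc_db=0.0, nperseg=512):
        """Apply the IBM: keep units with local SNR above lc_db, zero the rest."""
        _, _, T = stft(target, fs=fs, nperseg=nperseg)
        _, _, M = stft(masker, fs=fs, nperseg=nperseg)
        local_snr_db = 10 * np.log10((np.abs(T)**2 + 1e-12) / (np.abs(M)**2 + 1e-12))
        mask = (local_snr_db > lc_db).astype(float)
        _, _, X = stft(target + masker, fs=fs, nperseg=nperseg)
        _, y = istft(X * mask, fs=fs, nperseg=nperseg)
        return y
    ```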

  2. The Effects of Different Frequency Responses on Sound Quality Judgments and Speech Intelligibility.

    ERIC Educational Resources Information Center

    Gabrielsson, Alf; And Others

    1988-01-01

    Twelve hearing-impaired and eight normal-hearing adults listened to speech and music programs that were reproduced using five different frequency responses (one flat, the others combinations of reduced lower frequencies and/or increased higher frequencies). Most preferred was a flat response at lower frequencies and a 6dB/octave increase…

  3. Inferior Frontal Sensitivity to Common Speech Sounds Is Amplified by Increasing Word Intelligibility

    ERIC Educational Resources Information Center

    Vaden, Kenneth I., Jr.; Kuchinsky, Stefanie E.; Keren, Noam I.; Harris, Kelly C.; Ahlstrom, Jayne B.; Dubno, Judy R.; Eckert, Mark A.

    2011-01-01

    The left inferior frontal gyrus (LIFG) exhibits increased responsiveness when people listen to words composed of speech sounds that frequently co-occur in the English language (Vaden, Piquado, & Hickok, 2011), termed high phonotactic frequency (Vitevitch & Luce, 1998). The current experiment aimed to further characterize the relation of…

  4. Can Children Substitute for Adult Listeners in Judging the Intelligibility of the Speech of Children Who Are Deaf or Hard of Hearing?

    PubMed Central

    Kloiber, Diana True

    2015-01-01

    Purpose Assessments of the intelligibility of speech produced by children who are deaf or hard of hearing (D/HH) provide unique insights into functional speaking ability, readiness for mainstream classroom placements, and intervention effectiveness. The development of sentence lists for a wide age range of children and the advent of handheld digital recording devices have overcome two barriers to routine use of this tool. Yet, difficulties in recruiting adequate numbers of adults to judge speech samples continue to make routine assessment impractical. In response to this barrier, it has been proposed that children who are 9 years or older might be adequate substitutes for adult listener-judges (Ertmer, 2011). Method To examine this possibility, 22 children from the 3rd, 4th, and 5th grades identified words from speech samples previously judged by adults. Results Children in the 3rd and 4th grades identified fewer words than adults, whereas scores for 5th graders were not significantly different from those of the adults. All grade levels showed increasing scores across low, mid, and high levels of intelligibility. Conclusions Children who are functioning at a 5th grade level or higher can act as listener-judges in speech intelligibility assessments. Suggestions for implementing assessments and scoring child-listeners' written responses are discussed. PMID:25381439

  5. Intelligence.

    PubMed

    Deary, Ian J

    2012-01-01

    Individual differences in human intelligence are of interest to a wide range of psychologists and to many people outside the discipline. This overview of contributions to intelligence research covers the first decade of the twenty-first century. There is a survey of some of the major books that appeared since 2000, at different levels of expertise and from different points of view. Contributions to the phenotype of intelligence differences are discussed, as well as some contributions to causes and consequences of intelligence differences. The major causal issues covered concern the environment and genetics, and how intelligence differences are being mapped to brain differences. The major outcomes discussed are health, education, and socioeconomic status. Aging and intelligence are discussed, as are sex differences in intelligence and whether twins and singletons differ in intelligence. More generally, the degree to which intelligence has become a part of broader research in neuroscience, health, and social science is discussed. PMID:21943169

  6. Speech intelligibility in rooms: Effect of prior listening exposure interacts with room acoustics.

    PubMed

    Zahorik, Pavel; Brandewie, Eugene J

    2016-07-01

    There is now converging evidence that a brief period of prior listening exposure to a reverberant room can influence speech understanding in that environment. Although the effect appears to depend critically on the amplitude modulation characteristic of the speech signal reaching the ear, the extent to which the effect may be influenced by room acoustics has not been thoroughly evaluated. This study seeks to fill this gap in knowledge by testing the effect of prior listening exposure or listening context on speech understanding in five different simulated sound fields, ranging from anechoic space to a room with broadband reverberation time (T60) of approximately 3 s. Although substantial individual variability in the effect was observed and quantified, the context effect was, on average, strongly room dependent. At threshold, the effect was minimal in anechoic space, increased to a maximum of 3 dB on average in moderate reverberation (T60 = 1 s), and returned to minimal levels again in high reverberation. This interaction suggests that the functional effects of prior listening exposure may be limited to sound fields with moderate reverberation (0.4 ≤ T60 ≤ 1 s). PMID:27475133
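
    The broadband reverberation times (T60) that define these sound fields are conventionally estimated from a room impulse response via Schroeder backward integration and a line fit to part of the energy decay. A sketch assuming a NumPy impulse-response array whose decay reaches at least -25 dB:

    ```python
    import numpy as np

    def t60_from_impulse_response(h, fs):
        """Schroeder integration; T60 extrapolated from the -5 to -25 dB decay."""
        edc = np.cumsum(h[::-1]**2)[::-1]             # energy decay curve
        edc_db = 10 * np.log10(edc / edc[0])
        t = np.arange(len(h)) / fs
        sel = (edc_db <= -5.0) & (edc_db >= -25.0)
        slope_db_per_s, _ = np.polyfit(t[sel], edc_db[sel], 1)
        return -60.0 / slope_db_per_s
    ```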

  7. Multimodal Interaction in Ambient Intelligence Environments Using Speech, Localization and Robotics

    ERIC Educational Resources Information Center

    Galatas, Georgios

    2013-01-01

    An Ambient Intelligence Environment is meant to sense and respond to the presence of people, using its embedded technology. In order to effectively sense the activities and intentions of its inhabitants, such an environment needs to utilize information captured from multiple sensors and modalities. By doing so, the interaction becomes more natural…

  8. Speech of Color Naming and Intelligence: Association in Girls, Dissociation in Boys.

    ERIC Educational Resources Information Center

    Jaffe, Joseph; And Others

    1985-01-01

    A rapid naming test was administered to 321 prereaders (five-seven years old). Results showed sex differences in degree of correlation between naming performance and a test of general intelligence. Results bear theoretically on the degree to which a learning disability can appear as an isolated deficit in the two sexes. (Author/CL)

  9. Relationship between Kinematics, F2 Slope and Speech Intelligibility in Dysarthria Due to Cerebral Palsy

    ERIC Educational Resources Information Center

    Rong, Panying; Loucks, Torrey; Kim, Heejin; Hasegawa-Johnson, Mark

    2012-01-01

    A multimodal approach combining acoustics, intelligibility ratings, articulography and surface electromyography was used to examine the characteristics of dysarthria due to cerebral palsy (CP). CV syllables were studied by obtaining the slope of F2 transition during the diphthong, tongue-jaw kinematics during the release of the onset consonant,…

  10. Developing Cultural Intelligence in Preservice Speech-Language Pathologists and Educators

    ERIC Educational Resources Information Center

    Griffer, Mona R.; Perlis, Susan M.

    2007-01-01

    Postsecondary educators preparing future clinicians and teachers have an important responsibility to develop cultural competence of their students in order to meet the increasing and ever-changing demands of today's global workforce and diverse workplace. In this article, the authors discuss key components to developing cultural intelligence.…

  11. Long-term impact of tongue reduction on speech intelligibility, articulation and oromyofunctional behaviour in a child with Beckwith-Wiedemann syndrome.

    PubMed

    Van Lierde, K M; Mortier, G; Huysman, E; Vermeersch, H

    2010-03-01

    The purpose of the present case study was to determine the long-term impact of partial glossectomy (using the keyhole technique) on overall speech intelligibility and articulation in a Dutch-speaking child with Beckwith-Wiedemann syndrome (BWS). Furthermore, the present study is meant as a contribution to the further delineation of the phonation, resonance, articulation and language characteristics and oral behaviour in a child with BWS. Detailed information on the speech and language characteristics of children with BWS may lead to better guidance of pediatric management programs. The child's speech was assessed 9 years after partial glossectomy with regard to ENT characteristics, overall intelligibility (perceptual consensus evaluation), articulation (phonetic and phonological errors), voice (videostroboscopy, vocal quality), resonance (perceptual, nasometric assessment), language (expressive and receptive) and oral behaviour. A class III malocclusion, an anterior open bite, diastema, overangulation of lower incisors and an enlarged but normal, symmetrically shaped tongue were present. The overall speech intelligibility improved from severely impaired (presurgical) to slightly impaired (5 months post-glossectomy) to normal (9 years postoperative). A comparative phonetic inventory showed a remarkable improvement in articulation. Nine years post-glossectomy, three types of distortions seemed to predominate: a rhotacism, a sigmatism and the substitution of the alveolar /z/. Oral behaviour, vocal characteristics and resonance were normal, but problems with expressive syntactic abilities were present. The long-term impact of partial glossectomy, using the keyhole technique (preserving the vascularity and the nervous input of the remaining intrinsic tongue muscles), on speech intelligibility, articulation, and oral behaviour in this Dutch-speaking child with congenital macroglossia can be regarded as successful. It is not clear how these expressive syntactical problems

  12. Developing a wireless speech- and touch-based intelligent comprehensive triage support system.

    PubMed

    Chang, Polun; Sheng, Yu-Hsiang; Sang, Yiing-Yiing; Wang, Da-Wei

    2008-01-01

    The implementation of voice recognition technology has been expected to occur in mobile healthcare settings, but it is the least studied solution in nursing. The objective of this study is to examine its value to mobile nursing. The study was done at a triage station in an emergency department. The system was developed using VB6.0, Microsoft Speech SDK 5.1 and the Simplified Chinese Language pack, and was installed on touchscreen PCs with wireless headsets. Thirty nurses were enrolled. Accuracy rate and operation time were used to measure the subjects' performance. A "willingness to use" score on a scale of 1 to 10 was used to measure subjects' preference for the system. The results showed that the average accuracy rate was 99%, the average operation time was 108 seconds, and the mean "willingness to use" rating was 8.2. This study demonstrates the value of multimodal voice recognition techniques to mobile nursing. PMID:18091619

  13. Speech research directions

    SciTech Connect

    Atal, B.S.; Rabiner, L.R.

    1986-09-01

    This paper presents an overview of the current activities in speech research. The authors discuss the state of the art in speech coding, text-to-speech synthesis, speech recognition, and speaker recognition. In the speech coding area, current algorithms perform well at bit rates down to 9.6 kb/s, and the research is directed at bringing the rate for high-quality speech coding down to 2.4 kb/s. In text-to-speech synthesis, what we currently are able to produce is very intelligible but not yet completely natural. Current research aims at providing higher quality and intelligibility to the synthetic speech that these systems produce. Finally, today's systems for speech and speaker recognition provide excellent performance on limited tasks; i.e., limited vocabulary, modest syntax, small talker populations, constrained inputs, etc.

  14. Intelligence.

    PubMed

    Sternberg, Robert J

    2012-09-01

    Intelligence is the ability to learn from past experience and, in general, to adapt to, shape, and select environments. Aspects of intelligence are measured by standardized tests of intelligence. Average raw (number-correct) scores on such tests vary across the life span and also across generations, as well as across ethnic and socioeconomic groups. Intelligence can be understood in part in terms of the biology of the brain-especially with regard to the functioning in the prefrontal cortex. Measured values correlate with brain size, at least within humans. The heritability coefficient (ratio of genetic to phenotypic variation) is between 0.4 and 0.8. But genes always express themselves through environment. Heritability varies as a function of a number of factors, including socioeconomic status and range of environments. Racial-group differences in measured intelligence have been reported, but race is a socially constructed rather than biological variable. As a result, these differences are difficult to interpret. Different cultures have different conceptions of the nature of intelligence, and also require different skills in order to express intelligence in the environment. WIREs Cogn Sci 2012 doi: 10.1002/wcs.1193 For further resources related to this article, please visit the WIREs website. PMID:26302705

  16. Effects of Within-Talker Variability on Speech Intelligibility in Mandarin-Speaking Adult and Pediatric Cochlear Implant Patients.

    PubMed

    Su, Qiaotong; Galvin, John J; Zhang, Guoping; Li, Yongxin; Fu, Qian-Jie

    2016-01-01

    Cochlear implant (CI) speech performance is typically evaluated using well-enunciated speech produced at a normal rate by a single talker. CI users often have greater difficulty with variations in speech production encountered in everyday listening. Within a single talker, speaking rate, amplitude, duration, and voice pitch information may be quite variable, depending on the production context. The coarse spectral resolution afforded by the CI limits perception of voice pitch, which is an important cue for speech prosody and for tonal languages such as Mandarin Chinese. In this study, sentence recognition from the Mandarin speech perception database was measured in adult and pediatric Mandarin-speaking CI listeners for a variety of speaking styles: voiced speech produced at slow, normal, and fast speaking rates; whispered speech; voiced emotional speech; and voiced shouted speech. Recognition of Mandarin Hearing in Noise Test sentences was also measured. Results showed that performance was significantly poorer with whispered speech relative to the other speaking styles and that performance was significantly better with slow speech than with fast or emotional speech. Results also showed that adult and pediatric performance was significantly poorer with Mandarin Hearing in Noise Test than with Mandarin speech perception sentences at the normal rate. The results suggest that adult and pediatric Mandarin-speaking CI patients are highly susceptible to whispered speech, due to the lack of lexically important voice pitch cues and perhaps other qualities associated with whispered speech. The results also suggest that test materials may contribute to differences in performance observed between adult and pediatric CI users. PMID:27363714

  18. Speech intelligibility index predictions for young and old listeners in automobile noise: Can the index be improved by incorporating factors other than absolute threshold?

    NASA Astrophysics Data System (ADS)

    Saweikis, Meghan; Surprenant, Aimée M.; Davies, Patricia; Gallant, Don

    2003-10-01

    While young and old subjects with comparable audiograms tend to perform comparably on speech recognition tasks in quiet environments, older subjects have more difficulty than younger subjects with recognition tasks in degraded listening conditions. This suggests that factors other than absolute threshold may account for some of the difficulty older listeners have on recognition tasks in noisy environments. Many metrics used to measure speech intelligibility, including the Speech Intelligibility Index (SII), consider only an absolute threshold when accounting for age-related hearing loss. These metrics therefore tend to overestimate performance for elderly listeners in noisy environments [Tobias et al., J. Acoust. Soc. Am. 83, 859-895 (1988)]. The present studies examine the predictive capabilities of the SII in an environment with automobile noise present. This is of interest because people's evaluation of automobile interior sound is closely linked to their ability to carry on conversations with their fellow passengers. The four studies examine whether, for subjects with age-related hearing loss, the accuracy of the SII can be improved by incorporating factors other than absolute threshold into the model. [Work supported by Ford Motor Company.]
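
    At its core, the SII is a band-importance-weighted sum of band audibilities. The following is a minimal sketch of that core computation, assuming illustrative octave-band levels and importance weights and omitting the standard's threshold, spread-of-masking, and level-distortion corrections; the absolute-threshold-only treatment of hearing loss discussed above enters through exactly this kind of simplification.

        # Minimal sketch of the core SII computation (simplified from ANSI S3.5).
        import numpy as np

        def sii(speech_db, noise_db, importance):
            """speech_db, noise_db: per-band levels in dB; importance sums to 1."""
            snr = np.asarray(speech_db, float) - np.asarray(noise_db, float)
            # Band audibility: 0 at or below -15 dB SNR, 1 at or above +15 dB.
            audibility = np.clip((snr + 15.0) / 30.0, 0.0, 1.0)
            return float(np.sum(np.asarray(importance, float) * audibility))

        # Hypothetical octave bands 250-4000 Hz with illustrative importance weights:
        print(sii([50, 55, 52, 48, 45], [40, 42, 47, 50, 44],
                  [0.15, 0.25, 0.25, 0.20, 0.15]))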

  19. Intelligence

    PubMed Central

    Sternberg, Robert J.

    2012-01-01

    Intelligence is the ability to learn from experience and to adapt to, shape, and select environments. Intelligence as measured by (raw scores on) conventional standardized tests varies across the lifespan, and also across generations. Intelligence can be understood in part in terms of the biology of the brain—especially with regard to the functioning in the prefrontal cortex—and also correlates with brain size, at least within humans. Studies of the effects of genes and environment suggest that the heritability coefficient (ratio of genetic to phenotypic variation) is between .4 and .8, although heritability varies as a function of socioeconomic status and other factors. Racial differences in measured intelligence have been observed, but race is a socially constructed rather than biological variable, so such differences are difficult to interpret. PMID:22577301

  1. Distributed processing for speech understanding

    SciTech Connect

    Bronson, E.C.; Siegel, L.

    1983-01-01

    Continuous speech understanding is a highly complex artificial intelligence task requiring extensive computation. This complexity precludes real-time speech understanding on a conventional serial computer. Distributed processing techniques can be applied to the speech understanding task to improve processing speed. In this paper, the speech understanding task and several speech understanding systems are described. Parallel processing techniques are presented and a distributed processing architecture for speech understanding is outlined. 35 references.

  2. Intelligibility of ICAO (International Civil Aviation Organization) spelling alphabet words and digits using severely degraded speech communication systems. Part 1: Narrowband digital speech

    NASA Astrophysics Data System (ADS)

    Schmidt-Nielsen, Astrid

    1987-03-01

    The Diagnostic Rhyme Test (DRT) is widely used to evaluate digital voice systems. Would-be users often have no frame of reference for interpreting DRT scores in terms of performance measures that they can understand, e.g., how many operational words are correctly understood. This research was aimed at providing a better understanding of the effects of very poor quality speech on human communication performance. It is especially important to determine how successful communications are likely to be when the speech quality is severely degraded. This report compares the recognition of ICAO spelling alphabet words (ALFA, BRAVO, CHARLIE, etc.) with DRT scores for the same conditions. Confusions among the spelling alphabet words are also given. Two types of speech degradation were selected for investigation: narrowband digital speech (the DoD standard linear predictive coding algorithm operating at 2400 bits/s) with varying bit-error rates, and analog jamming. The report will be in two parts. Part 1 covers the narrowband digital speech research, and Part 2 will cover the analog speech research.

  3. Speech intelligibility and recall of first and second language words heard at different signal-to-noise ratios.

    PubMed

    Hygge, Staffan; Kjellberg, Anders; Nöstl, Anatole

    2015-01-01

    Free recall of spoken words in Swedish (the native tongue) and English was assessed in two signal-to-noise ratio (SNR) conditions (+3 and +12 dB), with and without half of the heard words being repeated back orally directly after presentation [shadowing, speech intelligibility (SI)]. A total of 24 word lists with 12 words each were presented in English and in Swedish to Swedish-speaking college students. Pre-experimental measures of working memory capacity (operation span, OSPAN) were taken. A basic hypothesis was that recall of the words would be impaired when encoding the words required more processing resources, thereby depleting working memory resources. This would be the case when the SNR was low or when the language was English. A low SNR was also expected to impair SI, but we wanted to compare the sizes of the SNR effects on SI and recall. A low score on working memory capacity was expected to further add to the negative effects of SNR and language on both SI and recall. The results indicated that SNR had strong effects on both SI and recall, but also that the effect size was larger for recall than for SI. Language had a main effect on recall, but not on SI. The shadowing procedure had different effects on recall of the early and late parts of the word lists. Working memory capacity was unimportant for the effects on SI and recall. Thus, recall appears to be a more sensitive indicator than SI for the acoustics of learning, which has implications for building codes and recommendations concerning classrooms and other workplaces where both hearing and learning are important. PMID:26441765

  5. Artificial intelligence

    SciTech Connect

    Firschein, O.

    1984-01-01

    This book presents papers on artificial intelligence. Topics considered include knowledge engineering, expert systems, applications of artificial intelligence to scientific reasoning, planning and problem solving, error recovery in robots through failure reason analysis, programming languages, natural language, speech recognition, map-guided interpretation of remotely-sensed imagery, and image understanding architectures.

  6. Effect of masker type and age on speech intelligibility and spatial release from masking in children and adults

    PubMed Central

    Johnstone, Patti M.; Litovsky, Ruth Y.

    2009-01-01

    Speech recognition in noisy environments improves when the speech signal is spatially separated from the interfering sound. This effect, known as spatial release from masking (SRM), was recently shown in young children. The present study compared SRM in children of ages 5–7 with adults for interferers introducing energetic, informational, and/or linguistic components. Three types of interferers were used: speech, reversed speech, and modulated white noise. Two female voices with different long-term spectra were also used. Speech reception thresholds (SRTs) were compared for: Quiet (target 0° front, no interferer), Front (target and interferer both 0° front), and Right (interferer 90° right, target 0° front). Children had higher SRTs and greater masking than adults. When spatial cues were not available, adults, but not children, were able to use differences in interferer type to separate the target from the interferer. Both children and adults showed SRM. Children, unlike adults, demonstrated large amounts of SRM for a time-reversed speech interferer. In conclusion, masking and SRM vary with the type of interfering sound, and this variation interacts with age; SRM may not depend on the spectral peculiarities of a particular type of voice when the target speech and interfering speech are different sex talkers. PMID:17069314

  7. Impact of Clear, Loud, and Slow Speech on Scaled Intelligibility and Speech Severity in Parkinson's Disease and Multiple Sclerosis

    ERIC Educational Resources Information Center

    Tjaden, Kris; Sussman, Joan E.; Wilding, Gregory E.

    2014-01-01

    Purpose: The perceptual consequences of rate reduction, increased vocal intensity, and clear speech were studied in speakers with multiple sclerosis (MS), Parkinson's disease (PD), and healthy controls. Method: Seventy-eight speakers read sentences in habitual, clear, loud, and slow conditions. Sentences were equated for peak amplitude and…

  8. Speech coding

    SciTech Connect

    Ravishankar, C., Hughes Network Systems, Germantown, MD

    1998-05-08

    Speech coding techniques are equally applicable to any voice signal, whether or not it carries any intelligible information as the term speech implies. Other terms that are commonly used are speech compression and voice compression, since the fundamental idea behind speech coding is to reduce (compress) the transmission rate (or, equivalently, the bandwidth) and/or reduce storage requirements. In this document the terms speech and voice shall be used interchangeably.

  9. Northeast Artificial Intelligence Consortium annual report. 1988 artificial intelligence applications to speech recognition. Volume 8. Interim report, January-December 1988

    SciTech Connect

    Rhody, H.E.; Ridley, T.R.; Biles, J.A.

    1989-10-01

    This report describes progress that has been made in the fourth year of the existence of the NAIC on the technical research tasks undertaken at the member universities. The topics covered in general are: versatile expert system for equipment maintenance, distributed AI for communications system control, automatic photointerpretation, time-oriented problem solving, speech understanding systems, knowledge base maintenance, hardware architectures for very large systems, knowledge-based reasoning and planning, and a knowledge acquisition, assistance, and explanation system. The specific topic for this volume is the design and implementation of a knowledge-based system to read speech spectrograms.

  10. Intelligibility as a Clinical Outcome Measure Following Intervention with Children with Phonologically Based Speech-Sound Disorders

    ERIC Educational Resources Information Center

    Lousada, M.; Jesus, Luis M. T.; Hall, A.; Joffe, V.

    2014-01-01

    Background: The effectiveness of two treatment approaches (phonological therapy and articulation therapy) for treatment of 14 children, aged 4;0-6;7 years, with phonologically based speech-sound disorder (SSD) has been previously analysed with severity outcome measures (percentage of consonants correct score, percentage occurrence of phonological…

  11. Infant Perception of Atypical Speech Signals

    ERIC Educational Resources Information Center

    Vouloumanos, Athena; Gelfand, Hanna M.

    2013-01-01

    The ability to decode atypical and degraded speech signals as intelligible is a hallmark of speech perception. Human adults can perceive sounds as speech even when they are generated by a variety of nonhuman sources including computers and parrots. We examined how infants perceive the speech-like vocalizations of a parrot. Further, we examined how…

  12. Development and evaluation of a Dutch diagnostic rhyme test for assessing the intelligibility of speech communication channels

    NASA Astrophysics Data System (ADS)

    Steeneken, H. J. M.

    1982-06-01

    A subjective intelligibility measuring method, based on a two-alternative forced-choice test, was developed for the Dutch language. This test is comparable with the American Diagnostic Rhyme Test as developed by Voiers. The fundamentals and phonetic background of the Dutch implementation are given. The reproducibility and the relation to other intelligibility measures were studied for 40 different reference channels. The results show that the subjects need not be trained and that the measuring results are comparable with results obtained with phonetically balanced word lists.
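
    Rhyme tests of this kind are scored with a correction for guessing, since each trial offers two response alternatives. Below is a minimal sketch of the usual Voiers-style adjustment; the Dutch test's exact scoring rule is not given in this record.

        # Chance-corrected score for a two-alternative forced-choice rhyme test.
        # With two alternatives, guessing yields 50% correct, so the adjusted
        # score is 100 * (right - wrong) / total rather than raw percent correct.
        def rhyme_test_score(right: int, wrong: int) -> float:
            total = right + wrong
            return 100.0 * (right - wrong) / total

        # e.g. 180 right and 20 wrong out of 200 trials scores 80.0, not 90.0:
        print(rhyme_test_score(180, 20))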

  13. Developing and evaluating a wireless speech-and-touch-based interface for intelligent comprehensive triage support systems.

    PubMed

    Chang, Polun; Sheng, Yu-Hsiang; Sang, Yiing-Yiing; Wang, Da-Wei; Hsu, Yueh-Shuang; Hou, I-Ching

    2006-01-01

    Continuous speech recognition (CSR) technology appears promising in mobile nursing but is not yet well studied. We developed and evaluated bimodal CSR and touchscreen triage support systems in the emergency department (ED) of a 2700-bed medical center in 2004-2005. Evaluation results show that the average accuracy rates of the systems ranged from 94 to 98%. Results suggest that the ED nurses were significantly more willing to use CSR combined with touchscreen systems than the alternatives, such as a PDA or CSR alone. A more flexible interface, rather than efficiency, might be the main reason for this finding. PMID:17102352

  14. SILENT SPEECH DURING SILENT READING.

    ERIC Educational Resources Information Center

    MCGUIGAN, FRANK J.

    EFFORTS WERE MADE IN THIS STUDY TO (1) RELATE THE AMOUNT OF SILENT SPEECH DURING SILENT READING TO LEVEL OF READING PROFICIENCY, INTELLIGENCE, AGE, AND GRADE PLACEMENT OF SUBJECTS, AND (2) DETERMINE WHETHER THE AMOUNT OF SILENT SPEECH DURING SILENT READING IS AFFECTED BY THE LEVEL OF DIFFICULTY OF PROSE READ AND BY THE READING OF A FOREIGN…

  15. CONTROLAB: integration of intelligent systems for speech recognition, image processing, and trajectory control with obstacle avoidance aiming at robotics applications

    NASA Astrophysics Data System (ADS)

    Aude, Eliana P. L.; Silveira, Julio T. C.; Silva, Fabricio A. B.; Martins, Mario F.; Serdeira, Henrique; Lopes, Emerson P.

    1997-12-01

    CONTROLAB is an environment which integrates intelligent systems and control algorithms aiming at applications in the area of robotics. Within CONTROLAB, two neural network architectures, based on the backpropagation and recursive models, are proposed for the implementation of a robust speaker-independent word recognition system. The robustness of the system using the backpropagation network has been largely verified through use by children and adults in totally uncontrolled environments, such as large public halls for the exhibition of new technology products. Experimental results with the recursive network show that it is able to overcome the backpropagation network's major drawback, the frequent generation of false alarms. In addition, within CONTROLAB, the trajectory to be followed by a robot arm under self-tuning control is determined by a system which uses either the VGRAPH or PFIELD algorithms to avoid obstacles detected by the computer vision system. The performance of the second algorithm is greatly improved when it is applied under the control of a rule-based system. An application in which a SCARA robot arm is commanded by voice to pick up a specific tool placed on a table among other tools and obstacles is currently running. This application is used to evaluate the performance of each sub-system within CONTROLAB.

  16. Acoustics of Clear Speech: Effect of Instruction

    ERIC Educational Resources Information Center

    Lam, Jennifer; Tjaden, Kris; Wilding, Greg

    2012-01-01

    Purpose: This study investigated how different instructions for eliciting clear speech affected selected acoustic measures of speech. Method: Twelve speakers were audio-recorded reading 18 different sentences from the Assessment of Intelligibility of Dysarthric Speech (Yorkston & Beukelman, 1984). Sentences were produced in habitual, clear,…

  17. Audiovisual Asynchrony Detection in Human Speech

    ERIC Educational Resources Information Center

    Maier, Joost X.; Di Luca, Massimiliano; Noppeney, Uta

    2011-01-01

    Combining information from the visual and auditory senses can greatly enhance intelligibility of natural speech. Integration of audiovisual speech signals is robust even when temporal offsets are present between the component signals. In the present study, we characterized the temporal integration window for speech and nonspeech stimuli with…

  18. Cognitive Functions in Childhood Apraxia of Speech

    ERIC Educational Resources Information Center

    Nijland, Lian; Terband, Hayo; Maassen, Ben

    2015-01-01

    Purpose: Childhood apraxia of speech (CAS) is diagnosed on the basis of specific speech characteristics, in the absence of problems in hearing, intelligence, and language comprehension. This does not preclude the possibility that children with this speech disorder might demonstrate additional problems. Method: Cognitive functions were investigated…

  19. Speech Perception in Individuals with Auditory Neuropathy

    ERIC Educational Resources Information Center

    Zeng, Fan-Gang; Liu, Sheng

    2006-01-01

    Purpose: Speech perception in participants with auditory neuropathy (AN) was systematically studied to answer the following 2 questions: Does noise present a particular problem for people with AN? Can clear speech and cochlear implants alleviate this problem? Method: The researchers evaluated the advantage in intelligibility of clear speech over…

  20. Universal Nonverbal Intelligence Test.

    ERIC Educational Resources Information Center

    Bracken, Bruce A.; McCallum, R. Steve

    This kit presents all components of the Universal Nonverbal Intelligence Test (UNIT), a newly developed instrument designed to measure the general intelligence and cognitive abilities of children and adolescents (ages 5 through 17) who may be disadvantaged by traditional verbal and language-loaded measures such as children with speech, language,…

  1. Speech transmission index from running speech: A neural network approach

    NASA Astrophysics Data System (ADS)

    Li, F. F.; Cox, T. J.

    2003-04-01

    Speech transmission index (STI) is an important objective parameter concerning speech intelligibility for sound transmission channels. It is normally measured with specific test signals to ensure high accuracy and good repeatability. Measurement with running speech was previously proposed, but accuracy is compromised and hence applications limited. A new approach that uses artificial neural networks to accurately extract the STI from received running speech is developed in this paper. Neural networks are trained on a large set of transmitted speech examples with prior knowledge of the transmission channels' STIs. The networks perform complicated nonlinear function mappings and spectral feature memorization to enable accurate objective parameter extraction from transmitted speech. Validations via simulations demonstrate the feasibility of this new method on a one-net-one-speech extract basis. In this case, accuracy is comparable with normal measurement methods. This provides an alternative to standard measurement techniques, and it is intended that the neural network method can facilitate occupied room acoustic measurements.
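
    A minimal sketch of the approach follows, assuming hypothetical input features (modulation-spectrum energies of octave-band envelopes, in the spirit of the STI's modulation-transfer view); the paper's actual features and network architecture are not specified in this record.

        # Sketch: estimate STI from received running speech with a small neural net.
        import numpy as np
        from scipy.signal import butter, hilbert, sosfilt
        from sklearn.neural_network import MLPRegressor

        OCTAVE_CENTERS = [125, 250, 500, 1000, 2000, 4000, 8000]  # Hz
        MOD_FREQS = [0.63, 1.25, 2.5, 5.0, 10.0]                  # Hz (subset)

        def band_envelope(x, fs, f0):
            """Octave-band filter followed by a Hilbert envelope."""
            lo, hi = f0 / np.sqrt(2), min(f0 * np.sqrt(2), fs / 2 - 1)
            sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
            return np.abs(hilbert(sosfilt(sos, x)))

        def modulation_features(x, fs):
            """Envelope-spectrum energy at a few modulation frequencies per band."""
            feats = []
            for f0 in OCTAVE_CENTERS:
                env = band_envelope(x, fs, f0)
                env = env - env.mean()
                spec = np.abs(np.fft.rfft(env)) / len(env)
                freqs = np.fft.rfftfreq(len(env), 1.0 / fs)
                feats += [spec[np.argmin(np.abs(freqs - fm))] for fm in MOD_FREQS]
            return np.asarray(feats)

        # Training pairs: received speech through channels with known (measured) STIs.
        # X = np.vstack([modulation_features(x, fs) for x in received_signals])
        # y = np.asarray(channel_stis)
        # net = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000).fit(X, y)
        # sti_estimate = net.predict(modulation_features(new_speech, fs)[None, :])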

  2. The Effectiveness of Clear Speech as a Masker

    ERIC Educational Resources Information Center

    Calandruccio, Lauren; Van Engen, Kristin; Dhar, Sumitrajit; Bradlow, Ann R.

    2010-01-01

    Purpose: It is established that speaking clearly is an effective means of enhancing intelligibility. Because any signal-processing scheme modeled after known acoustic-phonetic features of clear speech will likely affect both target and competing speech, it is important to understand how speech recognition is affected when a competing speech signal…

  3. Overview of speech technology of the 80's

    SciTech Connect

    Crook, S.B.

    1981-01-01

    The author describes the technology innovations necessary to meet the market need that is driving the push toward greater perceived computer intelligence, and discusses aspects of both speech synthesis and speech recognition.

  4. Speech & Language Therapy for Children and Adolescents with Down Syndrome

    MedlinePlus

    Parent-oriented resource listing on speech and language therapy for children and adolescents with Down syndrome. Recoverable citations: "… to Better Speech for Children with Down Syndrome," Blueberry Shoes Productions (2005); "Try Reading Again: How to …"; "… Did You Say? A Guide to Speech Intelligibility," Blueberry Shoes Productions (2006).

  5. Determining the threshold for usable speech within co-channel speech with the SPHINX automated speech recognition system

    NASA Astrophysics Data System (ADS)

    Hicks, William T.; Yantorno, Robert E.

    2004-10-01

    Much research has been and is continuing to be done in the area of separating the original utterances of two speakers from co-channel speech. This is very important in the area of automated speech recognition (ASR), where the current state of technology is not nearly as accurate as human listeners when the speech is co-channel. It is desired to determine for what types of speech (voiced, unvoiced, and silence) and at what target-to-interference ratio (TIR) two speakers can speak at the same time without reducing the speech intelligibility of the target speaker (referred to as usable speech). Knowing which segments of co-channel speech are usable in ASR can be used to improve the reconstruction of single-speaker speech. Tests were performed using the SPHINX ASR software and the TIDIGITS database. It was found that interfering voiced speech with a TIR of 6 dB or greater (on a per-frame basis) did not significantly reduce the intelligibility of the target speaker in co-channel speech. It was further found that interfering unvoiced speech with a TIR of 18 dB or greater (on a per-frame basis) did not significantly reduce the intelligibility of the target speaker in co-channel speech.
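
    A minimal sketch of applying the reported per-frame thresholds to flag usable frames, assuming access to the separate target and interferer tracks before mixing and to a per-frame voicing decision for the interferer; the frame sizes here are assumptions.

        # Label co-channel frames as "usable" per the reported TIR thresholds:
        # >= 6 dB for voiced interference, >= 18 dB for unvoiced interference.
        import numpy as np

        def frame_db(x, frame_len, hop):
            """Per-frame power in dB."""
            n = 1 + (len(x) - frame_len) // hop
            idx = np.arange(n)[:, None] * hop + np.arange(frame_len)[None, :]
            power = np.mean(x[idx] ** 2, axis=1)
            return 10.0 * np.log10(power + 1e-12)

        def usable_mask(target, interferer, interferer_voiced, frame_len=400, hop=160):
            """interferer_voiced: per-frame booleans (True where interferer is voiced)."""
            tir = frame_db(target, frame_len, hop) - frame_db(interferer, frame_len, hop)
            threshold = np.where(interferer_voiced, 6.0, 18.0)
            return tir >= threshold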

  6. The neural bases of difficult speech comprehension and speech production: Two Activation Likelihood Estimation (ALE) meta-analyses.

    PubMed

    Adank, Patti

    2012-07-01

    The role of speech production mechanisms in difficult speech comprehension is the subject of on-going debate in speech science. Two Activation Likelihood Estimation (ALE) analyses were conducted on neuroimaging studies investigating difficult speech comprehension or speech production. Meta-analysis 1 included 10 studies contrasting comprehension of less intelligible/distorted speech with more intelligible speech. Meta-analysis 2 (21 studies) identified areas associated with speech production. The results indicate that difficult comprehension involves, first, increased reliance on cortical regions in which comprehension and production overlapped (bilateral anterior Superior Temporal Sulcus (STS) and anterior Supplementary Motor Area (pre-SMA)) and on an area associated with intelligibility processing (left posterior MTG), and, second, increased reliance on cortical areas associated with general executive processes (bilateral anterior insulae). Comprehension of distorted speech may be supported by a hybrid neural mechanism combining increased involvement of areas associated with general executive processing and areas shared between comprehension and production. PMID:22633697

  7. Alternative Speech Communication System for Persons with Severe Speech Disorders

    NASA Astrophysics Data System (ADS)

    Selouani, Sid-Ahmed; Sidi Yakoub, Mohammed; O'Shaughnessy, Douglas

    2009-12-01

    Assistive speech-enabled systems are proposed to help both French- and English-speaking persons with various speech disorders. The proposed assistive systems use automatic speech recognition (ASR) and speech synthesis in order to enhance the quality of communication. These systems aim at improving the intelligibility of pathologic speech, making it as natural as possible and close to the original voice of the speaker. The resynthesized utterances use new basic units, a new concatenation algorithm and a grafting technique to correct poorly pronounced phonemes. The ASR responses are uttered by the new speech synthesis system in order to convey an intelligible message to listeners. Experiments involving four American speakers with severe dysarthria and two Acadian French speakers with sound substitution disorders (SSDs) are carried out to demonstrate the efficiency of the proposed methods. Improvements in the Perceptual Evaluation of Speech Quality (PESQ) value of 5% and of more than 20% are achieved by the speech synthesis systems that deal with SSDs and dysarthria, respectively.

  8. Single Word and Sentence Intelligibility in Children with Cochlear Implants

    ERIC Educational Resources Information Center

    Khwaileh, Fadwa A.; Flipsen, Peter, Jr.

    2010-01-01

    This study examined the intelligibility of speech produced by 17 children (aged 4-11 years) with cochlear implants. Stimulus items included sentences from the Beginners' Intelligibility Test (BIT) and words from the Children Speech Intelligibility Measure (CSIM). Naive listeners responded by writing sentences heard or with two types of responses…

  9. Intelligibility and the Listener: The Role of Lexical Stress

    ERIC Educational Resources Information Center

    Field, John

    2005-01-01

    For some 30 years, intelligibility has been recognized as an appropriate goal for pronunciation instruction, yet remarkably little is known about the factors that make a language learner's speech intelligible. Studies have traced correlations between features of nonnative speech and native speakers' intelligibility judgements. They have tended to…

  10. Somatosensory basis of speech production.

    PubMed

    Tremblay, Stéphanie; Shiller, Douglas M; Ostry, David J

    2003-06-19

    The hypothesis that speech goals are defined acoustically and maintained by auditory feedback is a central idea in speech production research. An alternative proposal is that speech production is organized in terms of control signals that subserve movements and associated vocal-tract configurations. Indeed, the capacity for intelligible speech by deaf speakers suggests that somatosensory inputs related to movement play a role in speech production, but studies that might have documented a somatosensory component have been equivocal. For example, mechanical perturbations that have altered somatosensory feedback have simultaneously altered acoustics. Hence, any adaptation observed under these conditions may have been a consequence of acoustic change. Here we show that somatosensory information on its own is fundamental to the achievement of speech movements. This demonstration involves a dissociation of somatosensory and auditory feedback during speech production. Over time, subjects correct for the effects of a complex mechanical load that alters jaw movements (and hence somatosensory feedback), but which has no measurable or perceptible effect on acoustic output. The findings indicate that the positions of speech articulators and associated somatosensory inputs constitute a goal of speech movements that is wholly separate from the sounds produced. PMID:12815431

  11. Speech Rates of Turkish Prelingually Hearing-Impaired Children

    ERIC Educational Resources Information Center

    Girgin, M. Cem

    2008-01-01

    The aim of training children with hearing impairment in the auditory oral approach is to develop good speaking abilities. However, children with profound hearing-impairment show a wide range of spoken language abilities, some having highly intelligible speech while others have unintelligible speech. This is due to errors in speech production.…

  12. Prosodic Features and Speech Naturalness in Individuals with Dysarthria

    ERIC Educational Resources Information Center

    Klopfenstein, Marie I.

    2012-01-01

    Despite the importance of speech naturalness to treatment outcomes, little research has been done on what constitutes speech naturalness and how to best maximize naturalness in relationship to other treatment goals like intelligibility. In addition, previous literature alludes to the relationship between prosodic aspects of speech and speech…

  13. Effect of Fundamental Frequency on Judgments of Electrolaryngeal Speech

    ERIC Educational Resources Information Center

    Nagle, Kathy F.; Eadie, Tanya L.; Wright, Derek R.; Sumida, Yumi A.

    2012-01-01

    Purpose: To determine (a) the effect of fundamental frequency (f0) on speech intelligibility, acceptability, and perceived gender in electrolaryngeal (EL) speakers, and (b) the effect of known gender on speech acceptability in EL speakers. Method: A 2-part study was conducted. In Part 1, 34 healthy adults provided speech recordings using…

  14. Perception of Synthetic and Natural Speech by Adults with Visual Impairments

    ERIC Educational Resources Information Center

    Papadopoulos, Konstantinos; Koutsoklenis, Athanasios; Katemidou, Evangelia; Okalidou, Areti

    2009-01-01

    This study investigated the intelligibility and comprehensibility of natural speech in comparison to synthetic speech. The results demonstrate the type of errors; the relationship between intelligibility and comprehensibility; and the correlation between intelligibility and comprehensibility and key factors, such as the frequency of use of…

  15. Production and perception of clear speech

    NASA Astrophysics Data System (ADS)

    Bradlow, Ann R.

    2003-04-01

    When a talker believes that the listener is likely to have speech perception difficulties due to a hearing loss, background noise, or a different native language, she or he will typically adopt a clear speaking style. Previous research has established that, with a simple set of instructions to the talker, "clear speech" can be produced by most talkers under laboratory recording conditions. Furthermore, there is reliable evidence that adult listeners with either impaired or normal hearing typically find clear speech more intelligible than conversational speech. Since clear speech production involves listener-oriented articulatory adjustments, a careful examination of the acoustic-phonetic and perceptual consequences of the conversational-to-clear speech transformation can serve as an effective window into talker- and listener-related forces in speech communication. Furthermore, clear speech research has considerable potential for the development of speech enhancement techniques. After reviewing previous and current work on the acoustic properties of clear versus conversational speech, this talk will present recent data from a cross-linguistic study of vowel production in clear speech and a cross-population study of clear speech perception. Findings from these studies contribute to an evolving view of clear speech production and perception as reflecting both universal, auditory and language-specific, phonological contrast enhancement features.

  16. Perceptual Learning of Interrupted Speech

    PubMed Central

    Benard, Michel Ruben; Başkent, Deniz

    2013-01-01

    The intelligibility of periodically interrupted speech improves once the silent gaps are filled with noise bursts. This improvement has been attributed to phonemic restoration, a top-down repair mechanism that helps the intelligibility of degraded speech in daily life. Two hypotheses were investigated using perceptual learning of interrupted speech. If different cognitive processes played a role in restoring interrupted speech with and without filler noise, the two forms of speech would be learned at different rates and with different perceived mental effort. If the restoration benefit were an artificial outcome of using the ecologically invalid stimulus of speech with silent gaps, this benefit would diminish with training. Two groups of normal-hearing listeners were trained, one with interrupted sentences with the filler noise and the other without. Feedback was provided with the auditory playback of the unprocessed and processed sentences, as well as the visual display of the sentence text. Training increased overall performance significantly; however, the restoration benefit did not diminish. The increase in intelligibility and the decrease in perceived mental effort were relatively similar between the groups, implying similar cognitive mechanisms for the restoration of the two types of interruptions. Training effects were generalizable, as both groups also improved with the form of speech they had not been trained on, and the effects were retained. Due to the null results and the relatively small number of participants (10 per group), further research is needed to draw conclusions more confidently. Nevertheless, training with interrupted speech seems to be effective, stimulating participants to use top-down restoration more actively and efficiently. This finding further implies the potential of this training approach as a rehabilitative tool for hearing-impaired and elderly populations. PMID:23469266
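
    A minimal sketch of generating the two stimulus types described above; the interruption rate, duty cycle, and filler-noise level are assumptions, not the study's parameters.

        # Periodically interrupt speech, leaving silent gaps or filling them
        # with noise bursts scaled relative to the speech RMS.
        import numpy as np

        def interrupt(x, fs, rate_hz=1.5, duty=0.5, fill_noise=False, noise_gain_db=0.0):
            t = np.arange(len(x)) / fs
            keep = (t * rate_hz) % 1.0 < duty              # square-wave gating mask
            y = np.where(keep, x, 0.0)                     # silent-gap version
            if fill_noise:
                rms = np.sqrt(np.mean(x ** 2))
                noise = np.random.randn(len(x)) * rms * 10 ** (noise_gain_db / 20.0)
                y = np.where(keep, y, noise)               # noise-filled version
            return y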

  17. Speech Development

    MedlinePlus

    Overview of speech development resources. Recoverable citations: Bzoch (1997), Cleft Palate Speech Management: A Multidisciplinary Approach; Shprintzen, Bardach (1995), Cleft Palate: …

  18. Speech Problems

    MedlinePlus

    Truncated consumer-health overview of common speech disorders and conditions that affect a person's ability to speak clearly. Notes that stuttering interferes with fluent speech and that a person who stutters has trouble getting words out, and distinguishes speech disorders such as stuttering from language disorders.

  19. Investigation of the optimum acoustical conditions for speech using auralization

    NASA Astrophysics Data System (ADS)

    Yang, Wonyoung; Hodgson, Murray

    2001-05-01

    Speech intelligibility is mainly affected by reverberation and by signal-to-noise level difference, the difference between the speech-signal and background-noise levels at a receiver. An important question for the design of rooms for speech (e.g., classrooms) is, what are the optimal values of these factors? This question has been studied experimentally and theoretically. Experimental studies found zero optimal reverberation time, but theoretical predictions found nonzero reverberation times. These contradictory results are partly caused by the different ways of accounting for background noise. Background noise sources and their locations inside the room are the most detrimental factors in speech intelligibility. However, noise levels also interact with reverberation in rooms. In this project, two major room-acoustical factors for speech intelligibility were controlled using speech and noise sources of known relative output levels located in a virtual room with known reverberation. Speech intelligibility test signals were played in the virtual room and auralized for listeners. The Modified Rhyme Test (MRT) and babble noise were used to measure subjective speech intelligibility quality. Optimal reverberation times, and the optimal values of other speech intelligibility metrics, for normal-hearing people and for hard-of-hearing people, were identified and compared.
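
    One metric that folds both factors into a single number is the useful-to-detrimental ratio U50, the early-arriving speech energy over the late-arriving energy plus noise; the sketch below is offered only as an illustration of this family of metrics and is not necessarily the measure used in the study.

        # U50 from a room impulse response: energy in the first 50 ms ("useful")
        # over later energy plus noise energy ("detrimental").
        import numpy as np

        def u50(h, fs, noise_energy_ratio=0.0):
            """noise_energy_ratio: noise energy as a fraction of total speech energy."""
            k = int(0.050 * fs)                 # 50 ms early/late split point
            early = np.sum(h[:k] ** 2)
            late = np.sum(h[k:] ** 2)
            noise = noise_energy_ratio * np.sum(h ** 2)
            return 10.0 * np.log10(early / (late + noise + 1e-300))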

  20. VISIBLE SPEECH.

    ERIC Educational Resources Information Center

    POTTER, RALPH K.; AND OTHERS

    A CORRECTED REPUBLICATION OF THE 1947 EDITION, THE BOOK DESCRIBES A FORM OF VISIBLE SPEECH OBTAINED BY THE RECORDING OF AN ANALYSIS OF SPEECH SOMEWHAT SIMILAR TO THE ANALYSIS PERFORMED BY THE EAR. ORIGINALLY INTENDED TO PRESENT AN EXPERIMENTAL TRAINING PROGRAM IN THE READING OF VISIBLE SPEECH AND EXPANDED TO INCLUDE MATERIAL OF INTEREST TO VARIOUS…

  1. The design of a device for hearer and feeler differentiation, part A. [speech modulated hearing device

    NASA Technical Reports Server (NTRS)

    Creecy, R.

    1974-01-01

    A speech-modulated white-noise device is reported that conveys the rhythmic characteristics of a speech signal for intelligible reception by deaf persons. The signal is composed of random amplitudes and frequencies modulated by the speech envelope characteristics of rhythm and stress. Time-intensity parameters of speech are conveyed through vibro-tactile stimulation.

  2. Intensive Speech and Language Therapy for Older Children with Cerebral Palsy: A Systems Approach

    ERIC Educational Resources Information Center

    Pennington, Lindsay; Miller, Nick; Robson, Sheila; Steen, Nick

    2010-01-01

    Aim: To investigate whether speech therapy using a speech systems approach to controlling breath support, phonation, and speech rate can increase the speech intelligibility of children with dysarthria and cerebral palsy (CP). Method: Sixteen children with dysarthria and CP participated in a modified time series design. Group characteristics were…

  3. Working Papers in Speech-Language Pathology and Audiology. Volume XII.

    ERIC Educational Resources Information Center

    City Univ. of New York, Flushing. Queens Coll. Dept. of Communication Arts and Sciences.

    Seven papers report on speech-language pathology and audiology studies performed by graduate students. The first paper reports on the intelligibility of two popular synthetic speech systems used in communication aids for the speech impaired, the Votrax Personal Speech System and the Echo II synthesizer. The second paper reports facilitation of tense…

  4. Got EQ?: Increasing Cultural and Clinical Competence through Emotional Intelligence

    ERIC Educational Resources Information Center

    Robertson, Shari A.

    2007-01-01

    Cultural intelligence has been described across three parameters of human behavior: cognitive intelligence, emotional intelligence (EQ), and physical intelligence. Each contributes a unique and important perspective to the ability of speech-language pathologists and audiologists to provide benefits to their clients regardless of cultural…

  5. [Improving speech comprehension using a new cochlear implant speech processor].

    PubMed

    Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A

    2009-06-01

    The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually, profoundly deaf, experienced CI users who scored at least 80% correct on the Oldenburg sentence test (OLSA) in quiet with their current speech processor and who were able to perform adaptive speech threshold testing using the OLSA in noise. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise. In contrast, the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a sufficiently sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significantly improved signal-to-noise ratio for speech comprehension thresholds (i.e., the signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg
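
    Adaptive speech-in-noise tests of the kind mentioned above steer the SNR toward the 50%-correct point trial by trial. A deliberately simplified single-step sketch with an illustrative step size follows; the OLSA's actual adaptive rule is more refined.

        # One step of a simple adaptive SNR track converging on 50% intelligibility:
        # make the next trial harder after a good one, easier after a poor one.
        def next_snr(snr_db, words_correct, words_total, step_db=2.0):
            harder = words_correct / words_total > 0.5
            return snr_db - step_db if harder else snr_db + step_db

        # e.g. 4 of 5 words correct at 0 dB SNR -> next sentence at -2 dB:
        print(next_snr(0.0, 4, 5))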

  6. Between-Word Simplification Patterns in the Continuous Speech of Children with Speech Sound Disorders

    ERIC Educational Resources Information Center

    Klein, Harriet B.; Liu-Shea, May

    2009-01-01

    Purpose: This study was designed to identify and describe between-word simplification patterns in the continuous speech of children with speech sound disorders. It was hypothesized that word combinations would reveal phonological changes that were unobserved with single words, possibly accounting for discrepancies between the intelligibility of…

  7. Dramatic Effects of Speech Task on Motor and Linguistic Planning in Severely Dysfluent Parkinsonian Speech

    ERIC Educational Resources Information Center

    Van Lancker Sidtis, Diana; Cameron, Krista; Sidtis, John J.

    2012-01-01

    In motor speech disorders, dysarthric features impacting intelligibility, articulation, fluency and voice emerge more saliently in conversation than in repetition, reading or singing. A role of the basal ganglia in these task discrepancies has been identified. Further, more recent studies of naturalistic speech in basal ganglia dysfunction have…

  8. Breath-Group Intelligibility in Dysarthria: Characteristics and Underlying Correlates

    ERIC Educational Resources Information Center

    Yunusova, Yana; Weismer, Gary; Kent, Ray D.; Rusche, Nicole M.

    2005-01-01

    Purpose: This study was designed to determine whether within-speaker fluctuations in speech intelligibility occurred among speakers with dysarthria who produced a reading passage, and, if they did, whether selected linguistic and acoustic variables predicted the variations in speech intelligibility. Method: Participants with dysarthria included a…

  9. Effects of Speaking Task on Intelligibility in Parkinson's Disease

    ERIC Educational Resources Information Center

    Tjaden, Kris; Wilding, Greg

    2011-01-01

    Intelligibility tests for dysarthria typically provide an estimate of overall severity for speech materials elicited through imitation or read from a printed script. The extent to which these types of tasks and procedures reflect intelligibility for extemporaneous speech is not well understood. The purpose of this study was to compare…

  10. Speech levels in meeting rooms and the probability of speech privacy problems.

    PubMed

    Bradley, J S; Gover, B N

    2010-02-01

    Speech levels were measured in a large number of meetings and meeting rooms to better understand their influence on the speech privacy of closed meeting rooms. The effects of room size and number of occupants on average speech levels, for meetings with and without sound amplification, were investigated. The characteristics of the statistical variations of speech levels were determined in terms of speech levels measured over 10 s intervals at locations inside, but near the periphery of the meeting rooms. A procedure for predicting the probability of speech being audible or intelligible at points outside meeting rooms is proposed. It is based on the statistics of meeting room speech levels, in combination with the sound insulation characteristics of the room and the ambient noise levels at locations outside the room. PMID:20136204
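
    A minimal sketch of this style of prediction, assuming the 10 s speech levels are approximately normal and that the room's sound insulation can be summarized as a single level difference; all numbers are illustrative.

        # Probability that transmitted meeting speech rises above the ambient
        # noise outside the room, from the statistics of 10 s speech levels inside.
        from statistics import NormalDist

        def p_audible(mean_speech_db, sd_speech_db, level_difference_db,
                      ambient_db, margin_db=0.0):
            received = NormalDist(mean_speech_db - level_difference_db, sd_speech_db)
            return 1.0 - received.cdf(ambient_db + margin_db)

        # e.g. 60 dB mean speech (sd 6 dB), a 35 dB wall, 30 dB ambient noise:
        print(round(p_audible(60, 6, 35, 30), 3))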

  11. Production and perception of clear speech in Croatian and English

    NASA Astrophysics Data System (ADS)

    Smiljanić, Rajka; Bradlow, Ann R.

    2005-09-01

    Previous research has established that naturally produced English clear speech is more intelligible than English conversational speech. The major goal of this paper was to establish the presence of the clear speech effect in production and perception of a language other than English, namely Croatian. A systematic investigation of the conversational-to-clear speech transformations across languages with different phonological properties (e.g., large versus small vowel inventory) can provide a window into the interaction of general auditory-perceptual and phonological, structural factors that contribute to the high intelligibility of clear speech. The results of this study showed that naturally produced clear speech is a distinct, listener-oriented, intelligibility-enhancing mode of speech production in both languages. Furthermore, the acoustic-phonetic features of the conversational-to-clear speech transformation revealed cross-language similarities in clear speech production strategies. In both languages, talkers exhibited a decrease in speaking rate and an increase in pitch range, as well as an expansion of the vowel space. Notably, the findings of this study showed equivalent vowel space expansion in English and Croatian clear speech, despite the difference in vowel inventory size across the two languages, suggesting that the extent of vowel contrast enhancement in hyperarticulated clear speech is independent of vowel inventory size.
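
    Vowel-space expansion of the kind reported here is commonly quantified as the area of the polygon spanned by mean corner-vowel formants. A minimal sketch using the shoelace formula; the corner vowels and formant values are illustrative, not the study's data.

        # Area of the vowel space polygon from (F1, F2) vertices in polygon order.
        def vowel_space_area(formants):
            n = len(formants)
            twice_area = 0.0
            for i in range(n):
                x1, y1 = formants[i]
                x2, y2 = formants[(i + 1) % n]
                twice_area += x1 * y2 - x2 * y1   # shoelace cross-product term
            return abs(twice_area) / 2.0

        # e.g. an /i a u/ vowel triangle, in Hz^2:
        print(vowel_space_area([(300, 2300), (750, 1300), (320, 900)]))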

  12. Auditory free classification of nonnative speech.

    PubMed

    Atagi, Eriko; Bent, Tessa

    2013-11-01

    Through experience with speech variability, listeners build categories of indexical speech characteristics including categories for talker, gender, and dialect. The auditory free classification task, a task in which listeners freely group talkers based on audio samples, has been a useful tool for examining listeners' representations of some of these characteristics including regional dialects and different languages. The free classification task was employed in the current study to examine the perceptual representation of nonnative speech. The category structure and salient perceptual dimensions of nonnative speech were investigated from two perspectives: general similarity and perceived native language background. Talker intelligibility and whether native talkers were included were manipulated to test stimulus set effects. Results showed that degree of accent was a highly salient feature of nonnative speech for classification based on general similarity and on perceived native language background. This salience, however, was attenuated when listeners were listening to highly intelligible stimuli and attending to the talkers' native language backgrounds. These results suggest that the context in which nonnative speech stimuli are presented, such as the listeners' attention to the talkers' native language and the variability of stimulus intelligibility, can influence listeners' perceptual organization of nonnative speech. PMID:24363470

  13. Perceptual evaluation of motor speech following treatment for childhood cerebellar tumour.

    PubMed

    Cornwell, Petrea L; Murdoch, Bruce E; Ward, Elizabeth C; Kellie, Stewart

    2003-12-01

    The speech characteristics, oromotor function and speech intelligibility of a group of children treated for cerebellar tumour (CT) were investigated perceptually. Assessments of these areas were performed on 11 children treated for CT who had dysarthric speech, as well as on 21 non-neurologically impaired controls matched for age and sex, to obtain a comprehensive perceptual profile of their speech and oromotor mechanism. Contributing to the perception of dysarthria were a number of deviant speech dimensions, including imprecision of consonants, hoarseness and decreased pitch variation, as well as a reduction in overall speech intelligibility for both sentences and connected speech. Oromotor assessment revealed deficits in lip, tongue and laryngeal function, particularly relating to deficits in the timing and coordination of movements. The most salient features of the dysarthria seen in children treated for CT were the mild nature of the speech disorder and the clustering of speech deficits in the prosodic, phonatory and articulatory aspects of speech production. PMID:14977025

  14. Speech Communication.

    ERIC Educational Resources Information Center

    Anderson, Betty

    The communications approach to teaching speech to high school students views speech as the study of the communication process in order to develop an awareness of and a sensitivity to the variables that affect human interaction. In using this approach the student is encouraged to try out as many types of messages using as many techniques and…

  15. Speech Aids

    NASA Technical Reports Server (NTRS)

    1987-01-01

    Designed to assist deaf and hearing-impaired persons in achieving better speech, Resnick Worldwide Inc.'s device provides a visual means of cuing the deaf as a speech-improvement measure. This is done by electronically processing the subjects' sounds and comparing them with optimum values, which are displayed for comparison.

  16. Symbolic Speech

    ERIC Educational Resources Information Center

    Podgor, Ellen S.

    1976-01-01

    The concept of symbolic speech emanates from the 1968 case of United States v. O'Brien. These discussions of flag desecration, grooming and dress codes, nude entertainment, buttons and badges, and musical expression show that the courts place symbolic speech in a different stratum from verbal communication. (LBH)

  17. Rate dependent speech processing can be speech specific: Evidence from the perceptual disappearance of words under changes in context speech rate.

    PubMed

    Pitt, Mark A; Szostak, Christine; Dilley, Laura C

    2016-01-01

    The perception of reduced syllables, including function words, produced in casual speech can be made to disappear by slowing the rate at which surrounding words are spoken (Dilley & Pitt, 2010, Psychological Science, 21(11), 1664-1670, doi: 10.1177/0956797610384743). The current study explored the domain generality of this speech-rate effect, asking whether it is induced by temporal information found only in speech. Stimuli were short word sequences (e.g., minor or child) appended to precursors that were clear speech, degraded speech (low-pass filtered or sinewave), or tone sequences, presented at a spoken rate and a slowed rate. Across three experiments, only precursors heard as intelligible speech generated a speech-rate effect (fewer reports of function words with a slowed context), suggesting that rate-dependent speech processing can be domain specific. PMID:26392395

  18. Towards A Clinical Tool For Automatic Intelligibility Assessment.

    PubMed

    Berisha, Visar; Utianski, Rene; Liss, Julie

    2013-01-01

    An important, yet under-explored, problem in speech processing is the automatic assessment of intelligibility for pathological speech. In practice, intelligibility assessment is often done through subjective tests administered by speech pathologists; however, research has shown that these tests are inconsistent, costly, and exhibit poor reliability. Although some automatic methods for intelligibility assessment exist for telecommunications, research specific to pathological speech has been limited. Here, we propose an algorithm that captures important multi-scale perceptual cues shown to correlate well with intelligibility. Nonlinear classifiers are trained at each time scale, and a final intelligibility decision is made using ensemble learning methods from machine learning. Preliminary results indicate a marked improvement in intelligibility assessment over published baseline results. PMID:25004985
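
    A rough sketch of the multi-scale ensemble idea described in this record appears below: one nonlinear classifier is trained per time scale and their outputs are fused by averaging class probabilities. The scale names, feature matrices and random data are hypothetical placeholders, not the authors' actual perceptual cues.

        # Ensemble over time scales, assuming placeholder features and labels.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        rng = np.random.default_rng(0)

        # Hypothetical per-scale feature matrices (n_utterances x n_features),
        # standing in for phoneme-, syllable- and phrase-level perceptual cues.
        scales = {
            "phoneme":  rng.normal(size=(200, 12)),
            "syllable": rng.normal(size=(200, 8)),
            "phrase":   rng.normal(size=(200, 5)),
        }
        labels = rng.integers(0, 2, size=200)  # 1 = intelligible, 0 = not

        # One nonlinear classifier per time scale.
        models = {name: RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
                  for name, X in scales.items()}

        # Ensemble decision: average the per-scale class probabilities.
        probs = np.mean([models[name].predict_proba(X)[:, 1]
                         for name, X in scales.items()], axis=0)
        decision = (probs > 0.5).astype(int)
        print("ensemble agreement with labels:", np.mean(decision == labels))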

  19. Towards A Clinical Tool For Automatic Intelligibility Assessment

    PubMed Central

    Berisha, Visar; Utianski, Rene; Liss, Julie

    2014-01-01

    An important, yet under-explored, problem in speech processing is the automatic assessment of intelligibility for pathological speech. In practice, intelligibility assessment is often done through subjective tests administered by speech pathologists; however, research has shown that these tests are inconsistent, costly, and exhibit poor reliability. Although some automatic methods for intelligibility assessment exist for telecommunications, research specific to pathological speech has been limited. Here, we propose an algorithm that captures important multi-scale perceptual cues shown to correlate well with intelligibility. Nonlinear classifiers are trained at each time scale, and a final intelligibility decision is made using ensemble learning methods from machine learning. Preliminary results indicate a marked improvement in intelligibility assessment over published baseline results. PMID:25004985

  20. Measures to Evaluate the Effects of DBS on Speech Production

    PubMed Central

    Weismer, Gary; Yunusova, Yana; Bunton, Kate

    2011-01-01

    The purpose of this paper is to review and evaluate measures of speech production that could be used to document effects of Deep Brain Stimulation (DBS) on speech performance, especially in persons with Parkinson disease (PD). A small set of evaluative criteria for these measures is presented first, followed by consideration of several speech physiology and speech acoustic measures that have been studied frequently and reported on in the literature on normal speech production, and speech production affected by neuromotor disorders (dysarthria). Each measure is reviewed and evaluated against the evaluative criteria. Embedded within this review and evaluation is a presentation of new data relating speech motions to speech intelligibility measures in speakers with PD, amyotrophic lateral sclerosis (ALS), and control speakers (CS). These data are used to support the conclusion that at the present time the slope of second formant transitions (F2 slope), an acoustic measure, is well suited to make inferences to speech motion and to predict speech intelligibility. The use of other measures should not be ruled out, however, and we encourage further development of evaluative criteria for speech measures designed to probe the effects of DBS or any treatment with potential effects on speech production and communication skills. PMID:24932066

  1. Using on-line altered auditory feedback treating Parkinsonian speech

    NASA Astrophysics Data System (ADS)

    Wang, Emily; Verhagen, Leo; de Vries, Meinou H.

    2005-09-01

    Patients with advanced Parkinson's disease tend to have dysarthric speech that is hesitant, accelerated, and repetitive, and that is often resistant to behavioral speech therapy. In this pilot study, the speech disturbances were treated using on-line altered auditory feedback (AF) provided by SpeechEasy (SE), an in-the-ear device registered with the FDA for use in humans to treat chronic stuttering. Eight PD patients participated in the study. All had moderate to severe speech disturbances. In addition, two patients had moderate recurring stuttering at the onset of PD after long remission since adolescence, two had bilateral STN DBS, and two had bilateral pallidal DBS. An effective combination of delayed auditory feedback and frequency-altered feedback was selected for each subject and provided via SE worn in one ear. All subjects produced speech samples (structured monologue and reading) under three conditions: baseline, wearing SE without altered feedback, and wearing SE with altered feedback. The speech samples were randomly presented and rated for speech intelligibility using UPDRS-III item 18 and for speaking rate. The results indicated that SpeechEasy is well tolerated and that AF can improve speech intelligibility in spontaneous speech. Further investigational use of this device for treating speech disorders in PD is warranted. [Work partially supported by Janus Dev. Group, Inc.]
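
    The delayed-feedback half of the altered-feedback combination described above can be simulated offline as a simple delay-and-mix, sketched below. The 60-ms delay and unity gain are illustrative, not the study's settings; the frequency-altered component and real-time processing of the SE device are omitted.

        # Offline delayed auditory feedback: mix the signal with a delayed copy.
        import numpy as np

        def delayed_feedback(x, fs, delay_ms=60.0, gain=1.0):
            """Mix the input with a copy delayed by delay_ms milliseconds."""
            d = int(round(fs * delay_ms / 1000.0))
            delayed = np.concatenate([np.zeros(d), x])[:len(x)]
            return x + gain * delayed

        fs = 16000
        t = np.arange(fs) / fs
        # toy "speech": a 150-Hz carrier with a slow amplitude modulation
        talker = np.sin(2 * np.pi * 150 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 3 * t))
        heard = delayed_feedback(talker, fs, delay_ms=60.0)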

  2. An integrated approach to improving noisy speech perception

    NASA Astrophysics Data System (ADS)

    Koval, Serguei; Stolbov, Mikhail; Smirnova, Natalia; Khitrov, Mikhail

    2002-05-01

    For a number of practical purposes and tasks, experts have to decode speech recordings of very poor quality. A combination of techniques is proposed to improve the intelligibility and quality of distorted speech messages and thus facilitate their comprehension. Along with the application of noise cancellation and speech signal enhancement techniques that remove or reduce various kinds of distortion and interference (primarily unmasking and normalization in the time and frequency domains), the approach incorporates optimal listener expert tactics based on selective listening, nonstandard binaural listening, accounting for short-term and long-term human ear adaptation to noisy speech, as well as some methods of speech signal enhancement to support speech decoding during listening. The approach integrating the suggested techniques ensures high-quality final results and has been successfully applied by Speech Technology Center experts and by numerous other users, mainly forensic institutions, to decode noisy speech recordings for courts, law enforcement and emergency services, accident investigation bodies, etc.

  3. Accuracy of Repetition of Digitized and Synthesized Speech for Young Children in Background Noise

    ERIC Educational Resources Information Center

    Drager, Kathryn D. R.; Clark-Serpentine, Elizabeth A.; Johnson, Kate E.; Roeser, Jennifer L.

    2006-01-01

    Purpose: The present study investigated the intelligibility of digitized and synthesized speech output in background noise for children 3-5 years old. The purpose of the study was to determine whether there was a difference in the intelligibility (ability to repeat) of 3 types of speech output (digitized, DECTalk synthesized, and MacinTalk…

  4. Across-formant integration and speech intelligibility: Effects of acoustic source properties in the presence and absence of a contralateral interferer.

    PubMed

    Summers, Robert J; Bailey, Peter J; Roberts, Brian

    2016-08-01

    The role of source properties in across-formant integration was explored using three-formant (F1+F2+F3) analogues of natural sentences (targets). In experiment 1, F1+F3 were harmonic analogues (H1+H3) generated using a monotonous buzz source and second-order resonators; in experiment 2, F1+F3 were tonal analogues (T1+T3). F2 could take either form (H2 or T2). Target formants were always presented monaurally; the receiving ear was assigned randomly on each trial. In some conditions, only the target was present; in others, a competitor for F2 (F2C) was presented contralaterally. Buzz-excited or tonal competitors were created using the time-reversed frequency and amplitude contours of F2. Listeners must reject F2C to optimize keyword recognition. Whether or not a competitor was present, there was no effect of source mismatch between F1+F3 and F2. The impact of adding F2C was modest when it was tonal but large when it was harmonic, irrespective of whether F2C matched F1+F3. This pattern was maintained when harmonic and tonal counterparts were loudness-matched (experiment 3). Source type and competition, rather than acoustic similarity, governed the phonetic contribution of a formant. Contrary to earlier research using dichotic targets, requiring across-ear integration to optimize intelligibility, H2C was an equally effective informational masker for H2 as for T2. PMID:27586751

  5. The Amount of English Use: Effects on L2 Speech

    ERIC Educational Resources Information Center

    Vo, Son Ca; Vo, Yen Thi Hoang; Vo, Quyen Thanh

    2014-01-01

    The amount of second language (L2) use has significant influence on native speakers' comprehension of L2 learners' speech. Nonetheless, few empirical studies examine how differences in the amount of language use affect the intelligibility and comprehensibility of nonnative speakers' reading and spontaneous speech. This study aims to…

  6. Linkage of Speech Sound Disorder to Reading Disability Loci

    ERIC Educational Resources Information Center

    Smith, Shelley D.; Pennington, Bruce F.; Boada, Richard; Shriberg, Lawrence D.

    2005-01-01

    Background: Speech sound disorder (SSD) is a common childhood disorder characterized by developmentally inappropriate errors in speech production that greatly reduce intelligibility. SSD has been found to be associated with later reading disability (RD), and there is also evidence for both a cognitive and etiological overlap between the two…

  7. Reconstructing speech from human auditory cortex.

    PubMed

    Pasley, Brian N; David, Stephen V; Mesgarani, Nima; Flinker, Adeen; Shamma, Shihab A; Crone, Nathan E; Knight, Robert T; Chang, Edward F

    2012-01-01

    How the human auditory system extracts perceptually relevant acoustic features of speech is unknown. To address this question, we used intracranial recordings from nonprimary auditory cortex in the human superior temporal gyrus to determine what acoustic information in speech sounds can be reconstructed from population neural activity. We found that slow and intermediate temporal fluctuations, such as those corresponding to syllable rate, were accurately reconstructed using a linear model based on the auditory spectrogram. However, reconstruction of fast temporal fluctuations, such as syllable onsets and offsets, required a nonlinear sound representation based on temporal modulation energy. Reconstruction accuracy was highest within the range of spectro-temporal fluctuations that have been found to be critical for speech intelligibility. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single trial sound presentations. These findings reveal neural encoding mechanisms of speech acoustic parameters in higher order human auditory cortex. PMID:22303281
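
    The linear reconstruction approach mentioned in this record can be illustrated generically: regress a spectrogram channel on simultaneous population activity and test on held-out samples. The sketch below uses ridge regression on synthetic data; real decoders add time-lagged features per electrode, and every name and size here is a hypothetical stand-in, not the study's model.

        # Generic linear stimulus reconstruction on synthetic data.
        import numpy as np
        from sklearn.linear_model import Ridge

        rng = np.random.default_rng(1)
        T, E = 2000, 64                      # time samples x electrodes
        neural = rng.normal(size=(T, E))
        true_w = rng.normal(size=E)          # unknown mixing, for simulation only
        band = neural @ true_w + 0.1 * rng.normal(size=T)  # one spectrogram band

        # Fit on the first half of the recording, evaluate on the second half.
        model = Ridge(alpha=1.0).fit(neural[:1000], band[:1000])
        pred = model.predict(neural[1000:])
        r = np.corrcoef(pred, band[1000:])[0, 1]
        print(f"reconstruction accuracy (correlation): {r:.2f}")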

  8. Reconstructing Speech from Human Auditory Cortex

    PubMed Central

    Pasley, Brian N.; David, Stephen V.; Mesgarani, Nima; Flinker, Adeen; Shamma, Shihab A.; Crone, Nathan E.; Knight, Robert T.; Chang, Edward F.

    2012-01-01

    How the human auditory system extracts perceptually relevant acoustic features of speech is unknown. To address this question, we used intracranial recordings from nonprimary auditory cortex in the human superior temporal gyrus to determine what acoustic information in speech sounds can be reconstructed from population neural activity. We found that slow and intermediate temporal fluctuations, such as those corresponding to syllable rate, were accurately reconstructed using a linear model based on the auditory spectrogram. However, reconstruction of fast temporal fluctuations, such as syllable onsets and offsets, required a nonlinear sound representation based on temporal modulation energy. Reconstruction accuracy was highest within the range of spectro-temporal fluctuations that have been found to be critical for speech intelligibility. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single trial sound presentations. These findings reveal neural encoding mechanisms of speech acoustic parameters in higher order human auditory cortex. PMID:22303281

  9. Speech Enhancement based on Compressive Sensing Algorithm

    NASA Astrophysics Data System (ADS)

    Sulong, Amart; Gunawan, Teddy S.; Khalifa, Othman O.; Chebil, Jalel

    2013-12-01

    Various methods of speech enhancement have been proposed over the years, with design efforts focusing mainly on quality and intelligibility. This paper proposes a novel speech enhancement method based on compressive sensing (CS), a new paradigm for acquiring signals that is fundamentally different from uniform-rate digitization followed by compression, which is often used for transmission or storage. CS reduces the number of degrees of freedom of a sparse or compressible signal by permitting only certain configurations of large and zero/small coefficients, and by using structured sparsity models. CS therefore provides a way of reconstructing a compressed version of the speech embedded in the original signal from only a small number of linear, non-adaptive measurements. The overall algorithm is evaluated in terms of speech quality using informal listening tests and the Perceptual Evaluation of Speech Quality (PESQ). Experimental results show that the CS algorithm performs well across a wide range of speech tests, giving good noise suppression relative to conventional approaches without obvious degradation of speech quality.
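
    A toy demonstration of the recovery principle the abstract relies on: a k-sparse coefficient vector is reconstructed exactly from m << n random linear measurements. Orthogonal matching pursuit stands in here for whichever reconstruction algorithm the authors used; all sizes are illustrative.

        # Compressive-sensing recovery of a sparse vector via OMP.
        import numpy as np
        from sklearn.linear_model import OrthogonalMatchingPursuit

        rng = np.random.default_rng(2)
        n, m, k = 256, 80, 8                  # length, measurements, sparsity
        x = np.zeros(n)
        x[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)

        Phi = rng.normal(size=(m, n)) / np.sqrt(m)   # random sensing matrix
        y = Phi @ x                                  # compressed measurements

        omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k).fit(Phi, y)
        err = np.linalg.norm(x - omp.coef_) / np.linalg.norm(x)
        print(f"relative recovery error: {err:.3f}")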

  10. The Hypothesis of Apraxia of Speech in Children with Autism Spectrum Disorder

    ERIC Educational Resources Information Center

    Shriberg, Lawrence D.; Paul, Rhea; Black, Lois M.; van Santen, Jan P.

    2011-01-01

    In a sample of 46 children aged 4-7 years with Autism Spectrum Disorder (ASD) and intelligible speech, there was no statistical support for the hypothesis of concomitant Childhood Apraxia of Speech (CAS). Perceptual and acoustic measures of participants' speech, prosody, and voice were compared with data from 40 typically-developing children, 13…

  11. Hemispheric Asymmetries in Speech Perception: Sense, Nonsense and Modulations

    PubMed Central

    Rosen, Stuart; Wise, Richard J. S.; Chadha, Shabneet; Conway, Eleanor-Jayne; Scott, Sophie K.

    2011-01-01

    Background The well-established left hemisphere specialisation for language processing has long been claimed to be based on a low-level auditory specialization for specific acoustic features in speech, particularly regarding ‘rapid temporal processing’. Methodology A novel analysis/synthesis technique was used to construct a variety of sounds based on simple sentences which could be manipulated in spectro-temporal complexity, and whether they were intelligible or not. All sounds consisted of two noise-excited spectral prominences (based on the lower two formants in the original speech) which could be static or varying in frequency and/or amplitude independently. Dynamically varying both acoustic features based on the same sentence led to intelligible speech but when either or both acoustic features were static, the stimuli were not intelligible. Using the frequency dynamics from one sentence with the amplitude dynamics of another led to unintelligible sounds of comparable spectro-temporal complexity to the intelligible ones. Positron emission tomography (PET) was used to compare which brain regions were active when participants listened to the different sounds. Conclusions Neural activity to spectral and amplitude modulations sufficient to support speech intelligibility (without actually being intelligible) was seen bilaterally, with a right temporal lobe dominance. A left dominant response was seen only to intelligible sounds. It thus appears that the left hemisphere specialisation for speech is based on the linguistic properties of utterances, not on particular acoustic features. PMID:21980349

  12. The effects of noise on speech and warning signals

    NASA Astrophysics Data System (ADS)

    Suter, Alice H.

    1989-06-01

    To assess the effects of noise on speech communication, it is necessary to examine certain characteristics of the speech signal. Speech level can be measured by a variety of methods, none of which has yet been standardized, and it should be kept in mind that vocal effort increases with background noise level and with different types of activity. Noise and filtering commonly degrade the speech signal, especially as it is transmitted through communications systems. Intelligibility is also adversely affected by distance, reverberation, and monaural listening. Communication systems currently in use may cause strain and delays on the part of the listener, but there are many possibilities for improvement. Individuals who need to communicate in noise may be subject to voice disorders. Shouted speech becomes progressively less intelligible at high voice levels, but improvements can be realized when talkers use clear speech. Tolerable listening levels are lower for negative than for positive S/Ns, and comfortable listening levels should be at an S/N of at least 5 dB, and preferably above 10 dB. Popular methods to predict speech intelligibility in noise include the Articulation Index, Speech Interference Level, Speech Transmission Index, and the sound level meter's A-weighting network. This report describes these methods, discussing certain advantages and disadvantages of each, and shows their interrelations.
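
    Of the predictors listed above, the Speech Interference Level is the simplest to compute: in its common four-band form it is the arithmetic mean of the noise levels in the octave bands centred at 500, 1000, 2000 and 4000 Hz. A minimal calculation (the example levels are invented):

        # Four-band Speech Interference Level from octave-band noise levels.
        def speech_interference_level(band_levels_db):
            """band_levels_db: octave-band SPLs (dB) at 500, 1k, 2k, 4k Hz."""
            assert len(band_levels_db) == 4
            return sum(band_levels_db) / 4.0

        print(speech_interference_level([68, 65, 62, 58]))  # -> 63.25 dB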

  13. Intelligent interfaces for expert systems

    NASA Technical Reports Server (NTRS)

    Villarreal, James A.; Wang, Lui

    1988-01-01

    Vital to the success of an expert system is an interface to the user which performs intelligently. A generic intelligent interface is being developed for expert systems. This intelligent interface was developed around the in-house developed Expert System for the Flight Analysis System (ESFAS). The Flight Analysis System (FAS) comprises 84 configuration-controlled FORTRAN subroutines that are used in the preflight analysis of the space shuttle. In order to use FAS proficiently, a person must be knowledgeable in the areas of flight mechanics and the procedures involved in deploying a certain payload, and must have an overall understanding of the FAS. ESFAS, still in its developmental stage, is taking into account much of this knowledge. The generic intelligent interface involves the integration of a speech recognizer and synthesizer, a preparser, and a natural language parser with ESFAS. The speech recognizer being used is capable of recognizing 1000 words of connected speech. The natural language parser is a commercial software package which uses caseframe instantiation in processing the streams of words from the speech recognizer or the keyboard. The system's configuration is described along with its capabilities and drawbacks.

  14. Development of The Viking Speech Scale to classify the speech of children with cerebral palsy.

    PubMed

    Pennington, Lindsay; Virella, Daniel; Mjøen, Tone; da Graça Andrada, Maria; Murray, Janice; Colver, Allan; Himmelmann, Kate; Rackauskaite, Gija; Greitane, Andra; Prasauskiene, Audrone; Andersen, Guro; de la Cruz, Javier

    2013-10-01

    Surveillance registers monitor the prevalence of cerebral palsy and the severity of resulting impairments across time and place. The motor disorders of cerebral palsy can affect children's speech production and limit their intelligibility. We describe the development of a scale to classify children's speech performance for use in cerebral palsy surveillance registers, and its reliability across raters and across time. Speech and language therapists, other healthcare professionals and parents classified the speech of 139 children with cerebral palsy (85 boys, 54 girls; mean age 6.03 years, SD 1.09) from observation and previous knowledge of the children. Another group of health professionals rated children's speech from information in their medical notes. With the exception of parents, raters reclassified children's speech at least four weeks after their initial classification. Raters were asked to rate how easy the scale was to use and how well the scale described the child's speech production using Likert scales. Inter-rater reliability was moderate to substantial (k>.58 for all comparisons). Test-retest reliability was substantial to almost perfect for all groups (k>.68). Over 74% of raters found the scale easy or very easy to use; 66% of parents and over 70% of health care professionals judged the scale to describe children's speech well or very well. We conclude that the Viking Speech Scale is a reliable tool to describe the speech performance of children with cerebral palsy, which can be applied through direct observation of children or through case note review. PMID:23891732

  15. Free Speech Yearbook: 1972.

    ERIC Educational Resources Information Center

    Tedford, Thomas L., Ed.

    This book is a collection of essays on free speech issues and attitudes, compiled by the Commission on Freedom of Speech of the Speech Communication Association. Four articles focus on freedom of speech in classroom situations as follows: a philosophic view of teaching free speech, effects of a course on free speech on student attitudes,…

  16. Speech analyzer

    NASA Technical Reports Server (NTRS)

    Lokerson, D. C. (Inventor)

    1977-01-01

    A speech signal is analyzed by applying the signal to formant filters which derive first, second and third signals respectively representing the frequency of the speech waveform in the first, second and third formants. A first pulse train, with a pulse rate approximately representing the average frequency of the first formant, is derived; second and third pulse trains, with pulse rates respectively representing zero crossings of the second and third formants, are derived. The first formant pulse train is derived by establishing N signal level bands, where N is an integer at least equal to two. Adjacent signal bands have common boundaries, each of which is a predetermined percentage of the peak level of a complete cycle of the speech waveform.
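
    The zero-crossing pulse trains for the upper formants can be approximated digitally, as sketched below for a toy two-tone signal. The 900-2500 Hz band edges for the F2 region are assumptions, and the patent's analog level-band circuit for the first formant is not modeled.

        # Zero-crossing pulse train for a band-passed "F2" signal.
        import numpy as np
        from scipy.signal import butter, filtfilt

        def zero_crossing_pulses(x):
            """1 at samples where the waveform changes sign, else 0."""
            return (np.abs(np.diff(np.sign(x))) > 0).astype(int)

        fs = 8000
        t = np.arange(fs) / fs
        x = np.sin(2 * np.pi * 500 * t) + 0.4 * np.sin(2 * np.pi * 1500 * t)

        # Band-pass roughly around an assumed second-formant region.
        b, a = butter(4, [900 / (fs / 2), 2500 / (fs / 2)], btype="band")
        f2_band = filtfilt(b, a, x)
        pulses = zero_crossing_pulses(f2_band)
        # crossings per second / 2 approximates the dominant band frequency
        print(f"estimated F2-band frequency: {pulses.sum() / 2:.0f} Hz")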

  17. Speech Research

    NASA Astrophysics Data System (ADS)

    Several articles addressing topics in speech research are presented. The topics include: exploring the functional significance of physiological tremor: a biospectroscopic approach; differences between experienced and inexperienced listeners to deaf speech; a language-oriented view of reading and its disabilities; phonetic factors in letter detection; categorical perception; short-term recall by deaf signers of American Sign Language; a common basis for auditory sensory storage in perception and immediate memory; phonological awareness and verbal short-term memory; initiation versus execution time during manual and oral counting by stutterers; trading relations in the perception of speech by five-year-old children; the role of the strap muscles in pitch lowering; phonetic validation of distinctive features; consonants and syllable boundaries; and vowel information in postvocalic frictions.

  18. Artificial intelligence: Principles and applications

    SciTech Connect

    Yazdami, M.

    1985-01-01

    The book covers the principles of AI and the main areas of application, as well as considering some of the social implications. The applications chapters have a common format structured as follows: definition of the topic; approach with conventional computing techniques; why 'intelligence' would provide a better approach; and how AI techniques would be used and the limitations. The contents discussed are: Principles of artificial intelligence; AI programming environments; LISP, list processing and pattern-matching; AI programming with POP-11; Computer processing of natural language; Speech synthesis and recognition; Computer vision; Artificial intelligence and robotics; The anatomy of expert systems; Machine learning; Memory models of man and machine; Artificial intelligence and cognitive psychology; Breaking out of the Chinese room; Social implications of artificial intelligence; and Index.

  19. Linking Speech Perception and Neurophysiology: Speech Decoding Guided by Cascaded Oscillators Locked to the Input Rhythm

    PubMed Central

    Ghitza, Oded

    2011-01-01

    The premise of this study is that current models of speech perception, which are driven by acoustic features alone, are incomplete, and that the role of decoding time during memory access must be incorporated to account for the patterns of observed recognition phenomena. It is postulated that decoding time is governed by a cascade of neuronal oscillators, which guide template-matching operations at a hierarchy of temporal scales. Cascaded cortical oscillations in the theta, beta, and gamma frequency bands are argued to be crucial for speech intelligibility. Intelligibility is high so long as these oscillations remain phase locked to the auditory input rhythm. A model (Tempo) is presented which is capable of emulating recent psychophysical data on the intelligibility of speech sentences as a function of “packaging” rate (Ghitza and Greenberg, 2009). The data show that intelligibility of speech that is time-compressed by a factor of 3 (i.e., a high syllabic rate) is poor (above 50% word error rate), but is substantially restored when the information stream is re-packaged by the insertion of silent gaps in between successive compressed-signal intervals – a counterintuitive finding, difficult to explain using classical models of speech perception, but emerging naturally from the Tempo architecture. PMID:21743809

  20. Speech reception thresholds in various interference conditions

    NASA Astrophysics Data System (ADS)

    Carr, Suzanne P.; Colburn, H. Steven

    2001-05-01

    Speech intelligibility is integral to human verbal communication; however, our understanding of the effects of competing noise, room reverberation, and frequency range restriction is incomplete. Using virtual stimuli, the dependence of intelligibility threshold levels on the extent of room reverberation, the relative locations of speech target and masking noise, and the available frequency content of the speech and the masking noise is explored. Speech-shaped masking noise and target sentences have three spectral conditions: wideband, high pass above 2-kHz, and low pass below 2-kHz. The 2-kHz cutoff was chosen to approximately bisect the range of frequencies most important in speech, and the high pass noise condition simulates high-frequency hearing loss. Reverberation conditions include a pseudo-anechoic case, a moderately reverberant ``classroom'' case, and a very reverberant ``bathroom'' case. Both binaural and monaural intelligibility are measured. Preliminary results show that source separation decreases thresholds, reverberation increases thresholds, and low frequency noise reverberates more in the rooms, contributing to increasing thresholds along with the effects of the upward spread of masking. The energetic effects of reverberation are explored. [Work supported by NIH DC00100.]

  1. Enhancement of Electrolaryngeal Speech by Adaptive Filtering.

    ERIC Educational Resources Information Center

    Espy-Wilson, Carol Y.; Chari, Venkatesh R.; MacAuslan, Joel M.; Huang, Caroline B.; Walsh, Michael J.

    1998-01-01

    A study tested the quality and intelligibility, as judged by several listeners, of four users' electrolaryngeal speech, with and without filtering to compensate for perceptually objectionable acoustic characteristics. Results indicated that an adaptive filtering technique produced a noticeable improvement in the quality of the Transcutaneous…

  2. Speech Recognition for A Digital Video Library.

    ERIC Educational Resources Information Center

    Witbrock, Michael J.; Hauptmann, Alexander G.

    1998-01-01

    Production of the meta-data supporting the Informedia Digital Video Library interface is automated using techniques derived from artificial intelligence research. Speech recognition and natural-language processing, information retrieval, and image analysis are applied to produce an interface that helps users locate information and navigate more…

  3. Speech Production in Hearing-Impaired Children.

    ERIC Educational Resources Information Center

    Gold, Toni

    1980-01-01

    Investigations in recent years have indicated that only about 20% of the speech output of the deaf is understood by the "person on the street." This lack of intelligibility has been associated with some frequently occurring segmental and suprasegmental errors. Journal Availability: Elsevier North Holland, Inc., 52 Vanderbilt Avenue, New York, NY…

  4. Gaze Patterns and Audiovisual Speech Enhancement

    ERIC Educational Resources Information Center

    Yi, Astrid; Wong, Willy; Eizenman, Moshe

    2013-01-01

    Purpose: In this study, the authors sought to quantify the relationships between speech intelligibility (perception) and gaze patterns under different auditory-visual conditions. Method: Eleven subjects listened to low-context sentences spoken by a single talker while viewing the face of one or more talkers on a computer display. Subjects either…

  5. The Acquisition of Verbal Communication Skills by Severely Hearing-Impaired Children through the Modified Cued Speech-Phonetic Alphabet Method.

    ERIC Educational Resources Information Center

    Duffy, John K.

    The paper describes the potential of cued speech to provide verbal language and intelligible speech to severely hearing impaired students. The approach, which combines auditory-visual-oral and manual cues, is designed as a visual supplement to normal speech. The paper traces the development of cued speech and discusses modifications made to the R.…

  6. Noise suppression methods for robust speech processing

    NASA Astrophysics Data System (ADS)

    Boll, S. F.; Kajiya, J.; Youngberg, J.; Petersen, T. L.; Ravindra, H.; Done, W.; Cox, B. V.; Cohen, E.

    1981-04-01

    Robust speech processing in practical operating environments requires effective environmental and processor noise suppression. This report describes the technical findings and accomplishments during the reporting period for the research program funded to develop real-time, compressed speech analysis-synthesis algorithms whose performance is invariant under signal contamination. Fulfillment of this requirement is necessary to ensure reliable, secure, compressed speech transmission within realistic military command and control environments. Overall contributions resulting from this research program include an understanding of how environmental noise degrades narrowband coded speech, the development of appropriate real-time noise suppression algorithms, and the development of speech parameter identification methods that treat signal contamination as a fundamental element of the estimation process. The report describes research and results in the areas of noise suppression using dual-input adaptive noise cancellation, articulation rate change techniques, and spectral subtraction, and describes an experiment which demonstrated that the spectral subtraction noise suppression algorithm can improve the intelligibility of 2400 bps, LPC-10 coded, helicopter speech by 10.6 points. In addition, summaries are included of prior studies in constant-Q signal analysis and synthesis, perceptual modelling, speech activity detection, and pole-zero modelling of noisy signals. Three recent studies in speech modelling using the critical band analysis-synthesis transform and using splines are then presented. Finally, a list of major publications generated under this contract is given.
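
    Spectral subtraction, the algorithm credited above with the 10.6-point intelligibility gain, has a classic short-time form: estimate the noise magnitude spectrum from noise-only frames, subtract it from every frame, and keep a small spectral floor. A minimal sketch with illustrative parameters, assuming the recording starts with noise only:

        # Magnitude spectral subtraction with a spectral floor.
        import numpy as np
        from scipy.signal import stft, istft

        def spectral_subtraction(noisy, fs, noise_frames=10, floor=0.02):
            f, t, X = stft(noisy, fs, nperseg=256)
            mag, phase = np.abs(X), np.angle(X)
            # Noise spectrum estimated from the leading noise-only frames.
            noise_est = mag[:, :noise_frames].mean(axis=1, keepdims=True)
            clean_mag = np.maximum(mag - noise_est, floor * mag)
            _, x_hat = istft(clean_mag * np.exp(1j * phase), fs, nperseg=256)
            return x_hat

        fs = 8000
        rng = np.random.default_rng(3)
        tone = np.sin(2 * np.pi * 300 * np.arange(fs) / fs)
        clean = np.concatenate([np.zeros(fs // 4), tone])  # noise-only lead-in
        noisy = clean + 0.3 * rng.normal(size=clean.size)
        enhanced = spectral_subtraction(noisy, fs)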

  7. Speech Clarity Index (Ψ): A Distance-Based Speech Quality Indicator and Recognition Rate Prediction for Dysarthric Speakers with Cerebral Palsy

    NASA Astrophysics Data System (ADS)

    Kayasith, Prakasith; Theeramunkong, Thanaruk

    Measuring the severity of a speaker's dysarthria by manually evaluating his or her speech with available standard assessment methods based on human perception is a tedious and subjective task. This paper presents an automated approach to assessing the speech quality of a dysarthric speaker with cerebral palsy. With the consideration of two complementary factors, speech consistency and speech distinction, a speech quality indicator called the speech clarity index (Ψ) is proposed as a measure of the speaker's ability to produce a consistent speech signal for a given word and distinguishable speech signals for different words. As an application, it can be used to assess speech quality and to forecast the speech recognition rate for an individual dysarthric speaker before an automatic speech recognition system is exhaustively implemented for that speaker. The effectiveness of Ψ as a speech recognition rate predictor is evaluated by rank-order inconsistency, correlation coefficient, and root-mean-square difference, comparing its predicted recognition rates with those predicted by the standard methods, the articulatory and intelligibility tests, on two recognition systems (HMM and ANN). The results show that Ψ is a promising indicator for predicting the recognition rate of dysarthric speech. All experiments were done on a speech corpus composed of speech data from eight normal speakers and eight dysarthric speakers.
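
    The paper defines Ψ from speech consistency and speech distinction; its exact formula is not reproduced in this record. A distance-based proxy in the same spirit, sketched below, divides the mean between-word distance by the mean within-word spread of per-token feature vectors. Everything here (words, feature dimension, data) is hypothetical.

        # Consistency/distinction proxy: between-word vs within-word distances.
        import numpy as np

        def clarity_proxy(tokens_by_word):
            """tokens_by_word: {word: array (n_tokens, n_features)}."""
            centroids, within = [], []
            for toks in tokens_by_word.values():
                c = toks.mean(axis=0)
                centroids.append(c)
                within.extend(np.linalg.norm(toks - c, axis=1))
            between = [np.linalg.norm(a - b)
                       for i, a in enumerate(centroids)
                       for b in centroids[i + 1:]]
            return np.mean(between) / np.mean(within)  # higher = clearer

        rng = np.random.default_rng(4)
        data = {w: rng.normal(loc=i, scale=0.5, size=(10, 13))  # e.g. 13 MFCCs
                for i, w in enumerate(["baa", "dee", "goo"])}
        print(f"clarity proxy: {clarity_proxy(data):.2f}")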

  8. Speech Improvement.

    ERIC Educational Resources Information Center

    Gordon, Morton J.

    This book serves as a guide for the native and non-native speaker of English in overcoming various problems in articulation, rhythm, and intonation. It is also useful in group therapy speech programs. Forty-five practice chapters offer drill materials for all the vowels, diphthongs, and consonants of American English plus English stress and…

  9. Prediction and constraint in audiovisual speech perception.

    PubMed

    Peelle, Jonathan E; Sommers, Mitchell S

    2015-07-01

    During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing the precision of prediction. Electrophysiological studies demonstrate that oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to acoustic information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. Thus, audiovisual speech perception is accomplished through multiple stages of integration

  10. Prediction and constraint in audiovisual speech perception

    PubMed Central

    Peelle, Jonathan E.; Sommers, Mitchell S.

    2015-01-01

    During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing precision of prediction. Electrophysiological studies demonstrate oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to auditory information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. Thus, audiovisual speech perception is accomplished through multiple stages of integration, supported

  11. Dual task performance with LPC (Linear Predictive Coding) degraded speech in a sentence verification task

    NASA Astrophysics Data System (ADS)

    Schmidt-Nielsen, Astrid; Kallman, Howard J.; Meijer, Corinne

    1989-10-01

    The results of a preliminary study on the effects of reduced speech intelligibility on dual task performance are reported. The speech task was a sentence verification task, and the speech degradation was accomplished using a narrowband digital voice transmission system operating with and without random bit errors. The second task was a visual picture sorting task. There was a dual task decrement on the sorting task, and in addition, there was a further decrease in sorts per minute as the speech was increasingly degraded. Reaction time for the speech task increased with the concurrent sorting task, but the dual task condition did not affect speech task error rates.

  12. Intelligibility of 4-Year-Old Children with and without Cerebral Palsy

    ERIC Educational Resources Information Center

    Hustad, Katherine C.; Schueler, Brynn; Schultz, Laurel; DuHadway, Caitlin

    2012-01-01

    Purpose: The authors examined speech intelligibility in typically developing (TD) children and 3 groups of children with cerebral palsy (CP) who were classified into speech/language profile groups following Hustad, Gorton, and Lee (2010). Questions addressed differences in transcription intelligibility scores among groups, the effects of utterance…

  13. Speech coding

    NASA Astrophysics Data System (ADS)

    Gersho, Allen

    1990-05-01

    Recent advances in algorithms and techniques for speech coding now permit high-quality voice reproduction at remarkably low bit rates. The advent of powerful single-chip signal processors has made it cost effective to implement these new and sophisticated speech coding algorithms for many important applications in voice communication and storage. Some of the main ideas underlying the algorithms of major interest today are reviewed. The concept of removing redundancy by linear prediction is reviewed, first in the context of predictive quantization or DPCM. Then linear predictive coding, adaptive predictive coding, and vector quantization are discussed. The concepts of excitation coding via analysis-by-synthesis, vector sum excitation codebooks, and adaptive postfiltering are explained. The main ideas of vector excitation coding (VXC), or code-excited linear prediction (CELP), are presented. Finally, low-delay VXC coding and phonetic segmentation for VXC are described.
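
    The redundancy-removal step underlying most of these coders is linear prediction. The sketch below computes LPC coefficients for one frame with the textbook autocorrelation (Levinson-Durbin) method and reports the prediction gain; the frame content, length and model order are illustrative.

        # Autocorrelation-method LPC via the Levinson-Durbin recursion.
        import numpy as np

        def lpc(x, order):
            """Return prediction filter a (a[0]=1) and residual energy e."""
            r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
            a = np.zeros(order + 1)
            a[0], e = 1.0, r[0]
            for i in range(1, order + 1):
                k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / e
                a_prev = a.copy()
                for j in range(1, i):
                    a[j] = a_prev[j] + k * a_prev[i - j]
                a[i] = k
                e *= 1.0 - k * k
            return a, e

        fs = 8000
        t = np.arange(400) / fs
        frame = np.sin(2 * np.pi * 300 * t) * np.hamming(400)
        a, err = lpc(frame, order=10)
        print(f"prediction gain: {10 * np.log10(np.dot(frame, frame) / err):.1f} dB")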

  14. Children with Comorbid Speech Sound Disorder and Specific Language Impairment Are at Increased Risk for Attention-Deficit/Hyperactivity Disorder

    ERIC Educational Resources Information Center

    McGrath, Lauren M.; Hutaff-Lee, Christa; Scott, Ashley; Boada, Richard; Shriberg, Lawrence D.; Pennington, Bruce F.

    2008-01-01

    This study focuses on the comorbidity between attention-deficit/hyperactivity disorder (ADHD) symptoms and speech sound disorder (SSD). SSD is a developmental disorder characterized by speech production errors that impact intelligibility. Previous research addressing this comorbidity has typically used heterogeneous groups of speech-language…

  15. Speech and Communication Disorders

    MedlinePlus

    ... or understand speech. Causes include hearing disorders and deafness; voice problems, such as dysphonia or those caused by cleft lip or palate; speech problems like stuttering; developmental disabilities; learning disorders; autism spectrum disorder; brain injury; and stroke. Some speech and ...

  16. Speech impairment (adult)

    MedlinePlus

    Language impairment; Impairment of speech; Inability to speak; Aphasia; Dysarthria; Slurred speech; Dysphonia voice disorders ... disorders develop gradually, but anyone can develop a speech and ... suddenly, usually in a trauma. APHASIA Alzheimer disease ...

  17. Speech impairment (adult)

    MedlinePlus

    Language impairment; Impairment of speech; Inability to speak; Aphasia; Dysarthria; Slurred speech; Dysphonia voice disorders ... Common speech and language disorders include: APHASIA Aphasia is ... understand or express spoken or written language. It commonly ...

  18. A music perception disorder (congenital amusia) influences speech comprehension.

    PubMed

    Liu, Fang; Jiang, Cunmei; Wang, Bei; Xu, Yi; Patel, Aniruddh D

    2015-01-01

    This study investigated the underlying link between speech and music by examining whether and to what extent congenital amusia, a musical disorder characterized by degraded pitch processing, would impact spoken sentence comprehension for speakers of Mandarin, a tone language. Sixteen Mandarin-speaking amusics and 16 matched controls were tested on the intelligibility of news-like Mandarin sentences with natural and flat fundamental frequency (F0) contours (created via speech resynthesis) under four signal-to-noise (SNR) conditions (no noise, +5, 0, and -5dB SNR). While speech intelligibility in quiet and extremely noisy conditions (SNR=-5dB) was not significantly compromised by flattened F0, both amusic and control groups achieved better performance with natural-F0 sentences than flat-F0 sentences under moderately noisy conditions (SNR=+5 and 0dB). Relative to normal listeners, amusics demonstrated reduced speech intelligibility in both quiet and noise, regardless of whether the F0 contours of the sentences were natural or flattened. This deficit in speech intelligibility was not associated with impaired pitch perception in amusia. These findings provide evidence for impaired speech comprehension in congenital amusia, suggesting that the deficit of amusics extends beyond pitch processing and includes segmental processing. PMID:25445781

  19. Prevalence and pattern of perceived intelligibility changes in Parkinson's disease

    PubMed Central

    Miller, Nick; Allcock, Liesl; Jones, Diana; Noble, Emma; Hildreth, Anthony J; Burn, David J

    2007-01-01

    Background Changes to spoken communication are inevitable in Parkinson's disease (PD). It remains unclear what consequences changes have for intelligibility of speech. Aims To establish the prevalence of impaired speech intelligibility in people with PD and the relationship of intelligibility decline to indicators of disease progression. Methods 125 speakers with PD and age matched unaffected controls completed a diagnostic intelligibility test and described how to carry out a common daily activity in an “off drug” state. Listeners unfamiliar with dysarthric speech evaluated responses. Results 69.6% (n = 87) of people with PD fell below the control mean of unaffected speakers (n = 40), 51.2% (n = 64) by more than −1 SD below. 48% (n = 60) were perceived as worse than the lowest unaffected speaker for how disordered speech sounded. 38% (n = 47) placed speech changes among their top four concerns regarding their PD. Intelligibility level did not correlate significantly with age or disease duration and only weakly with stage and severity of PD. There were no significant differences between participants with tremor dominant versus postural instability/gait disorder motor phenotypes of PD. Conclusions Speech intelligibility is significantly reduced in PD; it can be among the main concerns of people with PD, but it is not dependent on disease severity, duration or motor phenotype. Patients' own perceptions of the extent of change do not necessarily reflect objective measures. PMID:17400592

  20. Optimal speech level for speech transmission in a noisy environment for young adults and aged persons

    NASA Astrophysics Data System (ADS)

    Sato, Hayato; Ota, Ryo; Morimoto, Masayuki; Sato, Hiroshi

    2005-04-01

    Assessing the sound environment of classrooms for the aged is a very important issue, because classrooms can be used by the aged for lifelong learning, especially in an aging society; hearing loss due to aging is therefore a considerable factor for classrooms. In this study, the optimal speech level in noisy fields for both young adults and aged persons was investigated. Listening difficulty ratings and word intelligibility scores for familiar words were used to evaluate speech transmission performance. The results of the tests demonstrated that the optimal speech level for moderate background noise (i.e., less than around 60 dBA) was fairly constant, whereas the optimal speech level depended on the speech-to-noise ratio when the background noise level exceeded around 60 dBA. The minimum speech level required to minimize difficulty ratings for the aged was higher than that for the young. However, the minimum difficulty ratings for both the young and the aged were obtained in the speech-level range of 70 to 80 dBA.

  1. Speech research

    NASA Astrophysics Data System (ADS)

    1992-06-01

    Phonology is traditionally seen as the discipline that concerns itself with the building blocks of linguistic messages. It is the study of the structure of sound inventories of languages and of the participation of sounds in rules or processes. Phonetics, in contrast, concerns speech sounds as produced and perceived. Two extreme positions on the relationship between phonological messages and phonetic realizations are represented in the literature. One holds that the primary home for linguistic symbols, including phonological ones, is the human mind, itself housed in the human brain. The second holds that their primary home is the human vocal tract.

  2. Neural Oscillations Carry Speech Rhythm through to Comprehension

    PubMed Central

    Peelle, Jonathan E.; Davis, Matthew H.

    2012-01-01

    A key feature of speech is the quasi-regular rhythmic information contained in its slow amplitude modulations. In this article we review the information conveyed by speech rhythm, and the role of ongoing brain oscillations in listeners’ processing of this content. Our starting point is the fact that speech is inherently temporal, and that rhythmic information conveyed by the amplitude envelope contains important markers for place and manner of articulation, segmental information, and speech rate. Behavioral studies demonstrate that amplitude envelope information is relied upon by listeners and plays a key role in speech intelligibility. Extending behavioral findings, data from neuroimaging – particularly electroencephalography (EEG) and magnetoencephalography (MEG) – point to phase locking by ongoing cortical oscillations to low-frequency information (~4–8 Hz) in the speech envelope. This phase modulation effectively encodes a prediction of when important events (such as stressed syllables) are likely to occur, and acts to increase sensitivity to these relevant acoustic cues. We suggest a framework through which such neural entrainment to speech rhythm can explain effects of speech rate on word and segment perception (i.e., that the perception of phonemes and words in connected speech is influenced by preceding speech rate). Neuroanatomically, acoustic amplitude modulations are processed largely bilaterally in auditory cortex, with intelligible speech resulting in differential recruitment of left-hemisphere regions. Notable among these is lateral anterior temporal cortex, which we propose functions in a domain-general fashion to support ongoing memory and integration of meaningful input. Together, the reviewed evidence suggests that low-frequency oscillations in the acoustic speech signal form the foundation of a rhythmic hierarchy supporting spoken language, mirrored by phase-locked oscillations in the human brain. PMID:22973251
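
    The 4-8 Hz envelope information central to this review can be extracted in a few lines: take the broadband amplitude envelope, then band-pass it in the theta range. In the sketch below, the toy signal has a 5-Hz "syllable rate" by construction; the plain decimation of the envelope is acceptable only because the envelope is smooth, and all parameters are illustrative.

        # Theta-band (4-8 Hz) speech envelope extraction.
        import numpy as np
        from scipy.signal import hilbert, butter, sosfiltfilt

        fs = 16000
        t = np.arange(2 * fs) / fs
        # toy "speech": a 200-Hz carrier modulated at a 5-Hz syllable rate
        speech_like = (0.5 + 0.5 * np.sin(2 * np.pi * 5 * t)) * np.sin(2 * np.pi * 200 * t)

        env = np.abs(hilbert(speech_like))   # broadband amplitude envelope
        env_100 = env[::160]                 # envelope resampled to 100 Hz
        sos = butter(2, [4.0, 8.0], btype="band", fs=100, output="sos")
        theta_env = sosfiltfilt(sos, env_100)  # theta-range envelope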

  3. Some applications of the Speech Transmission Index (STI) in auditoria

    NASA Astrophysics Data System (ADS)

    Steeneken, H. J. M.; Houtgast, T.

    1982-02-01

    The use of an objective method of measuring speech intelligibility in auditoria is illustrated. The applications involve the mapping of iso-intelligibility contours for tracing areas with poor intelligibility, and also for assessing the gain of a public address system. The method, based on the modulation transfer function (MTF), presents valuable diagnostic information about the effect of reverberation, noise, echoes and of public address systems on intelligibility. The measuring time is about 3 minutes for the MTFs of the octave bands 500 Hz and 2000 Hz.
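
    The STI computation behind such measurements reduces each modulation transfer value m(F) to an effective signal-to-noise ratio, clips it to +/-15 dB, maps it to a transmission index, and averages. The sketch below averages uniformly over a handful of m values; the standardized method additionally weights seven octave bands and 14 modulation frequencies.

        # Simplified STI from modulation transfer function values.
        import numpy as np

        def sti_from_mtf(m_values):
            m = np.asarray(m_values, dtype=float)
            snr = 10 * np.log10(m / (1 - m))   # effective S/N per m(F)
            snr = np.clip(snr, -15.0, 15.0)    # clip to the +/-15 dB range
            ti = (snr + 15.0) / 30.0           # transmission index in [0, 1]
            return ti.mean()

        print(f"STI ~ {sti_from_mtf([0.9, 0.8, 0.7, 0.6, 0.5]):.2f}")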

  4. Speech recognition and understanding

    SciTech Connect

    Vintsyuk, T.K.

    1983-05-01

    This article discusses the automatic processing of speech signals with the aim of finding the sequence of words (speech recognition) or the concept (speech understanding) being transmitted by the speech signal. The goal of the research is to develop an automatic typewriter that will edit and type text under voice control. A dynamic programming method is proposed in which all possible class signals are stored, after which the presented signal is compared to all the stored signals during the recognition phase. Topics considered include element-by-element recognition of words of speech, learning speech recognition, phoneme-by-phoneme speech recognition, the recognition of connected speech, understanding connected speech, and prospects for designing speech recognition and understanding systems. An application of the composition dynamic programming method for the solution of basic problems in the recognition and understanding of speech is presented.

  5. Artificial Intelligence.

    ERIC Educational Resources Information Center

    Waltz, David L.

    1982-01-01

    Describes kinds of results achieved by computer programs in artificial intelligence. Topics discussed include heuristic searches, artificial intelligence/psychology, planning program, backward chaining, learning (focusing on Winograd's blocks to explore learning strategies), concept learning, constraint propagation, language understanding…

  6. Speech and oral motor profile after childhood hemispherectomy.

    PubMed

    Liégeois, Frédérique; Morgan, Angela T; Stewart, Lorna H; Helen Cross, J; Vogel, Adam P; Vargha-Khadem, Faraneh

    2010-08-01

    Hemispherectomy (disconnection or removal of an entire cerebral hemisphere) is a rare surgical procedure used for the relief of drug-resistant epilepsy in children. After hemispherectomy, contralateral hemiplegia persists whereas gross expressive and receptive language functions can be remarkably spared. Motor speech deficits have rarely been examined systematically, thus limiting the accuracy of postoperative prognosis. We describe the speech profiles of hemispherectomized participants characterizing their intelligibility, articulation, phonological speech errors, dysarthric features, and execution and sequencing of orofacial speech and non-speech movements. Thirteen participants who had undergone hemispherectomy (six left, seven right; nine with congenital, four with acquired hemiplegia; operated between four months and 13 years) were investigated. Results showed that all participants were intelligible but showed a mild dysarthric profile characterized by neuromuscular asymmetry and reduced quality and coordination of movements, features that are characteristic of adult-onset unilateral upper motor neuron dysarthria, flaccid-ataxic variant. In addition, one left and four right hemispherectomy cases presented with impaired production of speech and non-speech sequences. No participant showed evidence of verbal or oral dyspraxia. It is concluded that mild dysarthria is persistent after left or right hemispherectomy, irrespective of age at onset of hemiplegia. These results indicate incomplete functional re-organization for the control of fine speech motor movements throughout childhood, and provide no evidence of hemispheric differences. PMID:20096448

  7. An Exploration of Listener Variability in Intelligibility Judgments

    ERIC Educational Resources Information Center

    McHenry, Monica

    2011-01-01

    Purpose: This study was designed to assess potential contributors to listener variability in judgments of intelligibility. Method: A total of 228 unfamiliar everyday listeners judged speech samples from 3 individuals with dysarthria. Samples were the single-word phonetic contrast test, the Sentence Intelligibility Test, an unpredictable sentence…

  8. Artificial Intelligence.

    ERIC Educational Resources Information Center

    Information Technology Quarterly, 1985

    1985-01-01

    This issue of "Information Technology Quarterly" is devoted to the theme of "Artificial Intelligence." It contains two major articles: (1) Artificial Intelligence and Law" (D. Peter O'Neill and George D. Wood); (2) "Artificial Intelligence: A Long and Winding Road" (John J. Simon, Jr.). In addition, it contains two sidebars: (1) "Calculating and…

  9. Artificial Intelligence.

    ERIC Educational Resources Information Center

    Thornburg, David D.

    1986-01-01

    Overview of the artificial intelligence (AI) field provides a definition; discusses past research and areas of future research; describes the design, functions, and capabilities of expert systems and the "Turing Test" for machine intelligence; and lists additional sources for information on artificial intelligence. Languages of AI are also briefly…

  10. Competitive Intelligence.

    ERIC Educational Resources Information Center

    Bergeron, Pierrette; Hiller, Christine A.

    2002-01-01

    Reviews the evolution of competitive intelligence since 1994, including terminology and definitions and analytical techniques. Addresses the issue of ethics; explores how information technology supports the competitive intelligence process; and discusses education and training opportunities for competitive intelligence, including core competencies…

  11. Organisational Intelligence

    ERIC Educational Resources Information Center

    Yolles, Maurice

    2005-01-01

    Purpose: Seeks to explore the notion of organisational intelligence as a simple extension of the notion of the idea of collective intelligence. Design/methodology/approach: Discusses organisational intelligence using previous research, which includes the Purpose, Properties and Practice model of Dealtry, and the Viable Systems model. Findings: The…

  12. VOT in speech-disordered individuals: History, theory, data, reminiscence

    NASA Astrophysics Data System (ADS)

    Weismer, Gary

    2001-05-01

    Forty years ago Lisker and Abramson published their landmark paper on VOT; the speech-research world has never been the same. The concept of VOT as a measure relevant to phonology, speech physiology, and speech perception made it a prime choice for scientists who saw an opportunity to exploit the techniques and analytic frameworks of ``speech science'' in the study of speech disorders. Modifications of VOT in speech disorders have been used to draw specific inferences concerning phonological representations, glottal-supraglottal timing, and speech intelligibility. This presentation will provide a review of work on VOT in speech disorders, including (among others) stuttering, hearing impairment, and neurogenic disorders. An attempt will be made to collect published data in summary graphic form, and to discuss their implications. Emphasis will be placed on how VOT has been used to inform theories of disordered speech production. I will close with some personal comments about the influence (unbeknownst to them) these two outstanding scientists had on me in the 1970s, when under the spell of their work I first became aware that the world of speech research did not start and end with moving parts.

  13. Danish auroral science history

    NASA Astrophysics Data System (ADS)

    Stauning, P.

    2011-01-01

    Danish auroral science history begins with the early auroral observations made by the Danish astronomer Tycho Brahe during the years from 1582 to 1601 preceding the Maunder minimum in solar activity. Also included are the brilliant observations made by another astronomer, Ole Rømer, from Copenhagen in 1707, as well as the early auroral observations made from Greenland by missionaries during the 18th and 19th centuries. The relations between auroras and geomagnetic variations were analysed by H. C. Ørsted, who also played a vital role in the development of Danish meteorology, which came to include comprehensive auroral observations from Denmark, Iceland and Greenland as well as auroral and geomagnetic research. The very important auroral investigations made by Sophus Tromholt are outlined. His analysis from 1880 of auroral observations from Greenland paved the way for the significant contributions of the Danish Meteorological Institute (DMI, founded in 1872) to the first International Polar Year 1882/83, when an expedition headed by Adam Paulsen was sent to Greenland to conduct auroral and geomagnetic observations. Paulsen's analyses of the collected data gave many important results but also raised many new questions, which gave rise to auroral expeditions to Iceland in 1899 to 1900 and to Finland in 1900 to 1901. Among the results from these expeditions were 26 unique paintings of the auroras made by the painter Harald Moltke. The expedition to Finland was headed by Dan la Cour, who later, as director of the DMI, came to be in charge of the comprehensive international geomagnetic and auroral observations made during the Second International Polar Year in 1932/33. Finally, the article describes the important investigations made by Knud Lassen during, among others, the International Geophysical Year 1957/58 and the International Quiet Sun Year (IQSY) in 1964/65. Under his leadership the auroral and geomagnetic research at DMI reached a high international…

  14. Dramatic effects of speech task on motor and linguistic planning in severely dysfluent parkinsonian speech

    PubMed Central

    Van Lancker Sidtis, Diana; Cameron, Krista; Sidtis, John J.

    2015-01-01

    In motor speech disorders, dysarthric features impacting intelligibility, articulation, fluency, and voice emerge more saliently in conversation than in repetition, reading, or singing. A role of the basal ganglia in these task discrepancies has been identified. Further, more recent studies of naturalistic speech in basal ganglia dysfunction have revealed that formulaic language is more impaired than novel language. This descriptive study extends these observations to a case of severely dysfluent dysarthria due to a parkinsonian syndrome. Dysfluencies were quantified and compared for conversation, two forms of repetition, reading, recited speech, and singing. Other measures examined phonetic inventories, word forms, and formulaic language. Phonetic, syllabic, and lexical dysfluencies were more abundant in conversation than in other task conditions. Formulaic expressions in conversation were reduced compared to normal speakers. A proposed explanation supports the notion that the basal ganglia contribute to formulation of internal models for execution of speech. PMID:22774929

  15. Auditory cortex activation to natural speech and simulated cochlear implant speech measured with functional near-infrared spectroscopy.

    PubMed

    Pollonini, Luca; Olds, Cristen; Abaya, Homer; Bortfeld, Heather; Beauchamp, Michael S; Oghalai, John S

    2014-03-01

    The primary goal of most cochlear implant procedures is to improve a patient's ability to discriminate speech. To accomplish this, cochlear implants are programmed so as to maximize speech understanding. However, programming a cochlear implant can be an iterative, labor-intensive process that takes place over months. In this study, we sought to determine whether functional near-infrared spectroscopy (fNIRS), a non-invasive neuroimaging method which is safe to use repeatedly and for extended periods of time, can provide an objective measure of whether a subject is hearing normal speech or distorted speech. We used a 140-channel fNIRS system to measure activation within the auditory cortex in 19 normal hearing subjects while they listened to speech with different levels of intelligibility. Custom software was developed to analyze the data and compute topographic maps from the measured changes in oxyhemoglobin and deoxyhemoglobin concentration. Normal speech reliably evoked the strongest responses within the auditory cortex. Distorted speech produced less region-specific cortical activation. Environmental sounds were used as a control, and they produced the least cortical activation. These data collected using fNIRS are consistent with the fMRI literature and thus demonstrate the feasibility of using this technique to objectively detect differences in cortical responses to speech of different intelligibility. PMID:24342740
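
    The conversion from raw optical signals to the oxy- and deoxyhemoglobin changes mentioned above is conventionally done with the modified Beer-Lambert law. Below is a minimal Python sketch, assuming two measurement wavelengths and rough, illustrative extinction coefficients; it is not the study's 140-channel pipeline.

        import numpy as np

        # Approximate molar extinction coefficients [1/(mM*cm)]; illustrative
        # textbook-order values, not calibrated constants.
        E = np.array([[0.59, 1.55],   # 760 nm: [eps_HbO, eps_HbR]
                      [1.06, 0.69]])  # 850 nm: [eps_HbO, eps_HbR]
        d, dpf = 3.0, 6.0  # source-detector separation [cm], pathlength factor

        def mbll(delta_od):
            """Optical-density changes at [760, 850] nm -> (dHbO, dHbR) in mM."""
            return np.linalg.solve(E * d * dpf, delta_od)

        # Example: an activation-like pattern resolves into rising HbO.
        print(mbll(np.array([0.010, 0.015])))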

  16. Careers in Speech Communication.

    ERIC Educational Resources Information Center

    Speech Communication Association, New York, NY.

    Brief discussions in this pamphlet suggest educational and career opportunities in the following fields of speech communication: rhetoric, public address, and communication; theatre, drama, and oral interpretation; radio, television, and film; speech pathology and audiology; speech science, phonetics, and linguistics; and speech education.…

  17. Opportunities in Speech Pathology.

    ERIC Educational Resources Information Center

    Newman, Parley W.

    The importance of speech is discussed and speech pathology is described. Types of communication disorders considered are articulation disorders, aphasia, facial deformity, hearing loss, stuttering, delayed speech, voice disorders, and cerebral palsy; examples of five disorders are given. Speech pathology is investigated from these aspects: the…

  18. Models of speech synthesis.

    PubMed Central

    Carlson, R

    1995-01-01

    The term "speech synthesis" has been used for diverse technical approaches. In this paper, some of the approaches used to generate synthetic speech in a text-to-speech system are reviewed, and some of the basic motivations for choosing one method over another are discussed. It is important to keep in mind, however, that speech synthesis models are needed not just for speech generation but to help us understand how speech is created, or even how articulation can explain language structure. General issues such as the synthesis of different voices, accents, and multiple languages are discussed as special challenges facing the speech synthesis community. PMID:7479805

  19. Self-Reported Speech Problems in Adolescents and Young Adults with 22q11.2 Deletion Syndrome: A Cross-Sectional Cohort Study

    PubMed Central

    Vorstman, Jacob AS; Kon, Moshe; Mink van der Molen, Aebele B

    2014-01-01

    Background Speech problems are a common clinical feature of the 22q11.2 deletion syndrome. The objectives of this study were to inventory the speech history and current self-reported speech rating of adolescents and young adults, and examine the possible variables influencing the current speech ratings, including cleft palate, surgery, speech and language therapy, intelligence quotient, and age at assessment. Methods In this cross-sectional cohort study, 50 adolescents and young adults with the 22q11.2 deletion syndrome (ages 12-26 years; 67% female) filled out questionnaires. A neuropsychologist administered an age-appropriate intelligence quotient test. The demographics, histories, and intelligence of patients with normal speech (speech rating=1) were compared to those of patients with different speech (speech rating>1). Results Of the 50 patients, a minority (26%) had a cleft palate, nearly half (46%) underwent a pharyngoplasty, and all (100%) had speech and language therapy. Poorer speech ratings were correlated with more years of speech and language therapy (Spearman's correlation = 0.418, P = 0.004; 95% confidence interval, 0.145-0.632). Only 34% had normal speech ratings. The groups with normal and different speech were not significantly different with respect to the demographic variables; a history of cleft palate, surgery, or speech and language therapy; and the intelligence quotient. Conclusions All adolescents and young adults with the 22q11.2 deletion syndrome had undergone speech and language therapy, and nearly half of them underwent pharyngoplasty. Only 34% attained normal speech ratings. Those with poorer speech ratings had speech and language therapy for more years. PMID:25276637

  20. Fifty years of progress in speech synthesis

    NASA Astrophysics Data System (ADS)

    Schroeter, Juergen

    2004-10-01

    A common opinion is that progress in speech synthesis should be easier to discern than in other areas of speech communication: you just have to listen to the speech! Unfortunately, things are more complicated. It can be said, however, that early speech synthesis efforts were primarily concerned with providing intelligible speech, while, more recently, ``naturalness'' has been the focus. The field had its ``electronic'' roots in Homer Dudley's 1939 ``Voder,'' and it advanced in the 1950s and 1960s through progress in a number of labs including JSRU in England, Haskins Labs in the U.S., and Fant's Lab in Sweden. In the 1970s and 1980s significant progress came from efforts at Bell Labs (under Jim Flanagan's leadership) and at MIT (where Dennis Klatt created one of the first commercially viable systems). Finally, over the past 15 years, the methods of unit-selection synthesis were devised, primarily at ATR in Japan, and were advanced by work at AT&T Labs, Univ. of Edinburgh, and ATR. Today, TTS systems are able to ``convince some of the listeners some of the time'' that synthetic speech is as natural as live recordings. Ongoing efforts aim at replacing ``some'' with ``most'' for a wide range of real-world applications.

  1. Multichannel spatial auditory display for speech communications

    NASA Technical Reports Server (NTRS)

    Begault, D. R.; Erbe, T.; Wenzel, E. M. (Principal Investigator)

    1994-01-01

    A spatial auditory display for multiple speech communications was developed at NASA/Ames Research Center. Input is spatialized by the use of simplified head-related transfer functions, adapted for FIR filtering on Motorola 56001 digital signal processors. Hardware and firmware design implementations are overviewed for the initial prototype developed for NASA-Kennedy Space Center. An adaptive staircase method was used to determine intelligibility levels of four-letter call signs used by launch personnel at NASA against diotic speech babble. Spatial positions at 30 degrees azimuth increments were evaluated. The results from eight subjects showed a maximum intelligibility improvement of about 6-7 dB when the signal was spatialized to 60 or 90 degrees azimuth positions.
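
    For readers unfamiliar with the adaptive staircase procedure used to estimate intelligibility levels, here is a minimal Python sketch. The 1-up/1-down rule, step size, and simulated listener are generic assumptions for illustration, not the study's exact protocol.

        import math
        import random

        def simulated_listener(snr_db, srt_db=-6.0, slope=1.0):
            """Probability of identifying a call sign correctly at a given SNR."""
            p = 1.0 / (1.0 + math.exp(-slope * (snr_db - srt_db)))
            return random.random() < p

        def run_staircase(start_snr=10.0, step_db=2.0, max_reversals=12):
            snr, last_correct, reversals = start_snr, None, []
            while len(reversals) < max_reversals:
                correct = simulated_listener(snr)
                if last_correct is not None and correct != last_correct:
                    reversals.append(snr)                 # direction change
                snr += -step_db if correct else step_db   # harder if right
                last_correct = correct
            return sum(reversals[4:]) / len(reversals[4:])  # late reversals

        print(f"Estimated threshold: {run_staircase():.1f} dB SNR")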

  2. Acoustic assessment of speech privacy curtains in two nursing units.

    PubMed

    Pope, Diana S; Miller-Klein, Erik T

    2016-01-01

    Hospitals have complex soundscapes that create challenges to patient care. Extraneous noise and high reverberation rates impair speech intelligibility, which leads to raised voices. In an unintended spiral, the increasing noise may result in diminished speech privacy, as people speak loudly to be heard over the din. The products available to improve hospital soundscapes include construction materials that absorb sound (acoustic ceiling tiles, carpet, wall insulation) and reduce reverberation rates. Enhanced privacy curtains are now available and offer potential for a relatively simple way to improve speech privacy and speech intelligibility by absorbing sound at the hospital patient's bedside. Acoustic assessments were performed over 2 days on two nursing units with a similar design in the same hospital. One unit was built with the 1970s' standard hospital construction and the other was newly refurbished (2013) with sound-absorbing features. In addition, we determined the effect of an enhanced privacy curtain versus standard privacy curtains using acoustic measures of speech privacy and speech intelligibility indexes. Privacy curtains provided auditory protection for the patients. In general, that protection was increased by the use of enhanced privacy curtains. On average, the enhanced curtain improved sound absorption from 20% to 30%; however, there was considerable variability, depending on the configuration of the rooms tested. Enhanced privacy curtains provide measurable improvement to the acoustics of patient rooms but cannot overcome larger acoustic design issues. To shorten reverberation time, additional absorption and more compact, fragmented nursing unit floor plates should be considered. PMID:26780959

  4. Phonetic Intelligibility Testing in Adults with Down Syndrome

    PubMed Central

    Bunton, Kate; Leddy, Mark; Miller, Jon

    2009-01-01

    The purpose of the study was to document speech intelligibility deficits for a group of five adult males with Down syndrome, and use listener based error profiles to identify phonetic dimensions underlying reduced intelligibility. Phonetic error profiles were constructed for each speaker using the Kent, Weismer, Kent, and Rosenbek (1989) word intelligibility test. The test was designed to allow for identification of reasons for the intelligibility deficit, quantitative analyses at varied levels, and sensitivity to potential speech deficits across populations. Listener generated profiles were calculated based on a multiple-choice task and a transcription task. The most disrupted phonetic features, across listening task, involved simplification of clusters in both the word initial and word final position, and contrasts involving tongue-posture, control, and timing (e.g., high-low vowel, front-back vowel, and place of articulation for stops and fricatives). Differences between speakers in the ranking of these phonetic features were found; however, the mean error proportion for the six most severely affected features correlated highly with the overall intelligibility score (0.88 based on the multiple-choice task, .94 for the transcription task). The phonetic feature analyses are an index that may help clarify the suspected motor speech basis for the speech intelligibility deficits seen in adults with Down syndrome and may lead to improved speech management in these individuals. PMID:17692179

  5. The Auditory-Brainstem Response to Continuous, Non-repetitive Speech Is Modulated by the Speech Envelope and Reflects Speech Processing

    PubMed Central

    Reichenbach, Chagit S.; Braiman, Chananel; Schiff, Nicholas D.; Hudspeth, A. J.; Reichenbach, Tobias

    2016-01-01

    The auditory-brainstem response (ABR) to short and simple acoustical signals is an important clinical tool used to diagnose the integrity of the brainstem. The ABR is also employed to investigate the auditory brainstem in a multitude of tasks related to hearing, such as processing speech or selectively focusing on one speaker in a noisy environment. Such research measures the response of the brainstem to short speech signals such as vowels or words. Because the voltage signal of the ABR has a tiny amplitude, several hundred to a thousand repetitions of the acoustic signal are needed to obtain a reliable response. The large number of repetitions poses a challenge to assessing cognitive functions due to neural adaptation. Here we show that continuous, non-repetitive speech, lasting several minutes, may be employed to measure the ABR. Because the speech is not repeated during the experiment, the precise temporal form of the ABR cannot be determined. We show, however, that important structural features of the ABR can nevertheless be inferred. In particular, the brainstem responds at the fundamental frequency of the speech signal, and this response is modulated by the envelope of the voiced parts of speech. We accordingly introduce a novel measure that assesses the ABR as modulated by the speech envelope, at the fundamental frequency of speech and at the characteristic latency of the response. This measure has a high signal-to-noise ratio and can hence be employed effectively to measure the ABR to continuous speech. We use this novel measure to show that the ABR is weaker to intelligible speech than to unintelligible, time-reversed speech. The methods presented here can be employed for further research on speech processing in the auditory brainstem and can lead to the development of future clinical diagnosis of brainstem function. PMID:27303286
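
    Schematically, the measure described above amounts to correlating the EEG with a model waveform (the fundamental of the voiced speech, amplitude-modulated by the speech envelope) at a candidate brainstem latency. A rough Python sketch under those assumptions follows; the 9 ms latency and all signal details are placeholders.

        import numpy as np

        def abr_speech_measure(eeg, fundamental, envelope, fs, latency_ms=9.0):
            """Correlate EEG with the envelope-modulated F0 waveform at one lag."""
            model = envelope * fundamental     # F0 carrier scaled by envelope
            lag = int(latency_ms * fs / 1000)  # candidate response latency
            return np.corrcoef(model[:-lag], eeg[lag:])[0, 1]

        # Toy check: an "EEG" that is a delayed, noisy copy of the model.
        fs = 1000
        t = np.arange(10 * fs) / fs
        env = 1 + 0.5 * np.sin(2 * np.pi * 3 * t)   # slow speech-like envelope
        f0 = np.sin(2 * np.pi * 100 * t)            # 100 Hz fundamental
        eeg = np.roll(env * f0, 9) + np.random.randn(len(t))
        print(abr_speech_measure(eeg, f0, env, fs))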

  6. Rehabilitation of impaired speech function (dysarthria, dysglossia)

    PubMed Central

    Schröter-Morasch, Heidrun; Ziegler, Wolfram

    2005-01-01

    Speech disorders can result (1) from sensorimotor impairments of articulatory movements (dysarthria) or (2) from structural changes of the speech organs, in adults particularly after surgical and radiochemical treatment of tumors (dysglossia). The decrease of intelligibility, a reduced vocal stamina, the stigmatization of a conspicuous voice and manner of speech, and the reduction of emotional expressivity all mean greatly diminished quality of life, restricted career opportunities and diminished social contacts. Intensive therapy based on the pathophysiological facts is absolutely essential: functional exercise therapy plays a central role; according to symptoms and their progression it can be complemented with prosthetic and surgical approaches. In severe cases communication aids have to be used. All rehabilitation measures have to take account of frequently associated disorders of body motor control and/or impairment of cognition and behaviour. PMID:22073063

  7. Template based low data rate speech encoder

    NASA Astrophysics Data System (ADS)

    Fransen, Lawrence

    1993-09-01

    The 2400-b/s linear predictive coder (LPC) is currently being widely deployed to support tactical voice communication over narrowband channels. However, there is a need for lower-data-rate voice encoders for special applications: improved performance in high bit-error conditions, low-probability-of-intercept (LPI) voice communication, and narrowband integrated voice/data systems. An 800-b/s voice encoding algorithm is presented which is an extension of the 2400-b/s LPC. To construct template tables, speech samples of 420 speakers uttering 8 sentences each were excerpted from the Texas Instruments - Massachusetts Institute of Technology (TIMIT) Acoustic-Phonetic Speech Data Base. Speech intelligibility of the 800-b/s voice encoding algorithm measured by the diagnostic rhyme test (DRT) is 91.5 for three male speakers. This score compares favorably with the 2400-b/s LPC of a few years ago.
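
    As background, the analysis core shared by LPC coders such as the 2400-b/s system is the computation of short-term predictor coefficients for each frame. A minimal Python sketch using the autocorrelation method and Levinson-Durbin recursion; frame handling, quantization, and the 800-b/s template stage are omitted.

        import numpy as np

        def lpc(frame, order=10):
            """LPC coefficients a[0..order] (a[0] = 1) for one windowed frame."""
            r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
            a = np.zeros(order + 1)
            a[0] = 1.0
            err = r[0]
            for i in range(1, order + 1):
                acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
                k = -acc / err                   # reflection coefficient
                a[1:i] += k * a[i - 1:0:-1]      # symmetric coefficient update
                a[i] = k
                err *= 1.0 - k * k               # residual energy shrinks
            return a, err

        # Example: one 25 ms frame of a synthetic voiced-like 120 Hz tone.
        fs = 8000
        t = np.arange(200) / fs
        frame = np.sin(2 * np.pi * 120 * t) * np.hamming(200)
        coeffs, residual = lpc(frame)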

  8. On Training Targets for Supervised Speech Separation

    PubMed Central

    Wang, Yuxuan; Narayanan, Arun; Wang, DeLiang

    2014-01-01

    Formulation of speech separation as a supervised learning problem has shown considerable promise. In its simplest form, a supervised learning algorithm, typically a deep neural network, is trained to learn a mapping from noisy features to a time-frequency representation of the target of interest. Traditionally, the ideal binary mask (IBM) is used as the target because of its simplicity and large speech intelligibility gains. The supervised learning framework, however, is not restricted to the use of binary targets. In this study, we evaluate and compare separation results by using different training targets, including the IBM, the target binary mask, the ideal ratio mask (IRM), the short-time Fourier transform spectral magnitude and its corresponding mask (FFT-MASK), and the Gammatone frequency power spectrum. Our results in various test conditions reveal that the two ratio mask targets, the IRM and the FFT-MASK, outperform the other targets in terms of objective intelligibility and quality metrics. In addition, we find that masking based targets, in general, are significantly better than spectral envelope based targets. We also present comparisons with recent methods in non-negative matrix factorization and speech enhancement, which show clear performance advantages of supervised speech separation. PMID:25599083
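
    Two of the compared targets are easy to state concretely. The sketch below computes the ideal binary mask (IBM) and ideal ratio mask (IRM) from separate clean-speech and noise signals; the STFT settings, 0 dB local criterion, and square-root form of the IRM are common conventions assumed here, not parameters taken from the paper.

        import numpy as np
        from scipy.signal import stft

        def ideal_masks(clean, noise, fs=16000, lc_db=0.0):
            """IBM and IRM from time-aligned clean and noise signals."""
            _, _, S = stft(clean, fs=fs, nperseg=512)   # clean STFT
            _, _, N = stft(noise, fs=fs, nperseg=512)   # noise STFT
            snr = np.abs(S) ** 2 / (np.abs(N) ** 2 + 1e-12)
            ibm = (10 * np.log10(snr + 1e-12) > lc_db).astype(float)
            irm = (snr / (snr + 1.0)) ** 0.5            # soft ratio mask
            return ibm, irm

        rng = np.random.default_rng(0)
        clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
        noise = rng.standard_normal(16000)
        ibm, irm = ideal_masks(clean, noise)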

  9. Teaching Music as an Aid for Speech Training of Hearing Impaired Students.

    ERIC Educational Resources Information Center

    Shaw, Jeanne

    1989-01-01

    The article reviews existing theories and programs for teaching music to hearing-impaired students. Recent empirical evidence indicates that an auditory-based music program can increase speech intelligibility through improvement of the suprasegmental aspects of speech. (Author/DB)

  10. Visemic Processing in Audiovisual Discrimination of Natural Speech: A Simultaneous fMRI-EEG Study

    ERIC Educational Resources Information Center

    Dubois, Cyril; Otzenberger, Helene; Gounot, Daniel; Sock, Rudolph; Metz-Lutz, Marie-Noelle

    2012-01-01

    In a noisy environment, visual perception of articulatory movements improves natural speech intelligibility. Parallel to phonemic processing based on auditory signal, visemic processing constitutes a counterpart based on "visemes", the distinctive visual units of speech. Aiming at investigating the neural substrates of visemic processing in a…

  11. Acoustic Analysis of Clear Versus Conversational Speech in Individuals with Parkinson Disease

    ERIC Educational Resources Information Center

    Goberman, A.M.; Elmer, L.W.

    2005-01-01

    A number of studies have been devoted to the examination of clear versus conversational speech in non-impaired speakers. The purpose of these previous studies has been primarily to help increase speech intelligibility for the benefit of hearing-impaired listeners. The goal of the present study was to examine differences between conversational and…

  12. Effects of Alphabet-Supplemented Speech on Brain Activity of Listeners: An fMRI Study

    ERIC Educational Resources Information Center

    Fercho, Kelene; Baugh, Lee A.; Hanson, Elizabeth K.

    2015-01-01

    Purpose: The purpose of this article was to examine the neural mechanisms associated with increases in speech intelligibility brought about through alphabet supplementation. Method: Neurotypical participants listened to dysarthric speech while watching an accompanying video of a hand pointing to the 1st letter spoken of each word on an alphabet…

  13. Relationship between Speech, Oromotor, Language and Cognitive Abilities in Children with Down's Syndrome

    ERIC Educational Resources Information Center

    Cleland, Joanne; Wood, Sara; Hardcastle, William; Wishart, Jennifer; Timmins, Claire

    2010-01-01

    Background: Children and young people with Down's syndrome present with deficits in expressive speech and language, accompanied by strengths in vocabulary comprehension compared with non-verbal mental age. Intelligibility is particularly low, but whether speech is delayed or disordered is a controversial topic. Most studies suggest a delay, but no…

  14. Effects of Neurosurgical Management of Parkinson's Disease on Speech Characteristics and Oromotor Function.

    ERIC Educational Resources Information Center

    Farrell, Anna; Theodoros, Deborah; Ward, Elizabeth; Hall, Bruce; Silburn, Peter

    2005-01-01

    The present study examined the effects of neurosurgical management of Parkinson's disease (PD), including the procedures of pallidotomy, thalamotomy, and deep-brain stimulation (DBS) on perceptual speech characteristics, speech intelligibility, and oromotor function in a group of 22 participants with PD. The surgical participant group was compared…

  15. Speech and Language Deficits in Early-Treated Children with Galactosemia.

    ERIC Educational Resources Information Center

    Waisbren, Susan E.; And Others

    1983-01-01

    Intelligence and speech-language development of eight children (3.6 to 11.6 years old) with classic galactosemia were assessed by standardized tests. Each of the children had early speech difficulties or delays, and all but one had language disorders in at least one area.

  16. Modelling Errors in Automatic Speech Recognition for Dysarthric Speakers

    NASA Astrophysics Data System (ADS)

    Caballero Morales, Santiago Omar; Cox, Stephen J.

    2009-12-01

    Dysarthria is a motor speech disorder characterized by weakness, paralysis, or poor coordination of the muscles responsible for speech. Although automatic speech recognition (ASR) systems have been developed for disordered speech, factors such as low intelligibility and limited phonemic repertoire decrease speech recognition accuracy, making conventional speaker adaptation algorithms perform poorly on dysarthric speakers. In this work, rather than adapting the acoustic models, we model the errors made by the speaker and attempt to correct them. For this task, two techniques have been developed: (1) a set of "metamodels" that incorporate a model of the speaker's phonetic confusion matrix into the ASR process; (2) a cascade of weighted finite-state transducers at the confusion matrix, word, and language levels. Both techniques attempt to correct the errors made at the phonetic level and make use of a language model to find the best estimate of the correct word sequence. Our experiments show that both techniques outperform standard adaptation techniques.
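
    A toy Python sketch of the idea behind confusion-matrix error correction: score each lexicon word by the probability that its intended phonemes were confused into the observed, decoded ones. The phoneme set, lexicon, and probabilities are invented for illustration; the paper's metamodels and WFST cascade additionally handle insertions, deletions, and a full language model.

        # P(observed | intended) for a toy phoneme set; rows sum to 1.
        CONFUSION = {
            "t": {"t": 0.6, "d": 0.3, "g": 0.1},
            "d": {"d": 0.5, "t": 0.4, "g": 0.1},
            "g": {"g": 0.7, "d": 0.2, "t": 0.1},
            "ow": {"ow": 1.0},
        }
        LEXICON = {"toe": ("t", "ow"), "doe": ("d", "ow"), "go": ("g", "ow")}

        def correct(observed, word_prior):
            """Pick the word whose phonemes best explain the observed string."""
            best, best_p = None, 0.0
            for word, phones in LEXICON.items():
                if len(phones) != len(observed):
                    continue                    # toy model: substitutions only
                p = word_prior.get(word, 0.0)   # stand-in for a language model
                for intended, obs in zip(phones, observed):
                    p *= CONFUSION[intended].get(obs, 0.0)
                if p > best_p:
                    best, best_p = word, p
            return best, best_p

        # A speaker who tends to devoice /d/: "t ow" is corrected to "doe".
        print(correct(("t", "ow"), {"toe": 0.2, "doe": 0.7, "go": 0.1}))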

  17. Can older people remember medication reminders presented using synthetic speech?

    PubMed Central

    Wolters, Maria K; Johnson, Christine; Campbell, Pauline E; DePlacido, Christine G; McKinstry, Brian

    2015-01-01

    Reminders are often part of interventions to help older people adhere to complicated medication regimes. Computer-generated (synthetic) speech is ideal for tailoring reminders to different medication regimes. Since synthetic speech may be less intelligible than human speech, in particular under difficult listening conditions, we assessed how well older people can recall synthetic speech reminders for medications. 44 participants aged 50–80 with no cognitive impairment recalled reminders for one or four medications after a short distraction. We varied background noise, speech quality, and message design. Reminders were presented using a human voice and two synthetic voices. Data were analyzed using generalized linear mixed models. Reminder recall was satisfactory if reminders were restricted to one familiar medication, regardless of the voice used. Repeating medication names supported recall of lists of medications. We conclude that spoken reminders should build on familiar information and be integrated with other adherence support measures. PMID:25080534

  18. Auto Spell Suggestion for High Quality Speech Synthesis in Hindi

    NASA Astrophysics Data System (ADS)

    Kabra, Shikha; Agarwal, Ritika

    2014-02-01

    The goal of Text-to-Speech (TTS) synthesis in a particular language is to convert arbitrary input text to intelligible and natural-sounding speech. For a language like Hindi, however, where many words have very close spellings, identifying errors in the input text is not an easy task, and incorrect text degrades the quality of the output speech. This paper therefore contributes to the development of high-quality speech synthesis by incorporating a spellchecker that automatically generates spelling suggestions for misspelled words. Involving a spellchecker increases the effectiveness of speech synthesis by providing suggestions for incorrect input text. Furthermore, we provide a comparative study evaluating the resulting effect on the phonetic text of adding the spellchecker to the input text.
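
    One simple way to realize the spell-suggestion step is nearest-neighbor lookup under edit distance, as in the toy Python sketch below. The lexicon here is an invented English placeholder; the actual system targets Hindi orthography and is presumably more elaborate.

        def edit_distance(a, b):
            """Levenshtein distance via the standard dynamic program."""
            prev = list(range(len(b) + 1))
            for i, ca in enumerate(a, 1):
                cur = [i]
                for j, cb in enumerate(b, 1):
                    cur.append(min(prev[j] + 1,                # deletion
                                   cur[-1] + 1,                # insertion
                                   prev[j - 1] + (ca != cb)))  # substitution
                prev = cur
            return prev[-1]

        def suggest(word, lexicon):
            return min(lexicon, key=lambda w: edit_distance(word, w))

        print(suggest("speach", ["speech", "spear", "search", "teach"]))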

  20. Plant intelligence

    NASA Astrophysics Data System (ADS)

    Trewavas, Anthony

    2005-09-01

    Intelligent behavior is a complex adaptive phenomenon that has evolved to enable organisms to deal with variable environmental circumstances. Maximizing fitness requires skill in foraging for necessary resources (food) in competitive circumstances and is probably the activity in which intelligent behavior is most easily seen. Biologists suggest that intelligence encompasses the characteristics of detailed sensory perception, information processing, learning, memory, choice, optimisation of resource sequestration with minimal outlay, self-recognition, and foresight by predictive modeling. All these properties are concerned with a capacity for problem solving in recurrent and novel situations. Here I review the evidence that individual plant species exhibit all of these intelligent behavioral capabilities but do so through phenotypic plasticity, not movement. Furthermore it is in the competitive foraging for resources that most of these intelligent attributes have been detected. Plants should therefore be regarded as prototypical intelligent organisms, a concept that has considerable consequences for investigations of whole plant communication, computation and signal transduction.

  1. Delayed Speech or Language Development

    MedlinePlus

    ... Delayed Speech or Language Development ... your child is right on schedule. Normal Speech & Language Development: It's important to discuss early speech and language development ...

  2. The association between intelligence and lifespan is mostly genetic

    PubMed Central

    Arden, Rosalind; Deary, Ian J; Reynolds, Chandra A; Pedersen, Nancy L; Plassman, Brenda L; McGue, Matt; Christensen, Kaare; Visscher, Peter M

    2016-01-01

    Background: Several studies in the new field of cognitive epidemiology have shown that higher intelligence predicts longer lifespan. This positive correlation might arise from socioeconomic status influencing both intelligence and health; intelligence leading to better health behaviours; and/or some shared genetic factors influencing both intelligence and health. Distinguishing among these hypotheses is crucial for medicine and public health, but can only be accomplished by studying a genetically informative sample. Methods: We analysed data from three genetically informative samples containing information on intelligence and mortality: Sample 1, 377 pairs of male veterans from the NAS-NRC US World War II Twin Registry; Sample 2, 246 pairs of twins from the Swedish Twin Registry; and Sample 3, 784 pairs of twins from the Danish Twin Registry. The age at which intelligence was measured differed between the samples. We used three methods of genetic analysis to examine the relationship between intelligence and lifespan: we calculated the proportion of the more intelligent twins who outlived their co-twin; we regressed within-twin-pair lifespan differences on within-twin-pair intelligence differences; and we used the resulting regression coefficients to model the additive genetic covariance. We conducted a meta-analysis of the regression coefficients across the three samples. Results: The combined sample (and all three individual samples) showed a small positive phenotypic correlation between intelligence and lifespan. In the combined sample the observed r = .12 (95% confidence interval .06 to .18). The additive genetic covariance model supported a genetic relationship between intelligence and lifespan. In the combined sample the genetic contribution to the covariance was 95%; in the US study, 84%; in the Swedish study, 86%; and in the Danish study, 85%. Conclusions: The finding of common genetic effects between lifespan and intelligence has important implications for public…
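
    The within-twin-pair regression used above has a compact form: regressing co-twin differences in lifespan on co-twin differences in intelligence removes every influence shared within a pair. A minimal Python sketch on simulated placeholder data (the effect size is invented, not a study estimate):

        import numpy as np

        rng = np.random.default_rng(7)
        n_pairs = 500
        iq_diff = rng.standard_normal(n_pairs)   # twin1 - twin2 intelligence (z)
        life_diff = 0.12 * iq_diff + rng.standard_normal(n_pairs)  # toy effect

        slope, intercept = np.polyfit(iq_diff, life_diff, 1)
        print(f"within-pair slope: {slope:.3f}")  # >0: brighter co-twin outlives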

  3. Multi-time resolution analysis of speech: evidence from psychophysics

    PubMed Central

    Chait, Maria; Greenberg, Steven; Arai, Takayuki; Simon, Jonathan Z.; Poeppel, David

    2015-01-01

    How speech signals are analyzed and represented remains a foundational challenge both for cognitive science and neuroscience. A growing body of research, employing various behavioral and neurobiological experimental techniques, now points to the perceptual relevance of both phoneme-sized (10-40 Hz modulation frequency) and syllable-sized (2-10 Hz modulation frequency) units in speech processing. However, it is not clear how information associated with such different time scales interacts in a manner relevant for speech perception. We report behavioral experiments on speech intelligibility employing a stimulus that allows us to investigate how distinct temporal modulations in speech are treated separately and whether they are combined. We created sentences in which the slow (~4 Hz; S_low) and rapid (~33 Hz; S_high) modulations (corresponding to ~250 and ~30 ms, the average duration of syllables and certain phonetic properties, respectively) were selectively extracted. Although S_low and S_high have low intelligibility when presented separately, dichotic presentation of S_high with S_low results in supra-additive performance, suggesting a synergistic relationship between low- and high-modulation frequencies. A second experiment desynchronized presentation of the S_low and S_high signals. Desynchronizing signals relative to one another had no impact on intelligibility when delays were less than ~45 ms. Longer delays resulted in a steep intelligibility decline, providing further evidence of integration or binding of information within restricted temporal windows. Our data suggest that human speech perception uses multi-time resolution processing. Signals are concurrently analyzed on at least two separate time scales, the intermediate representations of these analyses are integrated, and the resulting bound percept has significant consequences for speech intelligibility, a view compatible with recent insights from neuroscience implicating multi-timescale auditory…
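
    A crude way to see the two time scales involved is to split a speech envelope into syllable-rate and phoneme-rate modulation bands. A minimal Python sketch assuming a broadband Hilbert envelope and Butterworth filters; the study's stimuli were built with a more elaborate multi-band procedure.

        import numpy as np
        from scipy.signal import butter, hilbert, sosfiltfilt

        def modulation_bands(x, fs):
            """Syllable-rate (<10 Hz) and phoneme-rate (10-40 Hz) envelopes."""
            env = np.abs(hilbert(x))    # broadband amplitude envelope
            slow = sosfiltfilt(butter(4, 10, "low", fs=fs, output="sos"), env)
            fast = sosfiltfilt(
                butter(4, [10, 40], "bandpass", fs=fs, output="sos"), env)
            return slow, fast

        fs = 16000
        t = np.arange(fs) / fs          # 1 s of toy "speech"
        x = np.sin(2 * np.pi * 150 * t) * (1 + 0.5 * np.sin(2 * np.pi * 4 * t))
        slow, fast = modulation_bands(x, fs)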

  4. Acceptance speech.

    PubMed

    Carpenter, M

    1994-01-01

    In Bangladesh, the assistant administrator of USAID gave an acceptance speech at an awards ceremony on the occasion of the 25th anniversary of oral rehydration solution (ORS). The ceremony celebrated the key role of the International Centre for Diarrhoeal Disease Research, Bangladesh (ICDDR,B) in the discovery of ORS. Its research activities over the last 25 years have brought ORS to every village in the world, preventing more than a million deaths each year. ORS is the most important medical advance of the 20th century. It is affordable and client-oriented, a true appropriate technology. USAID has provided more than US$ 40 million to ICDDR,B for diarrheal disease and measles research, urban and rural applied family planning and maternal and child health research, and vaccine development. ICDDR,B began as the relatively small Cholera Research Laboratory and has grown into an acclaimed international center for health, family planning, and population research. It leads the world in diarrheal disease research. ICDDR,B is the leading center for applied health research in South Asia. It trains public health specialists from around the world. The government of Bangladesh and the international donor community have actively joined in support of ICDDR,B. The government applies the results of ICDDR,B research to its programs to improve the health and well-being of Bangladeshis. ICDDR,B now also studies acute respiratory diseases and measles. Population and health comprise 1 of USAID's 4 strategic priorities, the others being economic growth, environment, and democracy, USAID promotes people's participation in these 4 areas and in the design and implementation of development projects. USAID is committed to the use and improvement of ORS and to complementary strategies that further reduce diarrhea-related deaths. Continued collaboration with a strong user perspective and integrated services will lead to sustainable development. PMID:12345470

  5. Speech disorders - children

    MedlinePlus

    ... deficiency; Voice disorders; Vocal disorders; Disfluency; Communication disorder - speech disorder ... The following tests can help diagnose speech disorders: Denver ... Peabody Picture Test Revised. A hearing test may also be done.

  6. Speech and Communication Disorders

    MedlinePlus

    ... speech. Causes include: hearing disorders and deafness; voice problems, such as dysphonia or those caused by cleft lip or palate; speech problems like stuttering; developmental disabilities; learning disorders; autism spectrum ...

  7. Speech disorders - children

    MedlinePlus

    ... person has problems creating or forming the speech sounds needed to communicate with others. Three common speech ... are disorders in which a person repeats a sound, word, or phrase. Stuttering may be the most ...

  8. Robust speech coding using microphone arrays

    NASA Astrophysics Data System (ADS)

    Li, Zhao

    1998-09-01

    To achieve robustness and efficiency for voice communication in noise, the noise suppression and bandwidth compression processes are combined to form a joint process using input from an array of microphones. An adaptive beamforming technique with a set of robust linear constraints and a single quadratic inequality constraint is used to preserve the desired signal and to cancel directional plus ambient noise in a small room environment. This robustly constrained array processor is found to be effective in limiting signal cancelation over a wide range of input SNRs (-10 dB to +10 dB). The resulting intelligibility gains (8-10 dB) provide significant improvement to subsequent CELP coding. In addition, the desired speech activity is detected by estimating Target-to-Jammer Ratios (TJR) using subband correlations between different microphone inputs or using signals within the Generalized Sidelobe Canceler directly. These two novel techniques of speech activity detection for coding are studied thoroughly in this dissertation. Each is subsequently incorporated with the adaptive array and a 4.8 kbps CELP coder to form a Variable Bit Rate (VBR) coder with noise canceling and Spatial Voice Activity Detection (SVAD) capabilities. This joint noise suppression and bandwidth compression system demonstrates large improvements in desired speech quality after coding, accurate desired speech activity detection in various types of interference, and a reduction in the information bits required to code the speech.
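
    As a far simpler stand-in for the robustly constrained adaptive processor described above, a delay-and-sum beamformer illustrates the basic steering operation on a microphone array. The array spacing, steering angle, and uniform linear geometry below are example assumptions.

        import numpy as np

        def delay_and_sum(mics, fs, spacing=0.05, angle_deg=60.0, c=343.0):
            """mics: (n_mics, n_samples) array; steer toward angle_deg."""
            n_mics, n = mics.shape
            delays = np.arange(n_mics) * spacing * np.cos(
                np.radians(angle_deg)) / c
            freqs = np.fft.rfftfreq(n, 1 / fs)
            out = np.zeros(n)
            for ch, d in zip(mics, delays):   # align each channel in phase
                out += np.fft.irfft(
                    np.fft.rfft(ch) * np.exp(2j * np.pi * freqs * d), n)
            return out / n_mics

        out = delay_and_sum(np.random.default_rng(3).standard_normal((4, 1024)),
                            fs=16000)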

  9. Speech tests as measures of outcome.

    PubMed

    Gatehouse, S

    1998-01-01

    Speech tests comprise an important and integral part of any assessment of the effectiveness of intervention for hearing disability and handicap. Particularly when considering hearing aid services for adult listeners, careful consideration has to be given to the particular form and application of inferences drawn from speech identification procedures if erroneous conclusions are to be avoided. It is argued that four such components relate to the statistical properties and discriminatory leverage of speech identification procedures, the choice of presentation level and conditions in regard to the auditory environment experienced by hearing-impaired clients, the extent to which speech tests based on segmental intelligibility provide appropriate information in relationship to perceived disabilities and handicaps, and the ways in which speech identification procedures to evaluate the potential benefits of signal-processing schemes for hearing aids are dependent upon sufficient listening experiences. Data are drawn from the literature to illuminate these points in terms of application in clinical practice and clinical evaluation exercises, and also with regard to future research needs. PMID:10209778

  10. Effects of Listening Instructions and Severity of Cleft Palate Speech on Listeners. Final Report.

    ERIC Educational Resources Information Center

    Shames, George H.; And Others

    Mothers of cleft and noncleft palate children (C- and non C-mothers) listened to a reading by a cleft palate child of a passage containing specified combinations of nasality and intelligibility. Groups were either uninstructed or instructed to listen to the content or the manner of speech; they assessed the nasality and intelligibility of the…

  11. Listener Perception of Monopitch, Naturalness, and Intelligibility for Speakers With Parkinson's Disease

    PubMed Central

    Stepp, Cara E.

    2015-01-01

    Purpose Given the potential significance of speech naturalness to functional and social rehabilitation outcomes, the objective of this study was to examine the effect of listener perceptions of monopitch on speech naturalness and intelligibility in individuals with Parkinson's disease (PD). Method Two short utterances were extracted from monologue samples of 16 speakers with PD and 5 age-matched adults without PD. Sixteen listeners evaluated these stimuli for monopitch, speech naturalness and intelligibility using the visual sort and rate method. Results Naïve listeners can reliably judge monopitch, speech naturalness, and intelligibility with minimal familiarization. While monopitch and speech intelligibility were only moderately correlated, monopitch and speech naturalness were highly correlated. Conclusions A great deal of attention is currently being paid to improvement of vocal loudness and thus speech intelligibility in PD. Our findings suggest that prosodic characteristics such as monopitch should be explored as adjuncts to this treatment of dysarthria in PD. Development of such prosodic treatments may enhance speech naturalness and thus improve quality of life. PMID:26102242

  12. The effect of speech modification on non-native listeners for matrix-style sentences.

    PubMed

    Cooke, Martin; García Lecumberri, María Luisa; Tang, Yan

    2015-02-01

    Speech can be modified to promote intelligibility in noise, but the potential benefits for non-native listeners are difficult to predict due to the additional presence of distortion introduced by speech alteration. The current study compared native and non-native listeners' keyword scores for simple sentences, unmodified and with six forms of modification. Both groups showed similar patterns of intelligibility change across conditions, with the native cohort benefiting slightly more in stationary noise. This outcome suggests that the change in masked audibility rather than distortion is the dominant factor governing listeners' responses to speech modification. PMID:25698043

  13. Artificial Intelligence.

    ERIC Educational Resources Information Center

    Wash, Darrel Patrick

    1989-01-01

    Making a machine seem intelligent is not easy. As a consequence, demand has been rising for computer professionals skilled in artificial intelligence and is likely to continue to go up. These workers develop expert systems and solve the mysteries of machine vision, natural language processing, and neural networks. (Editor)

  14. Artificial Intelligence.

    ERIC Educational Resources Information Center

    Smith, Linda C.; And Others

    1988-01-01

    A series of articles focuses on artificial intelligence research and development to enhance information systems and services. Topics discussed include knowledge base designs, expert system development tools, natural language processing, expert systems for reference services, and the role that artificial intelligence concepts should have in…

  15. Informational masking of speech produced by speech-like sounds without linguistic content.

    PubMed

    Chen, Jing; Li, Huahui; Li, Liang; Wu, Xihong; Moore, Brian C J

    2012-04-01

    This study investigated whether speech-like maskers without linguistic content produce informational masking of speech. The target stimuli were nonsense Chinese Mandarin sentences. In experiment I, the masker contained harmonics whose fundamental frequency (F0) was sinusoidally modulated and whose mean F0 was varied. The magnitude of informational masking was evaluated by measuring the change in intelligibility (releasing effect) produced by inducing a perceived spatial separation of the target speech and masker via the precedence effect. The releasing effect was small and was only clear when the target and masker had the same mean F0, suggesting that informational masking was small. Performance with the harmonic maskers was better than with a steady speech-shaped noise (SSN) masker. In experiments II and III, the maskers were speech-like synthesized signals, alternating between segments with harmonic structure and segments composed of SSN. Performance was much worse than for experiment I, and worse than when an SSN masker was used, suggesting that substantial informational masking occurred. The similarity of the F0 contours of the target and masker had little effect. The informational masking effect was not influenced by whether or not the noise-like segments of the masker were synchronous with the unvoiced segments of the target speech. PMID:22501069
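
    The experiment I masker can be sketched directly: a sum of harmonics whose common F0 is sinusoidally modulated around a mean value, as in the Python sketch below. The mean F0, modulation depth and rate, and number of harmonics are illustrative values, not the study's parameters.

        import numpy as np

        def modulated_harmonics(dur=1.0, fs=16000, f0_mean=200.0,
                                mod_depth=0.2, mod_rate=5.0, n_harm=20):
            t = np.arange(int(dur * fs)) / fs
            f0 = f0_mean * (1 + mod_depth * np.sin(2 * np.pi * mod_rate * t))
            phase = 2 * np.pi * np.cumsum(f0) / fs   # integrate instantaneous F0
            x = sum(np.sin(k * phase) for k in range(1, n_harm + 1))
            return x / np.max(np.abs(x))

        masker = modulated_harmonics()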

  16. Objective voice and speech analysis of persons with chronic hoarseness by prosodic analysis of speech samples.

    PubMed

    Haderlein, Tino; Döllinger, Michael; Matoušek, Václav; Nöth, Elmar

    2016-10-01

    Automatic voice assessment is often performed using sustained vowels. In contrast, speech analysis of read-out texts can be applied to voice and speech assessment. Automatic speech recognition and prosodic analysis were used to find regression formulae between automatic and perceptual assessment of four voice and four speech criteria. The regression was trained with 21 men and 62 women (average age 49.2 years) and tested with another set of 24 men and 49 women (48.3 years), all suffering from chronic hoarseness. They read the text 'Der Nordwind und die Sonne' ('The North Wind and the Sun'). Five voice and speech therapists evaluated the data on 5-point Likert scales. Ten prosodic and recognition accuracy measures (features) were identified which describe all the examined criteria. Inter-rater correlation within the expert group was between r = 0.63 for the criterion 'match of breath and sense units' and r = 0.87 for the overall voice quality. Human-machine correlation was between r = 0.40 for the match of breath and sense units and r = 0.82 for intelligibility. The perceptual ratings of different criteria were highly correlated with each other. Likewise, the feature sets modeling the criteria were very similar. The automatic method is suitable for assessing chronic hoarseness in general and for subgroups of functional and organic dysphonia. In its current version, it is almost as reliable as a randomly picked rater from a group of voice and speech therapists. PMID:26016644
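
    The human-machine mapping described above (fit a regression from automatically extracted features to expert ratings, then check the correlation on held-out speakers) can be sketched with ordinary least squares. Data below are synthetic placeholders, not study data; the sample sizes merely mirror those quoted.

        import numpy as np

        rng = np.random.default_rng(1)
        n_feat = 10                                  # prosodic/accuracy features
        w_true = rng.standard_normal(n_feat)         # pretend latent relation
        X_tr = rng.standard_normal((83, n_feat))     # 83 training speakers
        y_tr = X_tr @ w_true + 0.5 * rng.standard_normal(83)
        X_te = rng.standard_normal((73, n_feat))     # 73 test speakers
        y_te = X_te @ w_true + 0.5 * rng.standard_normal(73)

        w, *_ = np.linalg.lstsq(np.c_[X_tr, np.ones(83)], y_tr, rcond=None)
        pred = np.c_[X_te, np.ones(73)] @ w
        print(f"human-machine r = {np.corrcoef(pred, y_te)[0, 1]:.2f}")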

  17. Speech imagery recalibrates speech-perception boundaries.

    PubMed

    Scott, Mark

    2016-07-01

    The perceptual boundaries between speech sounds are malleable and can shift after repeated exposure to contextual information. This shift is known as recalibration. To date, the known inducers of recalibration are lexical (including phonotactic) information, lip-read information and reading. The experiments reported here are a proof-of-effect demonstration that speech imagery can also induce recalibration. PMID:27068050

  18. Speech and Language Delay

    MedlinePlus

    Speech and Language Delay: Overview. How do I know if my child has speech delay? Every child develops at his or her ... of the same age, the problem may be speech delay. Your doctor may think your child has ...

  19. Talking Speech Input.

    ERIC Educational Resources Information Center

    Berliss-Vincent, Jane; Whitford, Gigi

    2002-01-01

    This article presents both the factors involved in successful speech input use and the potential barriers that may suggest that other access technologies could be more appropriate for a given individual. Speech input options that are available are reviewed and strategies for optimizing use of speech recognition technology are discussed. (Contains…

  20. Speech 7 through 12.

    ERIC Educational Resources Information Center

    Nederland Independent School District, TX.

    GRADES OR AGES: Grades 7 through 12. SUBJECT MATTER: Speech. ORGANIZATION AND PHYSICAL APPEARANCE: Following the foreword, philosophy and objectives, this guide presents a speech curriculum. The curriculum covers junior high and Speech I, II, III (senior high). Thirteen units of study are presented for junior high; each unit is divided into…

  1. The Tao of Speech.

    ERIC Educational Resources Information Center

    Dance, Frank E. X.

    1981-01-01

    Argues that the study of speech may present the characteristics of a "tao"--a path leading to an increase in humane being. Calls for speech teachers to profess the primacy of speech: "...the source of life of the human mind, the source of the compassion of the human spirit." (PD)

  2. Free Speech Yearbook 1978.

    ERIC Educational Resources Information Center

    Phifer, Gregg, Ed.

    The 17 articles in this collection deal with theoretical and practical freedom of speech issues. The topics include: freedom of speech in Marquette Park, Illinois; Nazis in Skokie, Illinois; freedom of expression in the Confederate States of America; Robert M. LaFollette's arguments for free speech and the rights of Congress; the United States…

  3. Acoustic properties of naturally produced clear speech at normal speaking rates

    NASA Astrophysics Data System (ADS)

    Krause, Jean C.; Braida, Louis D.

    2004-01-01

    Sentences spoken ``clearly'' are significantly more intelligible than those spoken ``conversationally'' for hearing-impaired listeners in a variety of backgrounds [Picheny et al., J. Speech Hear. Res. 28, 96-103 (1985); Uchanski et al., ibid. 39, 494-509 (1996); Payton et al., J. Acoust. Soc. Am. 95, 1581-1592 (1994)]. While producing clear speech, however, talkers often reduce their speaking rate significantly [Picheny et al., J. Speech Hear. Res. 29, 434-446 (1986); Uchanski et al., ibid. 39, 494-509 (1996)]. Yet speaking slowly is not solely responsible for the intelligibility benefit of clear speech (over conversational speech), since a recent study [Krause and Braida, J. Acoust. Soc. Am. 112, 2165-2172 (2002)] showed that talkers can produce clear speech at normal rates with training. This finding suggests that clear speech has inherent acoustic properties, independent of rate, that contribute to improved intelligibility. Identifying these acoustic properties could lead to improved signal processing schemes for hearing aids. To gain insight into these acoustical properties, conversational and clear speech produced at normal speaking rates were analyzed at three levels of detail (global, phonological, and phonetic). Although results suggest that talkers may have employed different strategies to achieve clear speech at normal rates, two global-level properties were identified that appear likely to be linked to the improvements in intelligibility provided by clear/normal speech: increased energy in the 1000-3000-Hz range of long-term spectra and increased modulation depth of low frequency modulations of the intensity envelope. Other phonological and phonetic differences associated with clear/normal speech include changes in (1) frequency of stop burst releases, (2) VOT of word-initial voiceless stop consonants, and (3) short-term vowel spectra.
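
    The two global-level properties identified here suggest simple acoustic measurements: relative energy in the 1000-3000 Hz region of the long-term spectrum, and the depth of low-frequency modulations of the intensity envelope. A minimal Python sketch; the analysis parameters are assumptions, not those of the study.

        import numpy as np
        from scipy.signal import butter, sosfiltfilt, welch

        def clear_speech_measures(x, fs):
            f, psd = welch(x, fs, nperseg=1024)       # long-term spectrum
            band = (f >= 1000) & (f <= 3000)
            band_energy_db = 10 * np.log10(psd[band].sum() / psd.sum())
            env = sosfiltfilt(
                butter(4, 10, "low", fs=fs, output="sos"), np.abs(x))
            mod_depth = (env.max() - env.min()) / (env.max() + env.min())
            return band_energy_db, mod_depth

        fs = 16000
        t = np.arange(fs) / fs
        x = np.sin(2 * np.pi * 1500 * t) * (1 + 0.4 * np.sin(2 * np.pi * 3 * t))
        print(clear_speech_measures(x, fs))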

  4. COMPREHENSION OF COMPRESSED SPEECH BY ELEMENTARY SCHOOL CHILDREN.

    ERIC Educational Resources Information Center

    WOOD, C. DAVID

    The effects of four variables on the extent of comprehension of compressed speech by elementary school children were investigated. These variables were rate of presentation, grade level in school, intelligence, and amount of practice. Ninety subjects participated in the experiment. The task for each subject was to listen individually to 50 tape…

  5. A CURRICULUM GUIDE IN SPEECH FOR THE SECONDARY SCHOOLS.

    ERIC Educational Resources Information Center

    FLETCHER, JUANITA D.

    The speech improvement course is concerned primarily with the development of correct habits of oral communication in secondary school students. Such a course should be concerned with improvement in three main areas: projection, agreeable quality, and intelligibility. The most important unit of the program, perhaps, is its launching. Students should…

  6. Commercial applications of speech interface technology: an industry at the threshold.

    PubMed

    Oberteuffer, J A

    1995-10-24

    Speech interface technology, which includes automatic speech recognition, synthetic speech, and natural language processing, is beginning to have a significant impact on business and personal computer use. Today, powerful and inexpensive microprocessors and improved algorithms are driving commercial applications in computer command, consumer, data entry, speech-to-text, telephone, and voice verification. Robust speaker-independent recognition systems for command and navigation in personal computers are now available; telephone-based transaction and database inquiry systems using both speech synthesis and recognition are coming into use. Large-vocabulary speech interface systems for document creation and read-aloud proofing are expanding beyond niche markets. Today's applications represent a small preview of a rich future for speech interface technology that will eventually replace keyboards with microphones and loudspeakers to give easy accessibility to increasingly intelligent machines. PMID:7479717

  7. Bridging the Gap Between Speech and Language: Using Multimodal Treatment in a Child With Apraxia.

    PubMed

    Tierney, Cheryl D; Pitterle, Kathleen; Kurtz, Marie; Nakhla, Mark; Todorow, Carlyn

    2016-09-01

    Childhood apraxia of speech is a neurologic speech sound disorder in which children have difficulty constructing words and sounds due to poor motor planning and coordination of the articulators required for speech sound production. We report the case of a 3-year-old boy strongly suspected to have childhood apraxia of speech at 18 months of age who used multimodal communication to facilitate language development throughout his work with a speech language pathologist. In 18 months of an intensive structured program, he exhibited atypical rapid improvement, progressing from having no intelligible speech to achieving age-appropriate articulation. We suspect that early introduction of sign language by the family proved to be a highly effective form of language development that, when coupled with intensive oro-motor and speech sound therapy, resulted in rapid resolution of symptoms. PMID:27492818

  8. Simultaneous natural speech and AAC interventions for children with childhood apraxia of speech: lessons from a speech-language pathologist focus group.

    PubMed

    Oommen, Elizabeth R; McCarthy, John W

    2015-03-01

    In childhood apraxia of speech (CAS), children exhibit varying levels of speech intelligibility depending on the nature of errors in articulation and prosody. Augmentative and alternative communication (AAC) strategies are beneficial, and commonly adopted with children with CAS. This study focused on the decision-making process and strategies adopted by speech-language pathologists (SLPs) when simultaneously implementing interventions that focused on natural speech and AAC. Eight SLPs, with significant clinical experience in CAS and AAC interventions, participated in an online focus group. Thematic analysis revealed eight themes: key decision-making factors; treatment history and rationale; benefits; challenges; therapy strategies and activities; collaboration with team members; recommendations; and other comments. Results are discussed along with clinical implications and directions for future research. PMID:25664542

  9. An Intelligibility Assessment of Toddlers with Cleft Lip and Palate Who Received and Did Not Receive Presurgical Infant Orthopedic Treatment.

    ERIC Educational Resources Information Center

    Konst, Emmy M.; Weersink-Braks, Hanny; Rietveld, Toni; Peters, Herman

    2000-01-01

    The influence of presurgical infant orthopedic treatment (PIO) on speech intelligibility was evaluated with 10 toddlers who used PIO during the first year of life and 10 who did not. Treated children were rated as exhibiting greater intelligibility; however, transcription data indicated that there were no group differences in actual intelligibility.…

  10. The Binaural Masking-Level Difference of Mandarin Tone Detection and the Binaural Intelligibility-Level Difference of Mandarin Tone Recognition in the Presence of Speech-Spectrum Noise

    PubMed Central

    Ho, Cheng-Yu; Li, Pei-Chun; Chiang, Yuan-Chuan; Young, Shuenn-Tsong; Chu, Woei-Chyn

    2015-01-01

    Binaural hearing involves using information relating to the differences between the signals that arrive at the two ears, and it can make it easier to detect and recognize signals in a noisy environment. This phenomenon of binaural hearing is quantified in laboratory studies as the binaural masking-level difference (BMLD). Mandarin is one of the most commonly used languages, but there are no published values of the BMLD or the binaural intelligibility-level difference (BILD) for Mandarin tones. Therefore, this study investigated the BMLD and BILD of Mandarin tones. The BMLDs of Mandarin tone detection were measured based on the detection threshold differences for the four tones of the voiced vowels /i/ (i.e., /i1/, /i2/, /i3/, and /i4/) and /u/ (i.e., /u1/, /u2/, /u3/, and /u4/) in the presence of speech-spectrum noise when presented interaurally in phase (S0N0) and interaurally in antiphase (SπN0). The BILDs of Mandarin tone recognition in speech-spectrum noise were determined as the differences in the target-to-masker ratio (TMR) required for 50% correct tone recognitions between the S0N0 and SπN0 conditions. The detection thresholds for the four tones of /i/ and /u/ differed significantly (p<0.001) between the S0N0 and SπN0 conditions. The average detection thresholds of Mandarin tones were all lower in the SπN0 condition than in the S0N0 condition, and the BMLDs ranged from 7.3 to 11.5 dB. The TMR for 50% correct Mandarin tone recognitions differed significantly (p<0.001) between the S0N0 and SπN0 conditions, at –13.4 and –18.0 dB, respectively, with a mean BILD of 4.6 dB. The study showed that the thresholds of Mandarin tone detection and recognition in the presence of speech-spectrum noise are improved when phase inversion is applied to the target speech. The average BILDs of Mandarin tones are smaller than the average BMLDs of Mandarin tones. PMID:25835987
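
    As a worked check of the arithmetic reported above, the BILD is simply the difference between the 50%-correct target-to-masker ratios in the two interaural conditions:

```latex
% BILD as the difference between the 50%-correct TMRs in the two conditions
\mathrm{BILD} = \mathrm{TMR}_{S_0 N_0} - \mathrm{TMR}_{S_\pi N_0}
              = (-13.4\,\mathrm{dB}) - (-18.0\,\mathrm{dB}) = 4.6\,\mathrm{dB}
```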

  11. Aided and Unaided Speech Supplementation Strategies: Effect of Alphabet Cues and Iconic Hand Gestures on Dysarthric Speech

    ERIC Educational Resources Information Center

    Hustad, Katherine C.; Garcia, Jane Mertz

    2005-01-01

    Purpose: This study compared the influence of speaker-implemented iconic hand gestures and alphabet cues on speech intelligibility scores and strategy helpfulness ratings for 3 adults with cerebral palsy and dysarthria who differed from one another in their overall motor abilities. Method: A total of 144 listeners (48 per speaker) orthographically…

  12. A Multisensory Cortical Network for Understanding Speech in Noise

    PubMed Central

    Bishop, Christopher W.; Miller, Lee M.

    2010-01-01

    In noisy environments, listeners tend to hear a speaker’s voice yet struggle to understand what is said. The most effective way to improve intelligibility in such conditions is to watch the speaker’s mouth movements. Here we identify the neural networks that distinguish understanding from merely hearing speech, and determine how the brain applies visual information to improve intelligibility. Using functional magnetic resonance imaging, we show that understanding speech-in-noise is supported by a network of brain areas including the left superior parietal lobule, the motor/premotor cortex, and the left anterior superior temporal sulcus (STS), a likely apex of the acoustic processing hierarchy. Multisensory integration likely improves comprehension through improved communication between the left temporal–occipital boundary, the left medial-temporal lobe, and the left STS. This demonstrates how the brain uses information from multiple modalities to improve speech comprehension in naturalistic, acoustically adverse conditions. PMID:18823249

  13. Speech outcome following palatoplasty in primary school children: do lay peer observers agree with speech pathologists?

    PubMed

    Witt, P D; Berry, L A; Marsh, J L; Grames, L M; Pilgram, T K

    1996-11-01

    The aim of this study was twofold: (1) to test the ability of normal children to discriminate the speech of children with repaired cleft palate from the speech of unaffected peers and (2) to compare these naive assessments of speech acceptability with the sophisticated assessments of speech pathologists. The study group (subjects) was composed of 21 children of school age (aged 8 to 12 years) who had undergone palatoplasty at a single cleft center and 16 matched controls. The listening team (student raters) was composed of 20 children who were matched to the subjects for age, sex, and other variables. Randomized master audio-tape recordings of the children who had undergone palatoplasty were presented in blinded fashion and random order to student raters who were inexperienced in the evaluation of patients with speech dysfunction. The same sound recordings were evaluated by an experienced panel of extramural speech pathologists whose intrarater and interrater reliabilities were known; they were not direct care providers. Additionally, the master tape was presented in blinded fashion and random order to the velopharyngeal staff at the cleft center for intramural assessment. Comparison of these assessment methodologies forms the basis of this report. Naive raters were insensitive to speech differences in the control and cleft palate groups. Differences in the mean scores for the groups never approached statistical significance, and there was adequate power to discern a difference of 0.75 on a 7-point scale. Expert raters were sensitive to differences in resonance and intelligibility in the control and cleft palate groups but not to other aspects of speech. The expert raters recommended further evaluation of cleft palate patients more often than control patients. Speech pathologists discern differences that the laity does not. Consideration should be given to the utilization of untrained listeners to add real-life significance to clinical speech assessments. Peer…

  14. A Novel Method for Speech Acquisition and Enhancement by 94 GHz Millimeter-Wave Sensor

    PubMed Central

    Chen, Fuming; Li, Sheng; Li, Chuantao; Liu, Miao; Li, Zhao; Xue, Huijun; Jing, Xijing; Wang, Jianqi

    2015-01-01

    In order to improve the speech acquisition ability of a non-contact method, a 94 GHz millimeter wave (MMW) radar sensor was employed to detect speech signals. This novel non-contact speech acquisition method was shown to have high directional sensitivity, and to be immune to strong acoustical disturbance. However, MMW radar speech is often degraded by combined sources of noise, which mainly include harmonic, electrical circuit and channel noise. In this paper, an algorithm combining empirical mode decomposition (EMD) and mutual information entropy (MIE) was proposed for enhancing the perceptibility and intelligibility of radar speech. Firstly, the radar speech signal was adaptively decomposed into oscillatory components called intrinsic mode functions (IMFs) by EMD. Secondly, MIE was used to determine the number of reconstructive components, and then an adaptive threshold was employed to remove the noise from the radar speech. The experimental results show that human speech can be effectively acquired by a 94 GHz MMW radar sensor when the detection distance is 20 m. Moreover, the noise of the radar speech is greatly suppressed and the speech sounds become more pleasant to human listeners after being enhanced by the proposed algorithm, suggesting that this novel speech acquisition and enhancement method will provide a promising alternative for various applications associated with speech detection. PMID:26729126
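
    The enhancement pipeline described above (EMD decomposition, a mutual-information rule for choosing reconstructive components, then thresholding) can be sketched compactly. The code below is a simplified stand-in under assumptions: PyEMD is assumed for the decomposition, and the argmin-of-mutual-information split plus the universal soft threshold are generic substitutes for the paper's MIE criterion and adaptive threshold, not the original implementation.

```python
# Simplified EMD-based denoising in the spirit of the method described above,
# for a 1-D radar-speech signal `x` (numpy array).
import numpy as np
from PyEMD import EMD  # pip install EMD-signal (assumed dependency)

def mutual_info(a, b, bins=64):
    # Histogram estimate of the mutual information between two IMFs.
    h, _, _ = np.histogram2d(a, b, bins=bins)
    p = h / h.sum()
    px, py = p.sum(axis=1, keepdims=True), p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

def enhance(x):
    imfs = EMD().emd(x)  # 1) adaptive decomposition into IMFs
    # 2) Use the dip in mutual information between neighboring IMFs to split
    #    noise-dominated low-order IMFs from speech-dominated ones.
    mi = [mutual_info(imfs[i], imfs[i + 1]) for i in range(len(imfs) - 1)]
    k = int(np.argmin(mi)) + 1
    # 3) Soft-threshold the retained IMFs, then sum them to rebuild speech.
    out = np.zeros_like(x)
    for m in imfs[k:]:
        t = np.median(np.abs(m)) / 0.6745 * np.sqrt(2.0 * np.log(m.size))
        out += np.sign(m) * np.maximum(np.abs(m) - t, 0.0)
    return out
```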

  15. Signal Processing Methods for Removing the Effects of Whole Body Vibration upon Speech

    NASA Technical Reports Server (NTRS)

    Bitner, Rachel M.; Begault, Durand R.

    2014-01-01

    Humans may be exposed to whole-body vibration in environments where clear speech communications are crucial, particularly during the launch phases of space flight and in high-performance aircraft. Prior research has shown that high levels of vibration cause a decrease in speech intelligibility. However, the effects of whole-body vibration upon speech are not well understood, and no attempt has been made to restore speech distorted by whole-body vibration. In this paper, a model for speech under whole-body vibration is proposed and a method to remove its effect is described. The method described reduces the perceptual effects of vibration, yields higher automatic speech recognition (ASR) accuracy scores, and may significantly improve intelligibility. Possible applications include incorporation within radio-communication systems used in environments such as spaceflight, aviation, or off-road vehicle operations.

  16. Auditory-Perceptual Speech Outcomes and Quality of Life after Total Laryngectomy

    PubMed Central

    Eadie, Tanya L.; Day, Adam M. B.; Sawin, Devon E.; Lamvik, Kristin; Doyle, Philip C.

    2015-01-01

    OBJECTIVE i) To determine potential relationships between speech intelligibility, acceptability, and self-reported quality of life (QOL) after total laryngectomy; and ii) to determine whether relationships are stronger when QOL is measured by a head and neck cancer-specific or discipline-specific QOL scale. STUDY DESIGN Cross-sectional. SETTING University-based laboratory and speech clinic. SUBJECTS AND METHODS Twenty-five laryngectomized individuals completed disease-specific (University of Washington Quality of Life; UW-QOL) and discipline-specific (Voice Handicap Index-10; VHI-10) QOL scales. They also provided audio recordings that included the Sentence Intelligibility Test (SIT) and a reading passage. Thirty-three listeners transcribed the SIT sentences to yield intelligibility scores. Fifteen additional listeners judged speech acceptability of the reading passage using rating scales. RESULTS QOL scores were moderate across the UW-QOL physical (mean = 77.63) and social-emotional (mean = 78.02) subscales, and the VHI-10 (mean = 17.91). Speech acceptability and intelligibility varied across the samples, with acceptability only moderately related to intelligibility (r = 0.41, p < .05). Relationships were weak between ratings of intelligibility and self-reported QOL (range r = 0.00 – 0.22), and weak to moderate between acceptability and QOL (range r = 0.01 – 0.46). The only statistically significant, though moderate, relationship was found between speech acceptability and the UW-QOL speech sub-score (r = 0.46, p < .05). CONCLUSION Listeners’ ratings of speech acceptability and intelligibility were not strongly predictive of disease-specific or voice-related QOL, suggesting that listener-rated and patient-reported outcomes are complementary. PMID:23008330

  17. A novel speech prosthesis for mandibular guidance therapy in hemimandibulectomy patient: A clinical report

    PubMed Central

    Adaki, Raghavendra; Shigli, Kamal; Hormuzdi, Dinshaw M.; Gali, Sivaranjani

    2016-01-01

    Treating diverse maxillofacial patients poses a challenge to the maxillofacial prosthodontist. Rehabilitation of hemimandibulectomy patients must aim at restoring mastication and other functions such as intelligible speech, swallowing, and esthetics. Prosthetic methods such as palatal ramps and mandibular guiding flanges reposition the deviated mandible. Such prostheses can also be used to restore speech in patients with debilitated speech following surgical resection. This clinical report gives details of a hemimandibulectomy patient provided with an interim removable dental speech prosthesis with a composite resin flange for mandibular guidance therapy. PMID:27041917

  18. Speech Impairment in Down Syndrome: A Review

    PubMed Central

    Kent, Ray D.; Vorperian, Houri K.

    2012-01-01

    Purpose This review summarizes research on disorders of speech production in Down Syndrome (DS) for the purposes of informing clinical services and guiding future research. Method Review of the literature was based on searches using Medline, Google Scholar, PsycINFO, and HighWire Press, as well as consideration of reference lists in retrieved documents (including online sources). Search terms emphasized functions related to voice, articulation, phonology, prosody, fluency and intelligibility. Conclusions The following conclusions pertain to four major areas of review: (a) Voice. Although a number of studies have been reported on vocal abnormalities in DS, major questions remain about the nature and frequency of the phonatory disorder. Results of perceptual and acoustic studies have been mixed, making it difficult to draw firm conclusions or even to identify sensitive measures for future study. (b) Speech sounds. Articulatory and phonological studies show that speech patterns in DS are a combination of delayed development and errors not seen in typical development. Delayed (i.e., developmental) and disordered (i.e., nondevelopmental) patterns are evident by the age of about 3 years, although DS-related abnormalities possibly appear earlier, even in infant babbling. (c) Fluency and prosody. Stuttering and/or cluttering occur in DS at rates of 10 to 45%, compared to about 1% in the general population. Research also points to significant disturbances in prosody. (d) Intelligibility. Studies consistently show marked limitations in this area, but it is only recently that research has gone beyond simple rating scales. PMID:23275397

  19. Distributed Intelligence.

    ERIC Educational Resources Information Center

    McLagan, Patricia A.

    2003-01-01

    Distributed intelligence occurs when people in an organization take responsibility for creating innovations, solving problems, and making decisions. Organizations that have it excel in their markets and the global environment. (Author/JOW)

  20. Intelligent buildings

    SciTech Connect

    Atkin, B.

    1989-01-01

    The term intelligent buildings refers to today's sophisticated living environments that must support communication, energy, fire and security protection systems. This book examines a variety of topics including building automation, information technology, and systems and facilities management.

  1. The Danish vaccination register.

    PubMed

    Grove Krause, T; Jakobsen, S; Haarh, M; Mølbak, K

    2012-01-01

    Immunisation information systems (IIS) are valuable tools for monitoring vaccination coverage and for estimating vaccine effectiveness and safety. Since 2009, an advanced IIS has been developed in Denmark and will be implemented during 2012–14. This IIS is based on a database existing since 2000. The reporting of all administered vaccinations, including vaccinations outside the national programme, will become mandatory. Citizens will get access to data about their own vaccinations and healthcare personnel will get access to information on the vaccinations of their patients. A national identification scheme, combining a personal code with a code card, ensures easy and secure access to the register. From the outset, the IIS will include data on childhood vaccinations administered from 1996 onwards. All Danish citizens have a unique identifier, a so-called civil registration number, which allows the linking of information on vaccinations coming from different electronic data sources. The main challenge will be to integrate the IIS with the different electronic patient record systems currently existing at general practitioner, vaccination clinic, and hospital level, thereby avoiding double entry. A need has been identified for an updated international classification of vaccine products on the market. Such a classification would also be useful for the future exchange of data on immunisations from IIS between countries. PMID:22551494

  2. Artificial intelligence, expert systems, computer vision, and natural language processing

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1984-01-01

    An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.

  3. Listener Perception of Monopitch, Naturalness, and Intelligibility for Speakers with Parkinson's Disease

    ERIC Educational Resources Information Center

    Anand, Supraja; Stepp, Cara E.

    2015-01-01

    Purpose: Given the potential significance of speech naturalness to functional and social rehabilitation outcomes, the objective of this study was to examine the effect of listener perceptions of monopitch on speech naturalness and intelligibility in individuals with Parkinson's disease (PD). Method: Two short utterances were extracted from…

  4. A Systematic Review of Cross-Linguistic and Multilingual Speech and Language Outcomes for Children with Hearing Loss

    ERIC Educational Resources Information Center

    Crowe, Kathryn; McLeod, Sharynne

    2014-01-01

    The purpose of this study was to systematically review the factors affecting the language, speech intelligibility, speech production, and lexical tone development of children with hearing loss who use spoken languages other than English. Relevant studies of children with hearing loss published between 2000 and 2011 were reviewed with reference to…

  5. Early recognition of speech

    PubMed Central

    Remez, Robert E; Thomas, Emily F

    2013-01-01

    Classic research on the perception of speech sought to identify minimal acoustic correlates of each consonant and vowel. In explaining perception, this view designated momentary components of an acoustic spectrum as cues to the recognition of elementary phonemes. This conceptualization of speech perception is untenable given the findings of phonetic sensitivity to modulation independent of the acoustic and auditory form of the carrier. The empirical key is provided by studies of the perceptual organization of speech, a low-level integrative function that finds and follows the sensory effects of speech amid concurrent events. These projects have shown that the perceptual organization of speech is keyed to modulation; fast; unlearned; nonsymbolic; indifferent to short-term auditory properties; and dependent on attention. The ineluctably multisensory nature of speech perception also imposes conditions that distinguish language among cognitive systems. WIREs Cogn Sci 2013, 4:213–223. doi: 10.1002/wcs.1213 PMID:23926454

  6. Call sign intelligibility improvement using a spatial auditory display

    NASA Technical Reports Server (NTRS)

    Begault, Durand R.

    1993-01-01

    A spatial auditory display was used to spatialize speech stimuli, consisting of 130 different call signs used in the communications protocol of NASA's John F. Kennedy Space Center, to different virtual auditory positions. An adaptive staircase method was used to determine intelligibility levels of the signal against diotic speech babble, with spatial positions at 30 deg azimuth increments. Non-individualized, minimum-phase approximations of head-related transfer functions were used. The results showed a maximal intelligibility improvement of about 6 dB when the signal was spatialized to 60 deg or 90 deg azimuth positions.
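
    The core signal-processing step in such a display is per-ear convolution of the mono call-sign recording with a head-related impulse-response pair for the chosen azimuth. A minimal sketch follows; the HRIR arrays are assumed to be loaded from some measurement set, and nothing here reproduces the study's actual minimum-phase filters.

```python
# Per-ear HRIR convolution: the basic spatialization step described above.
import numpy as np
from scipy.signal import fftconvolve

def spatialize(mono, hrir_left, hrir_right):
    # Filter the mono signal through each ear's impulse response
    # (both HRIRs are assumed to have the same length).
    left = fftconvolve(mono, hrir_left)
    right = fftconvolve(mono, hrir_right)
    return np.stack([left, right], axis=1)  # (samples, 2) binaural output
```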

  7. Selective perceptual phase entrainment to speech rhythm in the absence of spectral energy fluctuations.

    PubMed

    Zoefel, Benedikt; VanRullen, Rufin

    2015-02-01

    Perceptual phase entrainment improves speech intelligibility by phase-locking the brain's high-excitability and low-excitability phases to relevant or irrelevant events in the speech input. However, it remains unclear whether phase entrainment to speech can be explained by a passive "following" of rhythmic changes in sound amplitude and spectral content or whether entrainment entails an active tracking of higher-level cues: in everyday speech, rhythmic fluctuations in low-level and high-level features always covary. Here, we resolve this issue by constructing novel speech/noise stimuli with intelligible speech but without systematic changes in sound amplitude and spectral content. The probability of detecting a tone pip, presented to human listeners at random moments during our speech/noise stimuli, was significantly modulated by the rhythmic changes in high-level information. Thus, perception can entrain to the speech rhythm even without concurrent fluctuations in sound amplitude or spectral content. Strikingly, the actual entrainment phase depended on the tone-pip frequency, with tone pips within and beyond the principal frequency range of the speech sound modulated in opposite fashion. This result suggests that only those neural populations processing the actually presented frequencies are set to their high-excitability phase, whereas other populations are entrained to the opposite, low-excitability phase. Furthermore, we show that the perceptual entrainment is strongly reduced when speech intelligibility is abolished by presenting speech/noise stimuli in reverse, indicating that linguistic information plays an important role for the observed perceptual entrainment. PMID:25653354

  8. Speech Alarms Pilot Study

    NASA Technical Reports Server (NTRS)

    Sandor, Aniko; Moses, Haifa

    2016-01-01

    Speech alarms have been used extensively in aviation and included in International Building Codes (IBC) and National Fire Protection Association's (NFPA) Life Safety Code. However, they have not been implemented on space vehicles. Previous studies conducted at NASA JSC showed that speech alarms lead to faster identification and higher accuracy. This research evaluated updated speech and tone alerts in a laboratory environment and in the Human Exploration Research Analog (HERA) in a realistic setup.

  9. Speech input and output

    NASA Astrophysics Data System (ADS)

    Class, F.; Mangold, H.; Stall, D.; Zelinski, R.

    1981-12-01

    Possibilities for acoustical dialogs with electronic data processing equipment were investigated. Speech recognition is posed as recognizing word groups. An economical, multistage classifier for word string segmentation is presented and its reliability in dealing with continuous speech (problems of temporal normalization and context) is discussed. Speech synthesis is considered in terms of German linguistics and phonetics. Preprocessing algorithms for total synthesis of written texts were developed. A macrolanguage, MUSTER, is used to implement this processing in an acoustic data information system (ADES).

  10. Advances in speech processing

    NASA Astrophysics Data System (ADS)

    Ince, A. Nejat

    1992-10-01

    The field of speech processing is undergoing a rapid growth in terms of both performance and applications and this is fueled by the advances being made in the areas of microelectronics, computation, and algorithm design. The use of voice for civil and military communications is discussed considering advantages and disadvantages including the effects of environmental factors such as acoustic and electrical noise and interference and propagation. The structure of the existing NATO communications network and the evolving Integrated Services Digital Network (ISDN) concept are briefly reviewed to show how they meet the present and future requirements. The paper then deals with the fundamental subject of speech coding and compression. Recent advances in techniques and algorithms for speech coding now permit high quality voice reproduction at remarkably low bit rates. The subject of speech synthesis is next treated where the principle objective is to produce natural quality synthetic speech from unrestricted text input. Speech recognition where the ultimate objective is to produce a machine which would understand conversational speech with unrestricted vocabulary, from essentially any talker, is discussed. Algorithms for speech recognition can be characterized broadly as pattern recognition approaches and acoustic phonetic approaches. To date, the greatest degree of success in speech recognition has been obtained using pattern recognition paradigms. It is for this reason that the paper is concerned primarily with this technique.

  11. Wavelet Speech Enhancement Based on Nonnegative Matrix Factorization

    NASA Astrophysics Data System (ADS)

    Wang, Syu-Siang; Chern, Alan; Tsao, Yu; Hung, Jeih-weih; Lu, Xugang; Lai, Ying-Hui; Su, Borching

    2016-08-01

    For most state-of-the-art speech enhancement techniques, a spectrogram is usually preferred over the respective time-domain raw data, since it provides a more compact representation together with conspicuous temporal information over a long time span. However, the short-time Fourier transform (STFT) that creates the spectrogram in general distorts the original signal and thereby limits the capability of the associated speech enhancement techniques. In this study, we propose a novel speech enhancement method that adopts the algorithms of discrete wavelet packet transform (DWPT) and nonnegative matrix factorization (NMF) in order to overcome the aforementioned limitation. In brief, the DWPT is first applied to split a time-domain speech signal into a series of subband signals without introducing any distortion. Then we exploit NMF to highlight the speech component in each subband. Finally, the enhanced subband signals are joined together via the inverse DWPT to reconstruct a noise-reduced signal in the time domain. We evaluate the proposed DWPT-NMF based speech enhancement method on the MHINT task. Experimental results show that this new method behaves very well in promoting speech quality and intelligibility, and it outperforms the conventional STFT-NMF based method.
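
    A minimal sketch of the DWPT-NMF idea follows, assuming a 1-D noisy speech array `x`. The library choices (pywt for the wavelet packet transform, scikit-learn for NMF), the frame-based nonnegative representation of each subband, and all parameter values are illustrative assumptions rather than the authors' implementation.

```python
# DWPT split -> per-subband NMF approximation -> inverse DWPT, as a sketch of
# the enhancement scheme described above.
import numpy as np
import pywt
from sklearn.decomposition import NMF

def enhance(x, wavelet="db4", level=3, rank=8, frame=256):
    wp = pywt.WaveletPacket(data=x, wavelet=wavelet, maxlevel=level)
    for node in wp.get_level(level, order="natural"):
        band = node.data
        n = (len(band) // frame) * frame
        if n < frame:
            continue  # subband too short to frame; leave it untouched
        # NMF needs nonnegative data: model framed magnitudes of the subband.
        V = np.abs(band[:n]).reshape(-1, frame).T + 1e-8
        model = NMF(n_components=min(rank, V.shape[1]),
                    init="nndsvda", max_iter=200)
        W = model.fit_transform(V)
        V_hat = W @ model.components_  # low-rank, speech-emphasizing fit
        # Re-impose the original signs on the smoothed magnitudes.
        wp[node.path] = np.concatenate(
            [np.sign(band[:n]) * V_hat.T.reshape(-1), band[n:]]
        )
    # Inverse DWPT joins the processed subbands back into a time signal.
    return wp.reconstruct(update=False)[: len(x)]
```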

  12. The Cleft Care UK study. Part 4: perceptual speech outcomes

    PubMed Central

    Sell, D; Mildinhall, S; Albery, L; Wills, A K; Sandy, J R; Ness, A R

    2015-01-01

    Structured Abstract Objectives To describe the perceptual speech outcomes from the Cleft Care UK (CCUK) study and compare them to the 1998 Clinical Standards Advisory Group (CSAG) audit. Setting and sample population A cross-sectional study of 248 children born with complete unilateral cleft lip and palate between 1 April 2005 and 31 March 2007 who underwent speech assessment. Materials and methods Centre-based specialist speech and language therapists (SLT) took speech audio–video recordings according to nationally agreed guidelines. Two independent listeners undertook the perceptual analysis using the CAPS-A Audit tool. Intra- and inter-rater reliability were tested. Results For each speech parameter of intelligibility/distinctiveness, hypernasality, palatal/palatalization, backed to velar/uvular, glottal, weak and nasalized consonants, and nasal realizations, there was strong evidence that speech outcomes were better in the CCUK children compared to CSAG children. The parameters which did not show improvement were nasal emission, nasal turbulence, hyponasality and lateral/lateralization. Conclusion These results suggest that centralization of cleft care into high volume centres has resulted in improvements in UK speech outcomes in five-year-olds with unilateral cleft lip and palate. This may be associated with the development of a specialized workforce. Nevertheless, there still remains a group of children with significant difficulties at school entry. PMID:26567854

  13. Effects of speech clarity on recognition memory for spoken sentences.

    PubMed

    Van Engen, Kristin J; Chandrasekaran, Bharath; Smiljanic, Rajka

    2012-01-01

    Extensive research shows that inter-talker variability (i.e., changing the talker) affects recognition memory for speech signals. However, relatively little is known about the consequences of intra-talker variability (i.e., changes in speaking style within a talker) on the encoding of speech signals in memory. It is well established that speakers can modulate the characteristics of their own speech and produce a listener-oriented, intelligibility-enhancing speaking style in response to communication demands (e.g., when speaking to listeners with hearing impairment or non-native speakers of the language). Here we conducted two experiments to examine the role of speaking style variation in spoken language processing. First, we examined the extent to which clear speech provided benefits in challenging listening environments (i.e., speech-in-noise). Second, we compared recognition memory for sentences produced in conversational and clear speaking styles. In both experiments, semantically normal and anomalous sentences were included to investigate the role of higher-level linguistic information in the processing of speaking style variability. The results show that acoustic-phonetic modifications implemented in listener-oriented speech lead to improved speech recognition in challenging listening conditions and, crucially, to a substantial enhancement in recognition memory for sentences. PMID:22970141

  14. Speech production in noise with and without hearing protection

    NASA Astrophysics Data System (ADS)

    Tufts, Jennifer B.; Frank, Tom

    2003-08-01

    People working in noisy environments often complain of difficulty communicating when they wear hearing protection. It was hypothesized that part of the workers' communication difficulties stems from changes in speech production that occur when hearing protectors are worn. To address this possibility, overall and one-third-octave-band SPL measurements were obtained for 16 men and 16 women as they produced connected speech while wearing foam, flange, or no earplugs (open ears) in quiet and in pink noise at 60, 70, 80, 90, and 100 dB SPL. The attenuation and the occlusion effect produced by the earplugs were measured. The Speech Intelligibility Index (SII) was also calculated for each condition. The talkers produced lower overall speech levels, speech-to-noise ratios, and SII values, and less high-frequency speech energy, when they wore earplugs compared with the open-ear condition. Small differences in the speech measures between the talkers wearing foam and flange earplugs were observed. Overall, the results of the study indicate that talkers wearing earplugs (and consequently their listeners) are at a disadvantage when communicating in noise.
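
    The Speech Intelligibility Index used in this study is, at its core, a band-importance-weighted average of per-band audibility. The toy sketch below shows that core computation only; the linear (SNR + 15)/30 audibility mapping follows the general ANSI S3.5 scheme, and the band levels, importance weights, and example numbers are invented, not the study's measurements.

```python
# Core of the SII: importance-weighted per-band audibility (ANSI S3.5 style).
import numpy as np

def sii(speech_db, noise_db, importance):
    """speech_db, noise_db: per-band levels in dB; importance sums to 1."""
    snr = np.asarray(speech_db, float) - np.asarray(noise_db, float)
    audibility = np.clip((snr + 15.0) / 30.0, 0.0, 1.0)
    return float(np.sum(np.asarray(importance, float) * audibility))

# Invented example: raising the noise floor lowers the index.
print(sii([60, 55, 50], [40, 42, 44], [0.3, 0.4, 0.3]))  # quieter noise
print(sii([60, 55, 50], [55, 56, 57], [0.3, 0.4, 0.3]))  # louder noise
```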

  15. Top-down restoration of speech in cochlear-implant users.

    PubMed

    Bhargava, Pranesh; Gaudrain, Etienne; Başkent, Deniz

    2014-03-01

    In noisy listening conditions, intelligibility of degraded speech can be enhanced by top-down restoration. Cochlear implant (CI) users have difficulty understanding speech in noisy environments. This could partially be due to reduced top-down restoration of speech, which may be related to the changes that the electrical stimulation imposes on the bottom-up cues. We tested this hypothesis using the phonemic restoration (PhR) paradigm in which speech interrupted with periodic silent intervals is perceived illusorily continuous (continuity illusion or CoI) and becomes more intelligible (PhR benefit) when the interruptions are filled with noise bursts. Using meaningful sentences, both CoI and PhR benefit were measured in CI users, and compared with those of normal-hearing (NH) listeners presented with normal speech and 8-channel noise-band vocoded speech, acoustically simulating CIs. CI users showed different patterns in both PhR benefit and CoI, compared to NH results with or without the noise-band vocoding. However, they were able to use top-down restoration under certain test conditions. This observation supports the idea that changes in bottom-up cues can impose changes to the top-down processes needed to enhance intelligibility of degraded speech. The knowledge that CI users seem to be able to do restoration under the right circumstances could be exploited in patient rehabilitation and product development. PMID:24368138

  16. The acoustics for speech of eight auditoriums in the city of Sao Paulo

    NASA Astrophysics Data System (ADS)

    Bistafa, Sylvio R.

    2002-11-01

    Eight auditoriums with a proscenium type of stage, which usually operate as dramatic theaters in the city of Sao Paulo, were acoustically surveyed in terms of their adequacy for unassisted speech. Reverberation times, early decay times, and speech levels were measured in different positions, together with objective measures of speech intelligibility. The measurements revealed reverberation time values that were rather uniform throughout the rooms, whereas significant variations with position were found in the values of the other acoustical measures. The early decay time was found to be better correlated with the objective measures of speech intelligibility than the reverberation time. The objective measurements of speech intelligibility revealed that the speech transmission index (STI) and its simplified version (RaSTI) are strongly correlated with the early-to-late sound ratio C50 (1 kHz). However, the acceptability criterion for C50 was found to be more easily met than that for the STI. These measurements make it possible to understand how the characteristics of the architectural design determine the acoustical quality for speech. Measurements of Gade's ST1 were made in an attempt to validate it as an objective measure of "support" for the actor. Preliminary diagnostic results from ray-tracing simulations will also be presented.
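
    Of the objective measures named above, the early-to-late sound ratio C50 is the simplest to compute from a measured room impulse response: the dB ratio of energy arriving in the first 50 ms to the energy arriving afterwards. The sketch below is a generic implementation of that standard definition, not the survey's measurement chain.

```python
# C50 (early-to-late energy ratio at the 50 ms boundary) from a room impulse
# response `h` sampled at `fs`, time-aligned so h[0] is the direct sound.
import numpy as np

def c50(h, fs):
    k = int(round(0.050 * fs))            # 50 ms boundary in samples
    early = np.sum(h[:k] ** 2)            # energy before 50 ms
    late = np.sum(h[k:] ** 2)             # energy from 50 ms onward
    return 10.0 * np.log10(early / late)  # ratio in dB
```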

  17. Intelligence: Theories and Testing.

    ERIC Educational Resources Information Center

    Papanastasiou, Elena C.

    This paper reviews what is known about intelligence and the use of intelligence tests. Environmental and hereditary factors that affect performance on intelligence tests are reviewed, along with various theories that have been proposed about the basis of intelligence. Intelligence tests do not test intelligence per se but make inferences about a…

  18. Speech-Language Therapy (For Parents)

    MedlinePlus

    A parent-oriented overview of speech-language therapy for children with speech and/or language disorders, covering speech disorders, language disorders, and feeding disorders. A speech disorder refers…

  19. Time-expanded speech and speech recognition in older adults.

    PubMed

    Vaughan, Nancy E; Furukawa, Izumi; Balasingam, Nirmala; Mortz, Margaret; Fausti, Stephen A

    2002-01-01

    Speech understanding deficits are common in older adults. In addition to hearing sensitivity, changes in certain cognitive functions may affect speech recognition. One such change that may impact the ability to follow a rapidly changing speech signal is processing speed. When speakers slow the rate of their speech naturally in order to speak clearly, speech recognition is improved. The acoustic characteristics of naturally slowed speech are of interest in developing time-expansion algorithms to improve speech recognition for older listeners. In this study, we tested younger normally hearing, older normally hearing, and older hearing-impaired listeners on time-expanded speech using increased duration and increased intensity of unvoiced consonants. Although all groups performed best on unprocessed speech, performance with processed speech was better with the consonant gain feature without time expansion in the noise condition and better at the slowest time-expanded rate in the quiet condition. The effects of signal processing on speech recognition are discussed. PMID:17642020

  20. Acoustic and Perceptual Consequences of Clear and Loud Speech

    PubMed Central

    Tjaden, Kris; Richards, Emily; Kuo, Christina; Wilding, Greg; Sussman, Joan

    2014-01-01

    Objective Several issues concerning F2 slope in dysarthria were addressed by obtaining speech acoustic measures and judgments of intelligibility for sentences produced in Habitual, Clear and Loud conditions by speakers with Parkinson's disease (PD) and healthy controls. Patients and Methods Acoustic measures of average and maximum F2 slope for diphthongs, duration and intensity were obtained. Listeners judged intelligibility using a visual analog scale. Differences in measures among groups and conditions as well as relationships among measures were examined. Results Average and maximum F2 slope metrics were strongly correlated, but only average F2 slope consistently differed among groups and conditions, with shallower slopes for the PD group and steeper slopes for Clear speech versus Habitual and Loud. Clear and Loud speech were also characterized by lengthened durations, increased intensity and improved intelligibility versus Habitual. F2 slope and intensity were unrelated, and F2 slope was a significant predictor of intelligibility. Conclusion Average diphthong F2 slope was more sensitive than maximum F2 slope to articulatory mechanism involvement in mild dysarthria in PD. F2 slope holds promise as an objective measure of treatment-related changes in the articulatory mechanism for therapeutic techniques that focus on articulation. PMID:24504015

  1. Frontal top-down signals increase coupling of auditory low-frequency oscillations to continuous speech in human listeners.

    PubMed

    Park, Hyojin; Ince, Robin A A; Schyns, Philippe G; Thut, Gregor; Gross, Joachim

    2015-06-15

    Humans show a remarkable ability to understand continuous speech even under adverse listening conditions. This ability critically relies on dynamically updated predictions of incoming sensory information, but exactly how top-down predictions improve speech processing is still unclear. Brain oscillations are a likely mechanism for these top-down predictions [1, 2]. Quasi-rhythmic components in speech are known to entrain low-frequency oscillations in auditory areas [3, 4], and this entrainment increases with intelligibility [5]. We hypothesize that top-down signals from frontal brain areas causally modulate the phase of brain oscillations in auditory cortex. We use magnetoencephalography (MEG) to monitor brain oscillations in 22 participants during continuous speech perception. We characterize prominent spectral components of speech-brain coupling in auditory cortex and use causal connectivity analysis (transfer entropy) to identify the top-down signals driving this coupling more strongly during intelligible speech than during unintelligible speech. We report three main findings. First, frontal and motor cortices significantly modulate the phase of speech-coupled low-frequency oscillations in auditory cortex, and this effect depends on intelligibility of speech. Second, top-down signals are significantly stronger for left auditory cortex than for right auditory cortex. Third, speech-auditory cortex coupling is enhanced as a function of stronger top-down signals. Together, our results suggest that low-frequency brain oscillations play a role in implementing predictive top-down control during continuous speech perception and that top-down control is largely directed at left auditory cortex. This suggests a close relationship between (left-lateralized) speech production areas and the implementation of top-down control in continuous speech perception. PMID:26028433

  2. Modeling Pathological Speech Perception From Data With Similarity Labels.

    PubMed

    Berisha, Visar; Liss, Julie; Sandoval, Steven; Utianski, Rene; Spanias, Andreas

    2014-05-01

    The current state of the art in judging pathological speech intelligibility is subjective assessment performed by trained speech-language pathologists (SLPs). These tests, however, are inconsistent, costly, and oftentimes suffer from poor intra- and inter-judge reliability. As such, consistent, reliable, and perceptually-relevant objective evaluations of pathological speech are critical. Here, we propose a data-driven approach to this problem. We propose new cost functions for examining data from a series of experiments, whereby we ask certified SLPs to rate pathological speech along the perceptual dimensions that contribute to decreased intelligibility. We consider qualitative feedback from SLPs in the form of comparisons similar to statements "Is Speaker A's rhythm more similar to Speaker B or Speaker C?" Data of this form are common in behavioral research, but are different from the traditional data structures expected in supervised (data matrix + class labels) or unsupervised (data matrix) machine learning. The proposed method identifies relevant acoustic features that correlate with the ordinal data collected during the experiment. Using these features, we show that we are able to develop objective measures of speech signal degradation that correlate well with SLP responses. PMID:25435817

  4. Intelligent Fasteners

    NASA Technical Reports Server (NTRS)

    1997-01-01

    Under a Small Business Innovation Research contract from Marshall Space Flight Center, Ultrafast, Inc. developed the world's first high-temperature-resistant "intelligent" fastener. NASA needed a critical-fastening appraisal and validation of spacecraft segments that are coupled together in space. The intelligent-bolt technology eliminates the self-defeating procedure of having to untighten the fastener, and thus upset the joint, during inspection and maintenance. The Ultrafast solution yielded an innovation that is likely to revolutionize manufacturing assembly, particularly in the automobile industry. Other areas of application range from aircraft, computers, and fork-lifts to offshore platforms, buildings, and bridges.

  5. Inconsistency of speech in children with childhood apraxia of speech, phonological disorders, and typical speech

    NASA Astrophysics Data System (ADS)

    Iuzzini, Jenya

    There is a lack of agreement on the features used to differentiate Childhood Apraxia of Speech (CAS) from Phonological Disorders (PD). One criterion which has gained consensus is lexical inconsistency of speech (ASHA, 2007); however, no accepted measure of this feature has been defined. Although lexical assessment provides information about consistency of an item across repeated trials, it may not capture the magnitude of inconsistency within an item. In contrast, segmental analysis provides more extensive information about consistency of phoneme usage across multiple contexts and word-positions. The current research compared segmental and lexical inconsistency metrics in preschool-aged children with PD, CAS, and typical development (TD) to determine how inconsistency varies with age in typical and disordered speakers, and whether CAS and PD were differentiated equally well by both assessment levels. Whereas lexical and segmental analyses may be influenced by listener characteristics or speaker intelligibility, the acoustic signal is less vulnerable to these factors. In addition, the acoustic signal may reveal information which is not evident in the perceptual signal. A second focus of the current research was motivated by Blumstein et al.'s (1980) classic study on voice onset time (VOT) in adults with acquired apraxia of speech (AOS) which demonstrated a motor impairment underlying AOS. In the current study, VOT analyses were conducted to determine the relationship between age and group with the voicing distribution for bilabial and alveolar plosives. Findings revealed that 3-year-olds evidenced significantly higher inconsistency than 5-year-olds; segmental inconsistency approached 0% in 5-year-olds with TD, whereas it persisted in children with PD and CAS, suggesting that for children in this age range, inconsistency is a feature of speech disorder rather than typical development (Holm et al., 2007). Likewise, whereas segmental and lexical inconsistency were…

  6. Voice intelligibility in satellite mobile communications

    NASA Technical Reports Server (NTRS)

    Wishna, S.

    1973-01-01

    An amplitude control technique is reported that equalizes low-level phonemes in a satellite narrow-band FM voice communication system over channels having low carrier-to-noise ratios. The method presents equal-amplitude phonemes at the transmitter so that low-level phonemes, when transmitted over the noisy channel, remain above the noise and contribute to output intelligibility. The amplitude control technique also provides for squelching of noise when speech is not being transmitted.
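
    A toy transmitter-side version of such an amplitude control can be written as frame-wise gain normalization with a squelch gate, as below. The frame length, target level, and gate threshold are invented parameters; the original system's phoneme-level control was certainly more sophisticated.

```python
# Frame-wise amplitude equalization with a simple squelch, in the spirit of
# the technique described above. `x` is a mono speech array at rate `fs`.
import numpy as np

def equalize(x, fs, target_rms=0.1, frame_ms=20, gate=1e-3):
    n = int(fs * frame_ms / 1000)
    y = x.astype(float).copy()
    for i in range(0, len(y) - n + 1, n):
        rms = np.sqrt(np.mean(y[i:i + n] ** 2))
        if rms > gate:                    # speech present: scale to target
            y[i:i + n] *= target_rms / rms
        else:                             # squelch: suppress noise-only frames
            y[i:i + n] = 0.0
    return y
```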

  7. Perception of Dialect Variation in Noise: Intelligibility and Classification

    PubMed Central

    Clopper, Cynthia G.; Bradlow, Ann R.

    2009-01-01

    Listeners can explicitly categorize unfamiliar talkers by regional dialect with above-chance performance under ideal listening conditions. However, the extent to which this important source of variation affects speech processing is largely unknown. In a series of four experiments, we examined the effects of dialect variation on speech intelligibility in noise and the effects of noise on perceptual dialect classification. Results revealed that, on the one hand, dialect-specific differences in speech intelligibility were more pronounced at harder signal-to-noise ratios, but were attenuated under more favorable listening conditions. Listener dialect did not interact with talker dialect; for all listeners, at a range of noise levels, the General American talkers were the most intelligible and the Mid-Atlantic talkers were the least intelligible. Dialect classification performance, on the other hand, was poor even with only moderate amounts of noise. These findings suggest that at moderate noise levels, listeners are able to adapt to dialect variation in the acoustic signal such that some cross-dialect intelligibility differences are neutralized, despite relatively poor explicit dialect classification performance. However, at more difficult noise levels, participants cannot effectively adapt to dialect variation in the acoustic signal and cross-dialect differences in intelligibility emerge for all listeners, regardless of their dialect. PMID:19626923

  8. Communication, Listening, Cognitive and Speech Perception Skills in Children with Auditory Processing Disorder (APD) or Specific Language Impairment (SLI)

    ERIC Educational Resources Information Center

    Ferguson, Melanie A.; Hall, Rebecca L.; Riley, Alison; Moore, David R.

    2011-01-01

    Purpose: Parental reports of communication, listening, and behavior in children receiving a clinical diagnosis of specific language impairment (SLI) or auditory processing disorder (APD) were compared with direct tests of intelligence, memory, language, phonology, literacy, and speech intelligibility. The primary aim was to identify whether there…

  9. Free Speech Yearbook 1980.

    ERIC Educational Resources Information Center

    Kane, Peter E., Ed.

    The 11 articles in this collection deal with theoretical and practical freedom of speech issues. The topics covered are (1) the United States Supreme Court and communication theory; (2) truth, knowledge, and a democratic respect for diversity; (3) denial of freedom of speech in Jock Yablonski's campaign for the presidency of the United Mine…

  10. Free Speech. No. 38.

    ERIC Educational Resources Information Center

    Kane, Peter E., Ed.

    This issue of "Free Speech" contains the following articles: "Daniel Schoor Relieved of Reporting Duties" by Laurence Stern, "The Sellout at CBS" by Michael Harrington, "Defending Dan Schorr" by Tome Wicker, "Speech to the Washington Press Club, February 25, 1976" by Daniel Schorr, "Funds Voted For Schorr Inquiry" by Richard Lyons, "Erosion of the…

  11. Tracking Speech Sound Acquisition

    ERIC Educational Resources Information Center

    Powell, Thomas W.

    2011-01-01

    This article describes a procedure to aid in the clinical appraisal of child speech. The approach, based on the work by Dinnsen, Chin, Elbert, and Powell (1990; Some constraints on functionally disordered phonologies: Phonetic inventories and phonotactics. "Journal of Speech and Hearing Research", 33, 28-37), uses a railway idiom to track gains in…

  12. Chief Seattle's Speech Revisited

    ERIC Educational Resources Information Center

    Krupat, Arnold

    2011-01-01

    Indian orators have been saying good-bye for more than three hundred years. John Eliot's "Dying Speeches of Several Indians" (1685), as David Murray notes, inaugurates a long textual history in which "Indians... are most useful dying," or, as in a number of speeches, bidding the world farewell as they embrace an undesired but apparently inevitable…

  13. Illustrated Speech Anatomy.

    ERIC Educational Resources Information Center

    Shearer, William M.

    Written for students in the fields of speech correction and audiology, the text deals with the following: structures involved in respiration; the skeleton and the processes of inhalation and exhalation; phonation and pitch, the larynx, and esophageal speech; muscles involved in articulation; muscles involved in resonance; and the anatomy of the…

  14. Migrations in Speech Recognition.

    ERIC Educational Resources Information Center

    Kolinsky, Regine; Morais, Jose

    1996-01-01

    Describes a new paradigm that may be appropriate for uncovering speech perceptual codes. Illusory words are detected by blending two dichotic stimuli. The paradigm's design allows for comparison of different speech units by the manipulation of the distribution of information between two inputs. (23 references) (Author/CK)

  15. Private Speech in Ballet

    ERIC Educational Resources Information Center

    Johnston, Dale

    2006-01-01

    Authoritarian teaching practices in ballet inhibit the use of private speech. This paper highlights the critical importance of private speech in the cognitive development of young ballet students, within what is largely a non-verbal art form. It draws upon research by Russian psychologist Lev Vygotsky and contemporary socioculturalists, to…

  16. Teaching Freedom of Speech.

    ERIC Educational Resources Information Center

    McGaffey, Ruth

    1983-01-01

    The speech communication department at the University of Wisconsin, Madison, provides a rigorous and legally oriented course in freedom of speech. The objectives of the course are to help students gain insight into the historical and philosophical foundations of the First Amendment, the legal/judicial processes concerning the First Amendment, and…

  17. Free Speech Yearbook 1976.

    ERIC Educational Resources Information Center

    Phifer, Gregg, Ed.

    The articles collected in this annual address several aspects of First Amendment Law. The following titles are included: "Freedom of Speech As an Academic Discipline" (Franklyn S. Haiman), "Free Speech and Foreign-Policy Decision Making" (Douglas N. Freeman), "The Supreme Court and the First Amendment: 1975-1976" (William A. Linsley), "'Arnett v.…

  18. Subjectless Sentences in Child Danish.

    ERIC Educational Resources Information Center

    Hamann, Cornelia; Plunkett, Kim

    1998-01-01

    Examined data for two Danish children to determine subject omission, verb usage, and sentence subjects. Found that children exhibit asymmetry in subject omission according to verb type as subjects are omitted from main verb utterances more frequently than from copula utterances. Concluded that treatment of child subject omission should involve…

  19. Nature and Nationhood: Danish Perspectives

    ERIC Educational Resources Information Center

    Schnack, Karsten

    2009-01-01

    In this paper, I shall discuss Danish perspectives on nature, showing the interdependence of conceptions of "nature" and "nationhood" in the formations of a particular cultural community. Nature, thus construed, is never innocent of culture and cannot therefore simply be "restored" to some pristine, pre-lapsarian state. On the other hand,…

  20. Speech processing standards

    NASA Astrophysics Data System (ADS)

    Ince, A. Nejat

    1990-05-01

    Speech processing standards are given for 64, 32, and 16 kb/s and lower-rate speech and, more generally, speech-band signals, which are or will be promulgated by CCITT and NATO. The International Telegraph and Telephone Consultative Committee (CCITT) is the international body which deals, among other things, with speech processing within the context of ISDN. Within NATO there are also bodies promulgating standards which make interoperability possible without complex and expensive interfaces. Some of the applications for low-bit-rate voice, and the related work undertaken by the CCITT Study Groups responsible for developing standards in terms of encoding algorithms and codec design objectives, as well as standards on the assessment of speech quality, are highlighted.
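
    The rates quoted above follow directly from sample rate times bits per sample. A minimal sketch of that arithmetic (the codec names in the comments are well-known examples added for context, not taken from this record):

      def bit_rate_bps(sample_rate_hz: int, bits_per_sample: int) -> int:
          """Bit rate of a fixed-rate waveform coder, in bits per second."""
          return sample_rate_hz * bits_per_sample

      print(bit_rate_bps(8000, 8))  # 64000: 8 kHz / 8-bit PCM telephony (G.711)
      print(bit_rate_bps(8000, 4))  # 32000: 4-bit ADPCM (G.726)
      print(bit_rate_bps(8000, 2))  # 16000: 2-bit ADPCM / low-rate coding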

  1. Automatic speech recognition

    NASA Astrophysics Data System (ADS)

    Espy-Wilson, Carol

    2005-04-01

    Great strides have been made in the development of automatic speech recognition (ASR) technology over the past thirty years. Most of this effort has been centered around the extension and improvement of Hidden Markov Model (HMM) approaches to ASR. Current commercially available and industry systems based on HMMs can perform well for certain situational tasks that restrict variability, such as phone dialing or limited voice commands. However, the holy grail of ASR systems is performance comparable to that of humans; in other words, the ability to automatically transcribe unrestricted conversational speech spoken by an unlimited number of speakers under varying acoustic environments. This goal is far from being reached. Key to the success of ASR is effective modeling of variability in the speech signal. This tutorial will review the basics of ASR and the various ways in which our current knowledge of speech production, speech perception, and prosody can be exploited to improve robustness at every level of the system.
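
    Since HMMs carry most of the weight in the systems described above, a minimal sketch of the Viterbi decoding step at their core may help fix ideas (toy dimensions and placeholder probabilities; a real recognizer adds lexica, language models, and context-dependent phone models):

      import numpy as np

      def viterbi(log_init, log_trans, log_emit):
          """Most likely HMM state path.
          log_init: (S,) log initial-state probabilities
          log_trans: (S, S) log transition probabilities
          log_emit: (T, S) per-frame log emission likelihoods
          """
          T, S = log_emit.shape
          delta = log_init + log_emit[0]           # best score ending in each state
          back = np.zeros((T, S), dtype=int)       # backpointers
          for t in range(1, T):
              scores = delta[:, None] + log_trans  # scores[i, j]: via state i to j
              back[t] = scores.argmax(axis=0)
              delta = scores.max(axis=0) + log_emit[t]
          path = [int(delta.argmax())]
          for t in range(T - 1, 0, -1):            # trace the best path backwards
              path.append(int(back[t, path[-1]]))
          return path[::-1]

      # Toy usage: 2 states, 3 frames, made-up probabilities.
      print(viterbi(np.log([0.6, 0.4]),
                    np.log([[0.7, 0.3], [0.3, 0.7]]),
                    np.log([[0.9, 0.1], [0.2, 0.8], [0.1, 0.9]])))  # [0, 1, 1]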

  2. Intelligence Studies

    ERIC Educational Resources Information Center

    Monaghan, Peter

    2009-01-01

    To make an academic study of matters inherently secret and potentially explosive seems a tall task. But a growing number of scholars are drawn to understanding spycraft. The interdisciplinary field of intelligence studies is mushrooming, as scholars trained in history, international studies, and political science examine such subjects as the…

  3. Perceptual restoration of degraded speech is preserved with advancing age.

    PubMed

    Saija, Jefta D; Akyürek, Elkan G; Andringa, Tjeerd C; Başkent, Deniz

    2014-02-01

    Cognitive skills, such as processing speed, memory functioning, and the ability to divide attention, are known to diminish with aging. The present study shows that, despite these changes, older adults can successfully compensate for degradations in speech perception. Critically, the older participants of this study were not pre-selected for high performance on cognitive tasks, but only screened for normal hearing. We measured the compensation for speech degradation using phonemic restoration, where intelligibility of degraded speech is enhanced using top-down repair mechanisms. Linguistic knowledge, Gestalt principles of perception, and expectations based on situational and linguistic context are used to effectively fill in the inaudible masked speech portions. A positive compensation effect was previously observed only with young normal-hearing people, but not with older hearing-impaired populations, leaving open the question of whether the lack of compensation was due to aging or to age-related hearing problems. Older participants in the present study showed poorer intelligibility of degraded speech than the younger group, as expected from previous reports of aging effects. However, in conditions that induce top-down restoration, a robust compensation was observed. Speech perception by the older group was enhanced, and the enhancement effect was similar to that observed with the younger group. This effect was even stronger with slowed-down speech, which gives more time for cognitive processing. Based on previous research, the likely explanations for these observations are that older adults can overcome age-related cognitive deterioration by relying on linguistic skills and vocabulary that they have accumulated over their lifetime. Alternatively, or simultaneously, they may use different cerebral activation patterns or exert more mental effort. This positive finding on top-down restoration skills by the older individuals suggests that new cognitive training methods…

  4. Audibility-based predictions of speech recognition for children and adults with normal hearing.

    PubMed

    McCreery, Ryan W; Stelmachowicz, Patricia G

    2011-12-01

    This study investigated the relationship between audibility and predictions of speech recognition for children and adults with normal hearing. The Speech Intelligibility Index (SII) is used to quantify the audibility of speech signals and can be applied to transfer functions to predict speech recognition scores. Although the SII is used clinically with children, relatively few studies have evaluated SII predictions of children's speech recognition directly. Children have required more audibility than adults to reach maximum levels of speech understanding in previous studies. Furthermore, children may require greater bandwidth than adults for optimal speech understanding, which could influence frequency-importance functions used to calculate the SII. Speech recognition was measured for 116 children and 19 adults with normal hearing. Stimulus bandwidth and background noise level were varied systematically in order to evaluate speech recognition as predicted by the SII and derive frequency-importance functions for children and adults. Results suggested that children required greater audibility to reach the same level of speech understanding as adults. However, differences in performance between adults and children did not vary across frequency bands. PMID:22225061
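
    The band-audibility form of the SII that underlies predictions like these is compact enough to sketch. A minimal illustration, assuming the common ANSI S3.5-style mapping of band SNR onto audibility (the flat importance function in the example is a placeholder, not one of the child or adult functions derived in the study):

      import numpy as np

      def sii(band_snr_db, band_importance):
          """SII as importance-weighted band audibility: each band's SNR is
          mapped from [-15, +15] dB onto [0, 1], then weighted and summed."""
          audibility = np.clip((np.asarray(band_snr_db, float) + 15.0) / 30.0,
                               0.0, 1.0)
          w = np.asarray(band_importance, float)
          return float(np.sum(w / w.sum() * audibility))

      print(sii([10, 5, 0, -5, -10, -20], [1] * 6))  # toy 6-band example, ~0.42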

  5. Voice and Speech after Laryngectomy

    ERIC Educational Resources Information Center

    Stajner-Katusic, Smiljka; Horga, Damir; Musura, Maja; Globlek, Dubravka

    2006-01-01

    The aim of the investigation is to compare voice and speech quality in alaryngeal patients using esophageal speech (ESOP, eight subjects), electroacoustical speech aid (EACA, six subjects) and tracheoesophageal voice prosthesis (TEVP, three subjects). The subjects reading a short story were recorded in the sound-proof booth and the speech samples…

  6. Sperry Univac speech communications technology

    NASA Technical Reports Server (NTRS)

    Medress, Mark F.

    1977-01-01

    Technology and systems for effective verbal communication with computers were developed. A continuous speech recognition system for verbal input, a word spotting system to locate key words in conversational speech, prosodic tools to aid speech analysis, and a prerecorded voice response system for speech output are described.

  7. Speech Pathology Assistant. Trainee Manual.

    ERIC Educational Resources Information Center

    National Association for Hearing and Speech Action, Silver Spring, MD.

    Part of an instructional set which includes an instructor's guide, this trainee manual is designed to provide speech pathology students with some basic and essential knowledge about the communication process. The manual contains nine modules: (1) speech pathology assistant, (2) the bases of speech (structure and function of the speech mechanism,…

  8. Speech Correction in the Schools.

    ERIC Educational Resources Information Center

    Eisenson, Jon; Ogilvie, Mardel

    An introduction to the problems and therapeutic needs of school age children whose speech requires remedial attention, the text is intended for both the classroom teacher and the speech correctionist. General considerations include classification and incidence of speech defects, speech correction services, the teacher as a speaker, the mechanism…

  9. [Speech therapy intervention in phonological disorders from the psycholinguistic paradigm of speech processing].

    PubMed

    Cervera-Mérida, J F; Ygual-Fernández, A

    2003-02-01

    The aim of this study is to present a survey of speech therapy intervention in phonological disorders (PD). We will examine the concepts of normal phonological development and those involved in PD in order to understand how they have been dealt with, historically, in speech therapy intervention. Lastly, we will describe how evaluation and intervention are carried out from the speech processing paradigm. Phonetic phonological skills allow people to decode the phonic strings they hear so as to be able to gain access to their phonological form and meaning. These abilities also enable them to encode these strings from lexical representations to pronounce words. The greater part of their development takes place during approximately the first four years of life. Speech processing difficulties affect the phonetic phonological skills and occur throughout almost all language pathologies, although the effect they exert is not always the same. This can range from a lack of the capacity to speak to important problems of intelligibility or mild problems with certain phonemes. Their influence on learning to read and write has been shown in recent decades. Speech therapy intervention began from a model based on articulatory phonetics. In the 70s a linguistic model based on the process of speech simplification and phonological analysis was added and this gave rise to a marked improvement in the systems used for evaluation and intervention. At present we have assumed a psycholinguistic model that links the perceptive skills with productive ones and top-down or bottom-up processing (from lexical representations to perception or production of phonemes and vice-versa). PMID:12599102

  10. Speech Delay: Its Treatment by Speech Play.

    ERIC Educational Resources Information Center

    Craft, Michael

    Directed to parents, the text discusses normal and delayed speech development and considers the causes of delay. Suggestions are given for helping deaf, emotionally disturbed, brain damaged, and physically handicapped children. Additional suggestions are provided for parents of twins, of stutterers, and of mongoloid or multiply handicapped…

  11. Speech privacy and annoyance considerations in the acoustic environment of passenger cars of high-speed trains.

    PubMed

    Jeon, Jin Yong; Hong, Joo Young; Jang, Hyung Suk; Kim, Jae Hyeon

    2015-12-01

    It is necessary to consider not only annoyance of interior noises but also speech privacy to achieve acoustic comfort in a passenger car of a high-speed train because speech from other passengers can be annoying. This study aimed to explore an optimal acoustic environment to satisfy speech privacy and reduce annoyance in a passenger car. Two experiments were conducted using speech sources and compartment noise of a high-speed train with varying speech-to-noise ratios (SNRA) and background noise levels (BNL). Speech intelligibility was tested in experiment I, and in experiment II, perceived speech privacy, annoyance, and acoustic comfort of combined sounds with speech and background noise were assessed. The results show that speech privacy and annoyance were significantly influenced by the SNRA. In particular, the acoustic comfort was evaluated as acceptable when the SNRA was less than -6 dB for both speech privacy and noise annoyance. In addition, annoyance increased significantly as the BNL exceeded 63 dBA, whereas the effect of the background-noise level on the speech privacy was not significant. These findings suggest that an optimal level of interior noise in a passenger car might exist between 59 and 63 dBA, taking normal speech levels into account. PMID:26723351
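
    The reported thresholds are simple enough to state as a rule. A toy encoding of them (illustrative only; the function and variable names are placeholders, while the thresholds come straight from the findings above):

      def acoustic_comfort(snra_db: float, bnl_dba: float) -> bool:
          """Acceptable comfort per the reported thresholds: speech-to-noise
          ratio below -6 dB (speech privacy) and background noise at or
          below 63 dBA (annoyance)."""
          return snra_db < -6.0 and bnl_dba <= 63.0

      print(acoustic_comfort(-8.0, 61.0))  # True: inside the 59-63 dBA window
      print(acoustic_comfort(-3.0, 61.0))  # False: speech too intelligible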

  12. Intelligibility in English: Of What Relevance Today to Intercultural Communication?

    ERIC Educational Resources Information Center

    Nair-Venugopal, Shanta

    2003-01-01

    Reviews the construct of speech intelligibility as expounded by Catford half a century ago in a landmark treatise and in the collaborative efforts of Smith, Nelson, and Rafiqzad to examine its relevance to intercultural communication in a new millennium beset by the contradictory global tensions of homogeneity and fragmentation. (Author/VWL)

  13. Predicting the intelligibility of vocoded and wideband Mandarin Chinese

    PubMed Central

    Chen, Fei; Loizou, Philipos C.

    2011-01-01

    Due to the limited number of cochlear implantees speaking Mandarin Chinese, it is extremely difficult to evaluate new speech coding algorithms designed for tonal languages. Access to an intelligibility index that could reliably predict the intelligibility of vocoded (and non-vocoded) Mandarin Chinese is a viable solution to address this challenge. The speech-transmission index (STI) and coherence-based intelligibility measures, among others, have been examined extensively for predicting the intelligibility of English speech but have not been evaluated for vocoded or wideband (non-vocoded) Mandarin speech despite the perceptual differences between the two languages. The results indicated that the coherence-based measures seem to be influenced by the characteristics of the spoken language. The highest correlation (r = 0.91–0.97) was obtained in Mandarin Chinese with a weighted coherence measure that included primarily information from high-intensity voiced segments (e.g., vowels) containing F0 information, known to be important for lexical tone recognition. In contrast, in English, highest correlation was obtained with a coherence measure that included information from weak consonants and vowel/consonant transitions. A band-importance function was proposed that captured information about the amplitude envelope contour. A higher modulation rate (100 Hz) was found necessary for the STI-based measures for maximum correlation (r = 0.94–0.96) with vocoded Mandarin and English recognition. PMID:21568429
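
    A crude version of the coherence computation these measures build on can be sketched with standard signal-processing tools (illustrative only: the published measures work segment by segment, weight bands by importance, and, for Mandarin, emphasize high-intensity voiced frames):

      import numpy as np
      from scipy.signal import coherence

      def coherence_proxy(clean, processed, fs, nperseg=512):
          """Mean magnitude-squared coherence between clean and processed
          speech over the speech band, as a rough intelligibility proxy."""
          f, cxy = coherence(clean, processed, fs=fs, nperseg=nperseg)
          band = (f >= 100) & (f <= 8000)   # restrict to the speech band
          return float(np.mean(cxy[band]))

      # e.g. coherence_proxy(clean_signal, vocoded_signal, fs=16000)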

  14. Intelligibility of an ASR-controlled synthetic talking face

    NASA Astrophysics Data System (ADS)

    Siciliano, Catherine; Williams, Geoff; Faulkner, Andrew; Salvi, Giampiero

    2001-05-01

    The goal of the SYNFACE project is to develop a multilingual synthetic talking face, driven by an automatic speech recognizer (ASR), to assist hearing-impaired people with telephone communication. Previous multilingual experiments with the synthetic face have shown that time-aligned synthesized visual face movements can enhance speech intelligibility in normal-hearing and hearing-impaired users [C. Siciliano et al., Proc. Int. Cong. Phon. Sci. (2003)]. Similar experiments are in progress to examine whether the synthetic face remains intelligible when driven by ASR output. The recognizer produces phonetic output in real time, in order to drive the synthetic face while maintaining normal dialogue turn-taking. Acoustic modeling was performed with a neural network, while an HMM was used for decoding. The recognizer was trained on the SpeechDAT telephone speech corpus. Preliminary results suggest that the currently achieved recognition performance of around 60% frames correct limits the usefulness of the synthetic face movements. This is particularly true for consonants, where correct place of articulation is especially important for visual intelligibility. Errors in the alignment of phone boundaries representative of those arising in the ASR output were also shown to decrease audio-visual intelligibility. [Work supported by the EU IST Project 2001-33327.]

  15. Effects of Lexical Tone Contour on Mandarin Sentence Intelligibility

    ERIC Educational Resources Information Center

    Chen, Fei; Wong, Lena L. N.; Hu, Yi

    2014-01-01

    Purpose: This study examined the effects of lexical tone contour on the intelligibility of Mandarin sentences in quiet and in noise. Method: A text-to-speech synthesis engine was used to synthesize Mandarin sentences with each word carrying the original lexical tone, flat tone, or a tone randomly selected from the 4 Mandarin lexical tones. The…

  16. Predicting the intelligibility of vocoded and wideband Mandarin Chinese.

    PubMed

    Chen, Fei; Loizou, Philipos C

    2011-05-01

    Due to the limited number of cochlear implantees speaking Mandarin Chinese, it is extremely difficult to evaluate new speech coding algorithms designed for tonal languages. Access to an intelligibility index that could reliably predict the intelligibility of vocoded (and non-vocoded) Mandarin Chinese is a viable solution to address this challenge. The speech-transmission index (STI) and coherence-based intelligibility measures, among others, have been examined extensively for predicting the intelligibility of English speech but have not been evaluated for vocoded or wideband (non-vocoded) Mandarin speech despite the perceptual differences between the two languages. The results indicated that the coherence-based measures seem to be influenced by the characteristics of the spoken language. The highest correlation (r = 0.91-0.97) was obtained in Mandarin Chinese with a weighted coherence measure that included primarily information from high-intensity voiced segments (e.g., vowels) containing F0 information, known to be important for lexical tone recognition. In contrast, in English, highest correlation was obtained with a coherence measure that included information from weak consonants and vowel/consonant transitions. A band-importance function was proposed that captured information about the amplitude envelope contour. A higher modulation rate (100 Hz) was found necessary for the STI-based measures for maximum correlation (r = 0.94-0.96) with vocoded Mandarin and English recognition. PMID:21568429

  17. Portable Speech Synthesizer

    NASA Technical Reports Server (NTRS)

    Leibfritz, Gilbert H.; Larson, Howard K.

    1987-01-01

    Compact speech synthesizer useful traveling companion to speech-handicapped. User simply enters statement on keyboard, and synthesizer converts statement into spoken words. Battery-powered and housed in briefcase, easily carried on trips. Unit used on telephones and in face-to-face communication. Synthesizer consists of micro-computer with memory-expansion module, speech-synthesizer circuit, batteries, recharger, dc-to-dc converter, and telephone amplifier. Components, commercially available, fit neatly in 17- by 13- by 5-in. briefcase. Weighs about 20 lb (9 kg) and operates and recharges from ac receptacle.

  18. Artificial Intelligence.

    PubMed

    Lawrence, David R; Palacios-González, César; Harris, John

    2016-04-01

    It seems natural to think that the same prudential and ethical reasons for mutual respect and tolerance that one has vis-à-vis other human persons would hold toward newly encountered paradigmatic but nonhuman biological persons. One also tends to think that they would have similar reasons for treating us humans as creatures that count morally in our own right. This line of thought transcends biological boundaries, namely with regard to artificially (super)intelligent persons, but is this a safe assumption? The issue concerns ultimate moral significance: the significance possessed by human persons, persons from other planets, and hypothetical nonorganic persons in the form of artificial intelligence (AI). This article investigates why our possible relations to AI persons could be more complicated than they first might appear, given that they might possess a radically different nature to us, to the point that civilized or peaceful coexistence in a determinate geographical space could be impossible to achieve. PMID:26957450

  19. The Effect of SpeechEasy on Stuttering Frequency, Speech Rate, and Speech Naturalness

    ERIC Educational Resources Information Center

    Armson, Joy; Kiefte, Michael

    2008-01-01

    The effects of SpeechEasy on stuttering frequency, stuttering severity self-ratings, speech rate, and speech naturalness for 31 adults who stutter were examined. Speech measures were compared for samples obtained with and without the device in place in a dispensing setting. Mean stuttering frequencies were reduced by 79% and 61% for the device…

  20. Multi-channel spatial auditory display for speech communications

    NASA Technical Reports Server (NTRS)

    Begault, Durand; Erbe, Tom

    1993-01-01

    A spatial auditory display for multiple speech communications was developed at NASA-Ames Research Center. Input is spatialized by use of simplified head-related transfer functions, adapted for FIR filtering on Motorola 56001 digital signal processors. Hardware and firmware design implementations are overviewed for the initial prototype developed for NASA-Kennedy Space Center. An adaptive staircase method was used to determine intelligibility levels of four letter call signs used by launch personnel at NASA, against diotic speech babble. Spatial positions at 30 deg azimuth increments were evaluated. The results from eight subjects showed a maximal intelligibility improvement of about 6 to 7 dB when the signal was spatialized to 60 deg or 90 deg azimuth positions.
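
    The FIR spatialization described above reduces to convolving one mono channel with a left/right pair of head-related impulse responses. A minimal sketch (the toy HRIRs below merely encode an interaural delay and level difference; the actual display used simplified measured HRTFs on DSP hardware):

      import numpy as np

      def spatialize(mono, hrir_left, hrir_right):
          """Binaural output by FIR filtering a mono input with an HRIR pair."""
          return np.stack([np.convolve(mono, hrir_left),
                           np.convolve(mono, hrir_right)], axis=0)

      hl = np.zeros(32); hl[0] = 1.0    # near ear: no delay, full level
      hr = np.zeros(32); hr[20] = 0.5   # far ear: delayed and attenuated
      stereo = spatialize(np.random.randn(8000), hl, hr)  # toy mono input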